Percona Toolkit (It's Basically Magic) SDPHP | Business.com | 05-28-14

SDPHP - Percona Toolkit (It's Basically Magic)


DESCRIPTION

Intro talk on the Percona Toolkit, a set of tools for the things DBAs and developers routinely need to do with MySQL.


Page 1: SDPHP - Percona Toolkit (It's Basically Magic)

Percona Toolkit (It's Basically Magic)

SDPHP | Business.com | 05-28-14


Who Am I?

https://twitter.com/robertswisher

https://plus.google.com/+RobertSwisher

https://www.linkedin.com/in/robertswisher

[email protected]

Page 3: SDPHP - Percona Toolkit (It's Basically Magic)

Percona? Who the hell are they?!


Formerly known as Maatkit & Aspersa

Baron Schwartz literally wrote the book on MySQL

Open-source collection of scripts that help with common tasks every DBA and developer has to do:

- Development

- Profiling

- Configuration

- Monitoring

- Replication

- Same code, same developers, new branding

- Source now on LaunchPad (like Percona Server)

(https://launchpad.net/percona-toolkit)

What is Percona Toolkit?

(You should use Percona Server too!)

Page 5: SDPHP - Percona Toolkit (It's Basically Magic)

Basically anyone running MySQL who has lots of data

Who Uses It?

(Or anyone smart and lazy like all of us)


Installation

As of this writing, the current version is 2.2.7

Yum

Apt

or source
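A minimal sketch of each route (package name as of the 2.2 series; Percona repository setup not shown, and distro packages may lag behind):

```shell
# RHEL/CentOS, assuming Percona's yum repository is already configured
sudo yum install percona-toolkit

# Debian/Ubuntu; percona-toolkit also exists in the stock repos, often older
sudo apt-get install percona-toolkit

# From source: grab the tarball from percona.com, then the usual Perl dance
perl Makefile.PL
make
sudo make install
```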

Page 7: SDPHP - Percona Toolkit (It's Basically Magic)

Tools


What Do You Use It For?

- Schema changes

- Data archival

- Query optimization

- Data consistency

- Performance debugging

- General maintenance

Page 9: SDPHP - Percona Toolkit (It's Basically Magic)

Schema Changes

- Always creates a copy of the table before 5.6

(except fast index creation in 5.5 or 5.1 with innodb plugin)

- Table is locked during the change

- BIG tables = BIG TROUBLE (millions of rows take hours or more)

- Used to require trickery like ALTER on slave, promote to master,

ALTER on old master, promote to master again

(Gets really ugly with master-master or tiered replication)


pt-online-schema-change

Triggers are trouble, but can be handled (dropped by default)

Foreign keys are trouble, but can be handled (dropped and rebuilt)

Takes longer than ALTER TABLE (up to 4x)

ALWAYS backup first

Page 11: SDPHP - Percona Toolkit (It's Basically Magic)

pt-online-schema-change

--dry-run and --execute are mutually exclusive

Use nohup with --password `cat /tmp/pass` so the job survives a dropped session

Tune --max-lag and --max-load for busy systems

Example:

nohup pt-online-schema-change --dry-run \
  --alter 'CHANGE `foo` `foo` varchar(24) COLLATE latin1_bin NULL AFTER `bar`' \
  --password `cat /tmp/pass` --print --nocheck-replication-filters \
  --max-load "Threads_connected:60,Threads_running:20" \
  D=your_db,t=really_big_table &

(Re-run with --execute in place of --dry-run to apply the change. Note the collation is a bare identifier; quoting it would end the single-quoted --alter string early.)

Page 13: SDPHP - Percona Toolkit (It's Basically Magic)


Data Archival

- LOTS of writing to BIG tables = BAD

- Pruning BIG tables to only frequently accessed data = GOOD

- BIG tables more prone to corruption

- Deleting from BIG tables = SLOOOOOOW

- Long-running transactions = REALLY SLOOOOOOOOOW

- DELETE locks MyISAM

Page 15: SDPHP - Percona Toolkit (It's Basically Magic)

pt-archiver

Create destination table first

--dry-run exists, but there is no --execute (it runs for real by default)

If you use an auto-increment column, edit the schema

--limit is good for sequential data, but be careful if bouncing around

Use --progress to track

May want to archive from slave, then purge from master

ALWAYS backup first

Page 17: SDPHP - Percona Toolkit (It's Basically Magic)

pt-archiver

Slave:

pt-archiver --dry-run --ask-pass --progress 5000 \

--statistics --bulk-insert --no-delete \

--limit 5000 --source D=your_db,t=big_table \

--dest D=your_db,t=big_table_archive \

--where "timestamp < '2013-01-01'"

Master:

pt-archiver --dry-run --ask-pass --progress 5000 \

--statistics --bulk-delete --purge \

--limit 5000 --source D=your_db,t=big_table \

--where "timestamp < '2013-01-01'"

Page 19: SDPHP - Percona Toolkit (It's Basically Magic)

Query Optimization

pt-query-digest

--filter  Perl code that must return true for a query to appear in the report

--limit  show only the top worst queries (a count or a percentage)
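The report on the next slide is pt-query-digest output. A typical run against the slow query log looks something like this (log path and filter are examples):

```shell
# Summarize the worst queries from the slow query log
pt-query-digest /var/log/mysql/mysql-slow.log > digest.txt

# --filter is raw Perl: for example, keep only SELECT statements
pt-query-digest --filter '$event->{arg} =~ m/^select/i' \
    /var/log/mysql/mysql-slow.log
```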

Page 21: SDPHP - Percona Toolkit (It's Basically Magic)

# Query 1: 0.00 QPS, 0.01x concurrency, ID 0x76F9EC92751F314A at byte 80096643

# This item is included in the report because it matches --limit.

# Scores: V/M = 188.68

# Time range: 2012-02-01 09:20:24 to 2013-10-04 10:47:56

# Attribute pct total min max avg 95% stddev median

# ============ === ======= ======= ======= ======= ======= ======= =======

# Count 1 490

# Exec time 14 384617s 9s 4869s 785s 1292s 385s 833s

# Lock time 2 11s 169us 6s 22ms 6ms 290ms 316us

# Rows sent 0 711.60k 0 4.82k 1.45k 4.27k 1.44k 685.39

# Rows examine 10 30.01G 0 123.80M 62.71M 117.57M 44.73M 75.78M

# Rows affecte 0 0 0 0 0 0 0 0

# Rows read 0 711.60k 0 4.82k 1.45k 4.27k 1.44k 685.39

# Bytes sent 0 21.90M 0 167.52k 45.77k 143.37k 52.07k 8.46k

# Tmp tables 8 1.91k 2 4 3.99 3.89 0.16 3.89

# Tmp disk tbl 0 0 0 0 0 0 0 0

# Tmp tbl size 3 3.36G 0 7.98M 7.02M 7.65M 1.26M 7.65M

# Query size 0 471.35k 982 986 985.03 964.41 0 964.41

# String:

# Databases bdc_ccm

# Hosts

# InnoDB trxID 13E9F1B2 (1/0%), 1402493D (1/0%)... 488 more

# Last errno 0

# Users semuser (488/99%), jackie.lam (1/0%)... 1 more

# Query_time distribution

# 1us

# 10us

# 100us

# 1ms

# 10ms

# 100ms

# 1s #

# 10s+ ################################################################

# Tables

# SHOW TABLE STATUS FROM `bdc_ccm` LIKE 'click_log_inbound'\G

# SHOW CREATE TABLE `bdc_ccm`.`click_log_inbound`\G

# SHOW TABLE STATUS FROM `bdc_ccm` LIKE 'click_log_outbound_tp'\G

# SHOW CREATE TABLE `bdc_ccm`.`click_log_outbound_tp`\G

# SHOW TABLE STATUS FROM `bdc_ccm` LIKE 'click_log_outbound'\G

# SHOW CREATE TABLE `bdc_ccm`.`click_log_outbound`\G

# EXPLAIN /*!50100 PARTITIONS*/

select date(a.timestamp), a.referrer, count(b.inbound_id), sum(b.cpc), 'AS' as tag

from click_log_inbound a, click_log_outbound_tp b

where a.id = b.inbound_id

and a.timestamp between '2012-02-08 00:00:00' and '2012-02-09 23:59:59'

and b.partner like 'adsense'

and b.flag = 0

group by date(a.timestamp), a.referrer

union all

select date(a.timestamp), a.referrer, count(b.inbound_id), sum(b.cpc) , 'FL' as tag

from click_log_inbound a, click_log_outbound b

where a.id = b.inbound_id

and a.timestamp between '2012-02-08 00:00:00' and '2012-02-09 23:59:59'

and flag = 0

group by date(a.timestamp), a.referrer

union all

select date(a.timestamp), a.referrer, count(b.inbound_id), sum(b.cpc), 'TP' as tag

from click_log_inbound a, click_log_outbound_tp b

where a.id = b.inbound_id

and a.timestamp between '2012-02-08 00:00:00' and '2012-02-09 23:59:59'

and b.partner in ('capterra', 'bdc_network')

and b.flag = 0

group by date(a.timestamp), a.referrer\G


Data Consistency

- Replication isn't perfect

- Replication filters

- master-master replication

- 1062 “DUPLICATE KEY ERROR”

- Server crashes

- Non-deterministic writes (e.g. UPDATE ... LIMIT with statement-based replication)

Page 23: SDPHP - Percona Toolkit (It's Basically Magic)

pt-table-checksum

Requires STATEMENT-based replication for tiered replication

Replication filters are dangerous because a failed query can

break replication

May want to use nohup since it can be slow
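A sketch of a typical run (host name and checksum table are placeholders; run on the master so the checksum queries replicate):

```shell
# Checksum every table on the master; the checksums replicate to the slaves
nohup pt-table-checksum --ask-pass --replicate test.checksum h=master1 &

# Afterwards, report which tables differ on the slaves without re-checksumming
pt-table-checksum --replicate test.checksum --replicate-check-only h=master1
```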

Page 25: SDPHP - Percona Toolkit (It's Basically Magic)

pt-table-sync

--dry-run and --execute are mutually exclusive

ALWAYS backup first

In a tiered or master-master replication setup, take extra care to think through what will be done

Run on master to sync all slaves

pt-table-sync --execute --replicate test.checksum master1

Run on master for slaves individually to sync to master

pt-table-sync --execute --sync-to-master slave1

Page 27: SDPHP - Percona Toolkit (It's Basically Magic)

pt-table-sync

Page 29: SDPHP - Percona Toolkit (It's Basically Magic)

Performance Debugging

- Problems can be random

- Problems only last for a few seconds,

you can't connect and observe fast enough

- Problems like to happen at odd hours;

ETL, rollups, reporting, etc

- You can't ALWAYS log on


pt-stalk

- Creates a lot of files

- Output inspected with pt-sift

Page 31: SDPHP - Percona Toolkit (It's Basically Magic)

pt-stalk

Run as root

--daemonize  fork and run in the background

--sleep  how long to sleep between collects

--cycles  how many checks in a row the variable must exceed the threshold before collecting

--variable  Threads_running and Execution_time are good ones

--disk-bytes-free  don't collect if free space falls below this threshold

(best practice: point --log and --dest at a different disk than the one your data lives on, the same as other MySQL logs)
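Putting those options together, a sketch (the threshold and paths are example values):

```shell
# Collect diagnostics whenever Threads_running stays at 25+ for 5 checks in a row
sudo pt-stalk --daemonize \
    --variable Threads_running --threshold 25 --cycles 5 \
    --sleep 300 --dest /var/log/pt-stalk --disk-bytes-free 100M
```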

Page 33: SDPHP - Percona Toolkit (It's Basically Magic)

pt-sift

Pass it the path to the dir used with --dest

(default /var/lib/pt-stalk)

Interactive program

Lots of data points collected from the time of the incident
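Invocation is simple: pt-sift takes the collection directory, or a single sample's timestamp prefix (the timestamp below is an example):

```shell
# Browse all samples in the default pt-stalk directory
pt-sift /var/lib/pt-stalk

# Or jump straight to one sample by its timestamp prefix
pt-sift /var/lib/pt-stalk/2014_05_28_09_00_01
```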

Page 35: SDPHP - Percona Toolkit (It's Basically Magic)

General Admin / Maintenance

pt-slave-restart - try to restart a slave skipping errors

if replication fails

pt-summary - gives a general summary of the host (pt-mysql-summary covers the MySQL instance itself)

pt-upgrade - tests logged queries against a new MySQL version

pt-config-diff - show formatted diff of my.cnf files

pt-heartbeat - update table on master with heartbeat data from slaves

pt-kill - kill MySQL threads according to filters

pt-index-usage - report on index structure and usage

pt-variable-advisor - looks at runtime vars and makes suggestions
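For flavor, a couple of these in use (host, threshold, and config path are example values):

```shell
# pt-kill: print and kill any query that has run longer than 60s, checking every 10s
pt-kill --busy-time 60 --interval 10 --print --kill h=localhost

# pt-config-diff: compare a server's runtime config against the file on disk
pt-config-diff h=localhost /etc/my.cnf

# pt-summary: no arguments needed; summarizes the host it runs on
pt-summary
```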


Percona Cloud Tools

Sign up for the beta at https://cloud.percona.com

Page 37: SDPHP - Percona Toolkit (It's Basically Magic)


QUESTIONS?
