© The Pythian Group Inc., 2018 1
Mon Apr 23rd 2018 - Percona Live, Santa Clara, CA, USA
Matthias Crauwels / Pep Pla
MySQL break/fix lab
Who are we?

Matthias Crauwels
● Living in Ghent, Belgium
● Bachelor Computer Science
● ~20 years Linux user / admin
● 10+ years PHP developer
● 5+ years MySQL DBA
● ~1 year at Pythian as MySQL Database Consultant
● Father of Leander

Pep Pla
Born in Vinaròs, a small village near the Mediterranean and currently living in Barcelona.
Most of the time I’m busy with my three kids, my partner and our two cats.
And in my spare time I’m a DBC at Pythian, surrounded by some of the most brilliant DBAs in the world.
ABOUT PYTHIAN
Pythian’s 400+ IT professionals help companies adopt and manage disruptive technologies to better compete
Years in Business: 20
Pythian Experts in 35 Countries: 400+
Current Clients Globally: 350+

AI / ML / BLOCKCHAIN
Intelligent analytics and decision making
Software autonomy
Disruptive data technologies
CLOUD MIGRATION & OPERATIONS
Plan, Migrate, Manage, Optimize, Innovate
Multi-cloud, Hybrid-Cloud, Cloud Native
ANALYTIC DATA SYSTEMS
Kick AaaS cloud-native, pre-packaged analytics platform
Custom analytics platform design, implementation and support services–for on-premises and cloud
Data science consulting and implementation services
OPERATIONAL DATA SYSTEMS
Database services–architecture to ongoing management
On prem and in the cloud
Oracle, MS SQL, MySQL, Cassandra, MongoDB, Hadoop, AWS/Azure/Google DBaaS

Other Pythian talks at this conference

Running Database Services on DC/OS
Gabriel Ciciliani - Pythian
Room M1 - Tuesday 11:30AM - 12:20PM

Securing Your Data: All Steps for Encrypting Your MongoDB Database
Igor Donchovski - Pythian
Room F - Wednesday 11:00AM - 11:50AM

Hands on ProxySQL
René Cannaò - ProxySQL, Derek Downey - Pythian
Room 4 - Monday 09:30AM - 12:30PM

How to Scale MongoDB
Igor Donchovski - Pythian
Room F - Tuesday 11:30AM - 12:20PM

AGENDA
● Introductions
● Getting connected
● Basic MySQL troubleshooting
● Replication troubleshooting
● Advanced topics
We break it (already done)
You fix it… (but we’ll help you!)

Getting connected

We provide an EC2 instance for each of you
● Use an ssh client to connect to the instance
● Username: demo-user
● Password: plscdemo
● Follow the sequence of the slides and don’t fix anything else ahead of time
● One standalone MySQL instance
● Several MySQL instances using dbdeployer (https://github.com/datacharmer/dbdeployer)

IP address list on
http://bit.ly/2HKJcsq
Command reference (for copy/paste)
http://bit.ly/2HKNuQt
Basic MySQL troubleshooting

AGENDA
● MySQL instance not starting
  ○ misconfiguration
  ○ file permissions
  ○ corrupted files
● Connectivity issues
  ○ misconfiguration
  ○ recover lost password
  ○ server gone away

Starting mysqld

[root@mysql ~]# service mysqld start
Initializing MySQL database
2018-04-07T10:45:53.364642Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
mysqld: Can't create/write to file '/var/tmp/ibYTHEZv' (Errcode: 13 - Permission denied)
2018-04-07T10:45:53.367682Z 0 [ERROR] InnoDB: Unable to create temporary file; errno: 13
2018-04-07T10:45:53.367700Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2018-04-07T10:45:53.367704Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2018-04-07T10:45:53.367707Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2018-04-07T10:45:53.367710Z 0 [ERROR] Failed to initialize builtin plugins.
2018-04-07T10:45:53.367712Z 0 [ERROR] Aborting
Initialization of MySQL database failed.
Perhaps /etc/my.cnf is misconfigured.

[root@mysql ~]# ps -ef | grep mysqld
root 2711 2599 0 17:13 pts/0 00:00:00 grep --color=auto mysqld

Fixing tmp dir permissions

[root@mysql ~]# ls -ld /var/tmp/
drwxrwx--- 2 root root 4096 Apr 7 10:49 /var/tmp/

[root@mysql ~]# chmod a+rwx /var/tmp/

[root@mysql ~]# ls -ld /var/tmp/
drwxrwxrwx 2 root root 4096 Apr 7 10:49 /var/tmp/
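
The fix above leaves /var/tmp at mode 0777. World-writable temp directories on Linux conventionally also carry the sticky bit (mode 1777), so users cannot remove each other's files. A minimal sketch of that variant, run against a scratch directory instead of the real /var/tmp; the variable name demo_tmp is made up for illustration and stat -c assumes GNU coreutils:

```shell
#!/usr/bin/env bash
# Sketch only: works on a scratch directory, not on the real /var/tmp.
demo_tmp="$(mktemp -d)"

# Reproduce the broken state from the slide: no access for "other".
chmod 0770 "$demo_tmp"

# 1777 = rwx for everyone plus the sticky bit (ls shows a trailing "t").
chmod 1777 "$demo_tmp"

# stat -c is GNU coreutils syntax (assumed, as on the lab's Linux hosts).
mode="$(stat -c '%a' "$demo_tmp")"
echo "mode of $demo_tmp: $mode"
```

On the lab instance the equivalent would be chmod 1777 /var/tmp/, after which ls -ld shows drwxrwxrwt.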

Starting mysqld

[root@mysql ~]# service mysqld start
Initializing MySQL database
2018-04-07T10:51:50.441437Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2018-04-07T10:51:51.154035Z 0 [ERROR] InnoDB: mmap(137428992 bytes) failed; errno 12
2018-04-07T10:51:51.353940Z 0 [ERROR] InnoDB: Cannot allocate memory for the buffer pool
2018-04-07T10:51:51.354044Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
2018-04-07T10:51:51.354105Z 0 [ERROR] Plugin 'InnoDB' init function returned error.
2018-04-07T10:51:51.354138Z 0 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed.
2018-04-07T10:51:51.354168Z 0 [ERROR] Failed to initialize builtin plugins.
2018-04-07T10:51:51.354189Z 0 [ERROR] Aborting
Initialization of MySQL database failed.
Perhaps /etc/my.cnf is misconfigured.

[root@mysql ~]# ps -ef | grep mysqld
root 20954 19560 0 10:55 pts/0 00:00:00 grep --color=auto mysqld
[root@mysql ~]#

Fixing innodb_buffer_pool_size

[root@mysql ~]# perror 12
OS error code 12: Cannot allocate memory

[root@mysql ~]# free -m
             total       used       free     shared    buffers     cached
Mem:           993        188        804          0         24        102
-/+ buffers/cache:         61        931
Swap:            0          0          0

[root@mysql ~]# grep innodb_buffer_pool_size /etc/my.cnf
innodb_buffer_pool_size = 100G

[root@mysql ~]# sed -i -e 's/100G/128M/' /etc/my.cnf
[root@mysql ~]# grep innodb_buffer_pool_size /etc/my.cnf
innodb_buffer_pool_size = 128M
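
The diagnosis above boils down to comparing two numbers: the configured buffer pool and the RAM reported by free -m. A minimal sketch of that comparison, using the values from the slide; the helper name to_bytes is hypothetical and only the K/M/G suffixes are handled:

```shell
#!/usr/bin/env bash
# Convert a my.cnf-style size (e.g. 128M, 100G) to bytes.
# Hypothetical helper; handles only K, M and G suffixes.
to_bytes() {
  local value="$1" number unit
  number="${value%[KkMmGg]}"   # strip a trailing unit letter, if any
  unit="${value#"$number"}"    # the letter that was stripped (may be empty)
  case "$unit" in
    K|k) echo $(( number * 1024 )) ;;
    M|m) echo $(( number * 1024 * 1024 )) ;;
    G|g) echo $(( number * 1024 * 1024 * 1024 )) ;;
    *)   echo "$number" ;;
  esac
}

# Values taken from the slide: 100G configured, 993 MB of RAM per free -m.
configured="$(to_bytes 100G)"
physical="$(to_bytes 993M)"

if (( configured > physical )); then
  verdict="too large"
else
  verdict="fits"
fi
echo "buffer pool $configured bytes vs RAM $physical bytes: $verdict"
```

The 128M chosen on the slide is 134217728 bytes, which fits comfortably in the instance's memory.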

Starting mysqld

[root@mysql ~]# service mysqld start
Initializing MySQL database
2018-04-07T11:01:34.590140Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
 100 100
2018-04-07T11:01:38.031833Z 0 [Warning] InnoDB: New log files created, LSN=45790
2018-04-07T11:01:38.274845Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
2018-04-07T11:01:38.338383Z 0 [ERROR] unknown variable 'tmpd1r=/var/tmp'
2018-04-07T11:01:38.338402Z 0 [ERROR] Aborting
Initialization of MySQL database failed.
Perhaps /etc/my.cnf is misconfigured.

[root@example ~]# ps -ef | grep mysqld
root 2711 2599 0 17:13 pts/0 00:00:00 grep --color=auto mysqld

Where is the config file?

[root@mysql ~]# grep tmpd /etc/my.cnf
[root@mysql ~]# grep tmpd /etc/mysql/my.cnf
grep: /etc/mysql/my.cnf: No such file or directory

Check the reference manual...
https://dev.mysql.com/doc/refman/5.7/en/option-files.html

… or find it yourself
strace

Strace

● The option “-e trace=open,stat” limits the long strace output to the open and stat calls.

[root@mysql ~]# strace -e trace=open,stat /usr/sbin/mysqld
...
stat("/etc/my.cnf", {st_mode=S_IFREG|0644, st_size=569, ...}) = 0
open("/etc/my.cnf", O_RDONLY) = 3
stat("/etc/mysql/.my.cnf", {st_mode=S_IFREG|0644, st_size=25, ...}) = 0
open("/etc/mysql/.my.cnf", O_RDONLY) = 4
stat("/etc/mysql/my.cnf", 0x7ffc96d78020) = -1 ENOENT (No such file or directory)
stat("/root/.my.cnf", 0x7ffc96d78020) = -1 ENOENT (No such file or directory)
...

Strace: mysqld --print-defaults

[root@mysql ~]# strace -e stat /usr/sbin/mysqld --print-defaults
/usr/sbin/mysqld would have been started with the following arguments:
--datadir=/var/lib/msql --socket=/var/lib/mysql/mysql.sock --query_cache_type=0 --query_cache_size=0 --innodb_file_per_table=1 --innodb_flush_log_at_trx_commit=1 --innodb_flush_method=O_DIRECT --innodb_open_files=4096 --innodb_purge_threads=1 --innodb_log_file_size=128M --innodb_log_files_in_group=2 --innodb_buffer_pool_size=256M --symbolic-links=0 --tmpd1r=/var/tmp
+++ exited with 0 +++

stat("/etc/my.cnf", {st_mode=S_IFREG|0644, st_size=569, ...}) = 0
stat("/etc/mysql/.my.cnf", {st_mode=S_IFREG|0644, st_size=25, ...}) = 0
stat("/etc/mysql/my.cnf", 0x7ffecb574ca0) = -1 ENOENT (No such file or directory)
stat("/root/.my.cnf", 0x7ffecb574ca0) = -1 ENOENT (No such file or directory)

Fixing tmpdir variable

[root@mysql ~]# cat /etc/mysql/.my.cnf
[mysqld]
tmpd1r=/var/tmp

[root@mysql ~]# sed -i -e 's/tmpd1r/tmpdir/' /etc/mysql/.my.cnf

[root@mysql ~]# cat /etc/mysql/.my.cnf
[mysqld]
tmpdir=/var/tmp

Strace: mysqld --print-defaults, again

[root@mysql ~]# strace -e stat /usr/sbin/mysqld --print-defaults
/usr/sbin/mysqld would have been started with the following arguments:
--datadir=/var/lib/msql --socket=/var/lib/mysql/mysql.sock --query_cache_type=0 --query_cache_size=0 --innodb_file_per_table=1 --innodb_flush_log_at_trx_commit=1 --innodb_flush_method=O_DIRECT --innodb_open_files=4096 --innodb_purge_threads=1 --innodb_log_file_size=128M --innodb_log_files_in_group=2 --innodb_buffer_pool_size=128M --symbolic-links=0 --tmpd1r=/var/tmp
+++ exited with 0 +++

Fix DATADIR path

[root@mysql ~]# grep datadir /etc/my.cnf
datadir=/var/lib/msql

[root@mysql ~]# sed -i -e 's/datadir=\/var\/lib\/msql/datadir=\/var\/lib\/mysql/' /etc/my.cnf

[root@mysql ~]# grep datadir /etc/my.cnf
datadir=/var/lib/mysql
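
The escaped slashes in that sed command are easy to mistype. sed accepts other delimiters for the s command, so the same substitution can be written with | and no backslashes. A small sketch against a scratch copy of the option file (the file contents are a cut-down assumption, and sed -i without a suffix assumes GNU sed):

```shell
#!/usr/bin/env bash
# Scratch option file standing in for /etc/my.cnf.
cnf="$(mktemp)"
printf '[mysqld]\ndatadir=/var/lib/msql\nsocket=/var/lib/mysql/mysql.sock\n' > "$cnf"

# Same substitution as on the slide, using | as the delimiter so the
# path separators need no escaping. The ^ and $ anchors keep the edit
# limited to the exact datadir line.
sed -i -e 's|^datadir=/var/lib/msql$|datadir=/var/lib/mysql|' "$cnf"

fixed="$(grep '^datadir=' "$cnf")"
echo "$fixed"
```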

Starting mysqld

[root@mysql ~]# service mysqld start
Starting mysqld: [FAILED]

Wait, where is my error output?

[root@mysql ~]# grep log-error /etc/my.cnf
log-error=/var/log/mysqld.log

Examining the error log

[root@mysql ~]# tail -25 /var/log/mysqld.log
2018-04-07T11:18:21.283999Z 0 [Note] /usr/libexec/mysql57/mysqld (mysqld 5.7.21) starting as process 22111 ...
...
2018-04-07T11:18:21.289409Z 0 [Note] InnoDB: Initializing buffer pool, total size = 128M, instances = 1, chunk size = 128M
2018-04-07T11:18:21.304322Z 0 [Note] InnoDB: Completed initialization of buffer pool
...
2018-04-07T11:18:21.317576Z 0 [ERROR] InnoDB: The innodb_system data file 'ibdata1' must be writable
2018-04-07T11:18:21.317590Z 0 [ERROR] InnoDB: The innodb_system data file 'ibdata1' must be writable
2018-04-07T11:18:21.317596Z 0 [ERROR] InnoDB: Plugin initialization aborted with error Generic error
...
2018-04-07T11:18:21.918587Z 0 [ERROR] Aborting
...
2018-04-07T11:18:21.918874Z 0 [Note] /usr/libexec/mysql57/mysqld: Shutdown complete

Checking permissions

[root@mysql ~]# ls -hal /var/lib/mysql/
total 109M
drwxr-xr-x  5 mysql mysql 4.0K Apr  7 11:20 .
drwxr-xr-x 19 root  root  4.0K Apr  7 11:16 ..
-rw-r-----  1 mysql mysql   56 Apr  7 10:49 auto.cnf
-rw-------  1 mysql mysql 1.7K Apr  7 10:49 ca-key.pem
-rw-r--r--  1 mysql mysql 1.1K Apr  7 10:49 ca.pem
-rw-r--r--  1 mysql mysql 1.1K Apr  7 10:49 client-cert.pem
-rw-------  1 mysql mysql 1.7K Apr  7 10:49 client-key.pem
-rw-r-----  1 mysql mysql  350 Apr  7 10:49 ib_buffer_pool
-rw-r-----  1 42    42     12M Apr  7 10:49 ibdata1
-rw-r-----  1 42    42     48M Apr  7 10:49 ib_logfile0
-rw-r-----  1 42    42     48M Apr  7 10:49 ib_logfile1
drwxr-x---  2 mysql mysql 4.0K Apr  7 10:49 mysql
-rw-r--r--  1 mysql mysql    7 Apr  7 10:49 mysql_upgrade_info
drwxr-x---  2 mysql mysql 4.0K Nov 22 13:49 performance_schema
-rw-------  1 mysql mysql 1.7K Apr  7 10:49 private_key.pem
-rw-r--r--  1 mysql mysql  452 Apr  7 10:49 public_key.pem
-rw-r--r--  1 mysql mysql 1.1K Apr  7 10:49 server-cert.pem
-rw-------  1 mysql mysql 1.7K Apr  7 10:49 server-key.pem
drwxr-x---  2 mysql mysql  12K Apr  7 10:49 sys

Fixing file permissions

[root@mysql ~]# chown mysql:mysql /var/lib/mysql/ibdata1
[root@mysql ~]# chown mysql:mysql /var/lib/mysql/ib_logfile*

[root@mysql ~]# ls -hal /var/lib/mysql/ib*
-rw-r----- 1 mysql mysql 350 Apr 7 10:49 /var/lib/mysql/ib_buffer_pool
-rw-r----- 1 mysql mysql 12M Apr 7 10:49 /var/lib/mysql/ibdata1
-rw-r----- 1 mysql mysql 48M Apr 7 10:49 /var/lib/mysql/ib_logfile0
-rw-r----- 1 mysql mysql 48M Apr 7 10:49 /var/lib/mysql/ib_logfile1
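
Spotting wrongly owned files by eye in a long ls listing is error-prone. find can list everything in the data directory that is not owned by the expected user in one pass. A minimal sketch, demonstrated on a scratch directory; the function name list_foreign_owned is made up for illustration:

```shell
#!/usr/bin/env bash
# List everything under a directory that is NOT owned by the given user.
# find -user accepts either a user name or a numeric uid.
list_foreign_owned() {
  local dir="$1" owner="$2"
  find "$dir" ! -user "$owner" -print
}

# Demo on a scratch directory: every file belongs to the current user,
# so the list of foreign-owned files is expected to be empty.
demo_dir="$(mktemp -d)"
touch "$demo_dir/ibdata1" "$demo_dir/ib_logfile0" "$demo_dir/ib_logfile1"

me="$(id -u)"
foreign="$(list_foreign_owned "$demo_dir" "$me")"
echo "not owned by uid $me: ${foreign:-<none>}"
```

On the lab instance the call would be list_foreign_owned /var/lib/mysql mysql, which should report the files still waiting for a chown.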

Starting mysqld

[root@mysql ~]# service mysqld start
Starting mysqld: [FAILED]

[root@mysql ~]# tail -100 /var/log/mysqld.log
...
2018-04-07T11:55:57.123099Z 0 [Note] /usr/libexec/mysql57/mysqld (mysqld 5.7.21) starting as process 23322 ...
2018-04-07T11:55:57.397874Z 0 [ERROR] /usr/libexec/mysql57/mysqld: Can't find file: './mysql/user.frm' (errno: 13 - Permission denied)
2018-04-07T11:55:57.397903Z 0 [ERROR] Fatal error: Can't open and lock privilege tables: Can't find file: './mysql/user.frm' (errno: 13 - Permission denied)
2018-04-07T11:55:57.398000Z 0 [ERROR] Aborting

ERROR 13

[root@mysql ~]# perror 13
OS error code 13: Permission denied

[root@mysql ~]# ls -hl /var/lib/mysql/mysql/user.*
-rw-r----- 1 root root 11K Apr 7 10:49 /var/lib/mysql/mysql/user.frm
-rw-r----- 1 root root 340 Apr 7 10:49 /var/lib/mysql/mysql/user.MYD
-rw-r----- 1 root root 4.0K Apr 7 10:49 /var/lib/mysql/mysql/user.MYI

[root@mysql ~]# chown mysql:mysql /var/lib/mysql/mysql/user.*

[root@mysql ~]# ls -hl /var/lib/mysql/mysql/user.*
-rw-r----- 1 mysql mysql 11K Apr 7 10:49 /var/lib/mysql/mysql/user.frm
-rw-r----- 1 mysql mysql 340 Apr 7 10:49 /var/lib/mysql/mysql/user.MYD
-rw-r----- 1 mysql mysql 4.0K Apr 7 10:49 /var/lib/mysql/mysql/user.MYI

Starting mysqld

[root@mysql ~]# service mysqld start
Starting mysqld: [ OK ]

[root@mysql ~]# ps -ef | grep mysqld
root 22297 1 0 11:32 pts/0 00:00:00 /bin/sh /usr/libexec/mysql57/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=mysql
mysql 22645 22297 0 11:32 pts/0 00:00:00 /usr/libexec/mysql57/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql57/plugin --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
root 22719 19560 0 11:36 pts/0 00:00:00 grep --color=auto mysqld

YAY!

Let’s try connecting

[root@mysql ~]# mysql
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/tmp/mysql.sock' (2)

[root@mysql ~]# perror 2
OS error code 2: No such file or directory

[root@mysql ~]# ls -hl /tmp/mysql.sock
ls: cannot access /tmp/mysql.sock: No such file or directory
[root@mysql ~]#

Let’s try connecting

[root@mysql ~]# grep socket /var/log/mysqld.log | tail -n 1
Version: '5.7.21' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)

[root@mysql ~]# lsof -n | grep mysql | grep unix
mysqld 22645 mysql 19u unix 0xffff88003cf82400 0t0 97542 /var/lib/mysql/mysql.sock

[root@mysql ~]# grep -B 1 socket /etc/my.cnf
[client]
socket=/tmp/mysql.sock

[root@mysql ~]# sed -i -e 's/\/tmp\/mysql.sock/\/var\/lib\/mysql\/mysql.sock/' /etc/my.cnf
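
The root cause here is that the [client] and [mysqld] sections disagree about the socket path. A small sketch that reads the socket value per section, so the mismatch shows up directly; the helper name socket_for_section is hypothetical, the sample file is a cut-down assumption, and the parser only handles key=value lines without spaces around the equals sign:

```shell
#!/usr/bin/env bash
# Print the "socket" value found in one [section] of an option file.
# Assumes plain key=value lines with no spaces around "=".
socket_for_section() {
  local section="$1" file="$2"
  awk -F= -v want="[$section]" '
    /^\[/ { current = $0; next }                 # remember the current [section]
    current == want && $1 == "socket" { print $2 }
  ' "$file"
}

# Scratch option file mirroring the mismatch seen on the slide.
cnf="$(mktemp)"
cat > "$cnf" <<'EOF'
[mysqld]
socket=/var/lib/mysql/mysql.sock
[client]
socket=/tmp/mysql.sock
EOF

server_socket="$(socket_for_section mysqld "$cnf")"
client_socket="$(socket_for_section client "$cnf")"

if [ "$server_socket" = "$client_socket" ]; then
  echo "sockets match: $server_socket"
else
  echo "mismatch: server=$server_socket client=$client_socket"
fi
```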

Let’s try connecting

[root@mysql ~]# mysql
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)

[root@mysql ~]# strace -e trace=open mysql
...
open("/etc/my.cnf", O_RDONLY) = 3
open("/etc/mysql/.my.cnf", O_RDONLY) = 4
open("/root/.my.cnf", O_RDONLY) = 3
...
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)
+++ exited with 1 +++

[root@mysql ~]# cat ~/.my.cnf
[client]
password=dummypass

Now what?

[root@mysql ~]# mysql --no-defaults
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

[root@mysql ~]# mysql -p
Enter password:
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)

Recovering root password

[root@mysql ~]# sed -i 's/\[mysqld\]/&\nskip-grant-tables/' /etc/my.cnf

[root@mysql ~]# cat /etc/my.cnf
[mysqld]
skip-grant-tables
...

[root@mysql ~]# service mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]
[root@mysql ~]#

Recovering root password

[root@mysql ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 3
Server version: 5.7.21 MySQL Community Server (GPL)
...
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> ALTER USER root@localhost IDENTIFIED WITH 'mysql_native_password' BY 'newpass';
ERROR 1290 (HY000): The MySQL server is running with the --skip-grant-tables option so it cannot execute this statement

mysql> UPDATE mysql.user SET plugin = 'mysql_native_password', authentication_string = PASSWORD('newpass') WHERE user='root';
Query OK, 1 row affected, 1 warning (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 1

Recovering root password

Very insecure right now! While skip-grant-tables is active, any password is accepted:

[root@mysql ~]# mysql -p123456
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
...
mysql>

Remove skip-grant-tables again from /etc/my.cnf

[root@mysql ~]# sed -i 's/skip-grant-tables//' /etc/my.cnf
[root@mysql ~]# service mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]
[root@mysql ~]# sed -i 's/password=dummypass/password=newpass/' ~/.my.cnf
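
The substitution s/skip-grant-tables// removes the text but leaves an empty line behind in the option file. That is harmless, but deleting the whole line keeps the file tidy. A minimal sketch on a scratch file (the contents are a cut-down assumption, and sed -i without a suffix assumes GNU sed):

```shell
#!/usr/bin/env bash
# Scratch option file standing in for /etc/my.cnf.
cnf="$(mktemp)"
printf '[mysqld]\nskip-grant-tables\ndatadir=/var/lib/mysql\n' > "$cnf"

# Delete the line instead of blanking it; the anchors make sure only an
# exact "skip-grant-tables" line is removed.
sed -i -e '/^skip-grant-tables$/d' "$cnf"

# Arithmetic expansion strips any padding wc may add to its output.
line_count=$(( $(wc -l < "$cnf") ))
echo "lines left in option file: $line_count"
```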

Great success!

[root@mysql ~]# cat ~/.my.cnf
[client]
password=newpass

[root@mysql ~]# mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.21 MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

Let’s check error log to make sure all is well

[root@mysql ~]# tail -50 /var/log/mysqld.log
...
2018-04-07T11:32:17.977681Z 0 [ERROR] Column count of performance_schema.events_waits_current is wrong. Expected 19, found 16. Created with MySQL 50551, now running 50721. Please use mysql_upgrade to fix this error.
2018-04-07T11:32:17.977791Z 0 [ERROR] Column count of performance_schema.events_waits_history is wrong. Expected 19, found 16. Created with MySQL 50551, now running 50721. Please use mysql_upgrade to fix this error.
2018-04-07T11:32:17.977851Z 0 [ERROR] Column count of performance_schema.events_waits_history_long is wrong. Expected 19, found 16. Created with MySQL 50551, now running 50721. Please use mysql_upgrade to fix this error.
...
2018-04-07T11:32:17.982462Z 0 [Note] /usr/libexec/mysql57/mysqld: ready for connections.
Version: '5.7.21' socket: '/var/lib/mysql/mysql.sock' port: 3306 MySQL Community Server (GPL)
2018-04-07T11:32:17.986730Z 0 [Note] InnoDB: Buffer pool(s) load completed at 180407 11:32:17

It’s running… but… not really in good shape...

mysql_upgrade

[root@mysql ~]# mysql_upgrade
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv OK
...
mysql.user OK
The sys schema is already up to date (version 1.5.1).
Checking databases.
sys.sys_config OK
Upgrade process completed successfully.
Checking if update is needed.
[root@mysql ~]#

No more errors!

[root@mysql ~]# service mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]

[root@mysql ~]# tail -50 /var/log/mysqld.log | grep -i error
[root@mysql ~]#

MySQL server has gone away

[root@mysql ~]# echo "SELECT '1234567890'" | mysql -N
1234567890

[root@mysql ~]# ( echo -n "SELECT '" ; for i in `seq 1 2` ; do echo -n "1234567890" ; done ; echo -n "'") | mysql -N
12345678901234567890

[root@mysql ~]# ( echo -n "SELECT '" ; for i in `seq 1 400000` ; do echo -n "1234567890" ; done ; echo -n "'") | mysql -N | wc
 1 1 4000001

[root@mysql ~]# ( echo -n "SELECT '" ; for i in `seq 1 450000` ; do echo -n "1234567890" ; done ; echo -n "'") | mysql -N | wc
ERROR 2006 (HY000) at line 1: MySQL server has gone away

max_allowed_packet

[root@mysql ~]# mysql -e "SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet'"
+--------------------+---------+
| Variable_name      | Value   |
+--------------------+---------+
| max_allowed_packet | 4194304 |
+--------------------+---------+

[root@mysql ~]# mysql -e "SET GLOBAL max_allowed_packet=5242880;"

[root@mysql ~]# mysql -e "SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet'"
+--------------------+---------+
| Variable_name      | Value   |
+--------------------+---------+
| max_allowed_packet | 5242880 |
+--------------------+---------+

[root@mysql ~]# ( echo -n "SELECT '" ; for i in `seq 1 450000` ; do echo -n "1234567890" ; done ; echo -n "'") | mysql -N | wc
 1 1 4500001
[root@mysql ~]#
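
The numbers explain why 400000 repetitions worked and 450000 did not. A rough sketch of the arithmetic, counting only the statement text and ignoring the few bytes of protocol overhead; the helper names query_bytes and fits are made up for illustration:

```shell
#!/usr/bin/env bash
# Size in bytes of the statement: SELECT '<repeats copies of 1234567890>'
query_bytes() {
  local repeats="$1"
  local wrapper="SELECT ''"   # the 9 fixed characters around the payload
  echo $(( repeats * 10 + ${#wrapper} ))
}

# Report whether a statement of the given size stays within a packet limit.
fits() {
  if (( $1 <= $2 )); then echo "fits"; else echo "too big"; fi
}

default_limit=$(( 4 * 1024 * 1024 ))   # 4194304, the value shown first
raised_limit=$(( 5 * 1024 * 1024 ))    # 5242880, the value after SET GLOBAL

small="$(query_bytes 400000)"
large="$(query_bytes 450000)"

echo "400000 repeats: $small bytes, $(fits "$small" "$default_limit") in 4M"
echo "450000 repeats: $large bytes, $(fits "$large" "$default_limit") in 4M"
echo "450000 repeats: $large bytes, $(fits "$large" "$raised_limit") in 5M"
```

Note that SET GLOBAL only affects new connections and is lost on restart unless the value is also written to the option file.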

Let’s simulate a mysqld crash

[root@mysql ~]# mysql -e "SELECT SLEEP(1000);" &
[1] 17979

[root@mysql ~]# kill -6 `pidof mysqld`
ERROR 2013 (HY000) at line 1: Lost connection to MySQL server during query

[root@mysql ~]# tail -n 100 /var/log/mysqld.log
...
08:21:39 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
...

Other reasons why “MySQL server has gone away” occurs

● Packet too big (max_allowed_packet)
● Server crashed (or killed)
  ○ OOM killer
  ○ Bugs
  ○ ...
● Session got terminated / killed
● Session timing out (wait_timeout)

Always check the logs:

● The MySQL error log is your best friend (log-error variable)
● dmesg, syslog or any core dumps may also contain info

Accidental deletes - ib_logfile

“I was low on disk space so I deleted some log files”

Not really a problem as long as you don’t restart MySQL

[root@mysql ~]# lsof | grep mysqld | grep ib_log
mysqld 30174 mysql 3uW REG 202,1 134217728 394960 /var/lib/mysql/ib_logfile0 (deleted)
mysqld 30174 mysql 8uW REG 202,1 134217728 394963 /var/lib/mysql/ib_logfile1 (deleted)

Recreated on restart

2018-04-14T01:21:34.032635Z 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 128 MB
2018-04-14T01:21:35.171245Z 0 [Note] InnoDB: Setting log file ./ib_logfile1 size to 128 MB
2018-04-14T01:21:37.273595Z 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0
2018-04-14T01:21:37.273646Z 0 [Warning] InnoDB: New log files created, LSN=125042225

… if MySQL was shut down cleanly and innodb_fast_shutdown is not set to 2!

https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_fast_shutdown

Accidental deletes - ib_logfile

“I was low on disk space so I deleted some log files”

When MySQL was not shut down cleanly (or crashed)

2018-04-14T01:24:45.872244Z 0 [ERROR] InnoDB: Your database may be corrupt or you may have copied the InnoDB tablespace but not the InnoDB log files. Please refer to http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html for information about forcing recovery.
…
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/refman/5.7/en/forcing-innodb-recovery.html
InnoDB: about forcing recovery.

Be prepared to recover from backup...

Accidental deletes - ib_logfile

“I was low on disk space so I deleted some log files”

Not really a problem as long as you don’t restart MySQL

[root@mysql ~]# lsof | grep mysqld | grep ib_log
mysqld 30174 mysql 3uW REG 202,1 134217728 394960 /var/lib/mysql/ib_logfile0 (deleted)
mysqld 30174 mysql 8uW REG 202,1 134217728 394963 /var/lib/mysql/ib_logfile1 (deleted)

It’s recoverable

[root@mysql ~]# ls -hal /proc/`pidof mysqld`/fd | grep ib_log
lrwx------ 1 root root 64 Apr 14 01:34 3 -> /var/lib/mysql/ib_logfile0 (deleted)
lrwx------ 1 root root 64 Apr 14 01:34 8 -> /var/lib/mysql/ib_logfile1 (deleted)

mysql> FLUSH TABLES WITH READ LOCK;
< leave session open and monitor SHOW ENGINE INNODB STATUS >
< wait until all logs are flushed >

[root@mysql ~]# cp /proc/`pidof mysqld`/fd/3 /var/lib/mysql/ib_logfile0
[root@mysql ~]# cp /proc/`pidof mysqld`/fd/8 /var/lib/mysql/ib_logfile1
[root@mysql ~]# chown mysql:mysql /var/lib/mysql/ib_logfile*

< Cleanly restart MySQL >

[root@mysql ~]# ls -hl /var/lib/mysql/ib_logfile*
-rw-r----- 1 mysql mysql 128M Apr 14 01:35 /var/lib/mysql/ib_logfile0
-rw-r----- 1 mysql mysql 128M Apr 14 01:35 /var/lib/mysql/ib_logfile1
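
The recovery works because deleting a file only removes its name; the data stays reachable through any descriptor that is still open. The mechanism can be tried safely without MySQL. A Linux-only sketch in which the shell itself plays the role of mysqld and a scratch file plays the role of the redo log (file name and contents are made up):

```shell
#!/usr/bin/env bash
# Linux-only sketch: relies on /proc/<pid>/fd, as the slide does.
work="$(mktemp -d)"
echo "redo log contents" > "$work/ib_logfile0"

# Hold the file open on descriptor 3, the way mysqld holds its log files open.
exec 3< "$work/ib_logfile0"

# "Accidental" delete: the name is gone, the open descriptor is not.
rm "$work/ib_logfile0"

# Copy the data back out through /proc ($$ is this shell's pid),
# then release the descriptor.
cp "/proc/$$/fd/3" "$work/ib_logfile0"
exec 3<&-

recovered="$(cat "$work/ib_logfile0")"
echo "recovered: $recovered"
```

With a live mysqld the copy is only consistent once writes have stopped, which is why the slide takes the read lock and waits for the logs to be flushed first.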

Accidental deletes - ibdata files

● ibdata files (ibdata1 or *.ibd in the data dir) contain your data
● They can be recovered in the same way as the ib_logfile, but...
● ...DON’T delete them! Really!
Replication troubleshooting

AGENDA
● Replication concepts
○ SBR/RBR
○ GTID
○ Replication threads
● Basic troubleshooting
○ Broken replication
○ Validate replication environment
● Advanced replication
○ Replay events

Replication concepts

● Binlog
  ○ Sequential
  ○ Committed transactions
  ○ Statement format
  ○ Row format
● Replication
  ○ Slave retrieves transactions from the master binlog into the relay log
  ○ Slave applies transactions from the relay log
● GTID
  ○ Unique identifier for each transaction
● Replication threads
  ○ Io_thread and sql_thread

Please execute the following commands

[demo-user@mysql ~]$ ./start_environment.sh

[demo-user@mysql ~]$ ./replication_step_1.sh

Execution results

[demo-user@mysql ~]$ ./start_environment.sh
# executing 'start' on /home/demo-user/sandboxes/gtid-repl
executing 'start' on master. sandbox server started
executing 'start' on slave 1. sandbox server started
# executing 'start' on /home/demo-user/sandboxes/normal-repl
executing 'start' on master. sandbox server started
executing 'start' on slave 1. sandbox server started
[demo-user@mysql ~]$ ./replication_step_1.sh
sleep(1)
0

Replication tools

● dbdeployer
  ○ https://github.com/datacharmer/dbdeployer
  ○ created by Giuseppe Maxia (The Data Charmer)
  ○ rewrite of MySQL Sandbox in Go
  ○ not for production instances
  ○ allows you to quickly set up test instances:
    ■ no root access required
    ■ supports replication topologies
    ■ supports GTID

dbdeployer

[root@mysql ~]# dbdeployer
dbdeployer makes MySQL server installation an easy task.
Runs single, multiple, and replicated sandboxes.

Usage: dbdeployer [command]

Available Commands:
  admin       sandbox management tasks
  defaults    tasks related to dbdeployer defaults
  delete      delete an installed sandbox
  deploy      deploy sandboxes
  global      Runs a given command in every sandbox
  help        Help about any command
  sandboxes   List installed sandboxes
  unpack      unpack a tarball into the binary directory
  usage       Shows usage of installed sandboxes
  versions    List available versions

Flags:
      --config string           configuration file (default "/root/.dbdeployer/config.json")
  -h, --help                    help for dbdeployer
      --sandbox-binary string   Binary repository (default "/root/opt/mysql")
      --sandbox-home string     Sandbox deployment direcory (default "/root/sandboxes")
      --version                 version for dbdeployer

Use "dbdeployer [command] --help" for more information about a command.

dbdeployer

[demo-user@mysql ~]$ dbdeployer versions
Basedir: /home/demo-user/opt/mysql
5.7.21

[demo-user@mysql ~]$ dbdeployer sandboxes
gtid-repl : master-slave 5.7.21 [16747 16748]
normal-repl : master-slave 5.7.21 [16743 16744]

[demo-user@mysql ~]$ dbdeployer global status
# Running "status_all" on gtid-repl
REPLICATION /home/demo-user/sandboxes/gtid-repl
master : master off - (16747)
node1 : node1 off - (16748)

# Running "status_all" on normal-repl
REPLICATION /home/demo-user/sandboxes/normal-repl
master : master off - (16743)
node1 : node1 off - (16744)

[demo-user@mysql ~]$ dbdeployer global start

dbdeployer

[demo-user@mysql ~]$ cd sandboxes/normal-repl/
[demo-user@mysql normal-repl]$ ls -hl
total 76K
-rwxr--r-- 1 demo-user demo-user 1.4K Apr 8 06:53 check_slaves
-rwxr--r-- 1 demo-user demo-user 993 Apr 8 06:53 clear_all
-rwxr--r-- 1 demo-user demo-user 1.3K Apr 8 06:53 initialize_slaves
-rwxr--r-- 1 demo-user demo-user 807 Apr 8 06:53 m
drwxr-xr-x 4 demo-user demo-user 4.0K Apr 8 06:53 master
-rwxr--r-- 1 demo-user demo-user 807 Apr 8 06:53 n1
-rwxr--r-- 1 demo-user demo-user 805 Apr 8 06:53 n2
drwxr-xr-x 4 demo-user demo-user 4.0K Apr 8 06:53 node1
-rwxr--r-- 1 demo-user demo-user 839 Apr 8 06:53 restart_all
-rwxr--r-- 1 demo-user demo-user 805 Apr 8 06:53 s1
-rw-rw-r-- 1 demo-user demo-user 169 Apr 8 06:53 sbdescription.json
-rwxr--r-- 1 demo-user demo-user 982 Apr 8 06:53 send_kill_all
-rwxr--r-- 1 demo-user demo-user 1.1K Apr 8 06:53 start_all
-rwxr--r-- 1 demo-user demo-user 1.2K Apr 8 06:53 status_all
-rwxr--r-- 1 demo-user demo-user 956 Apr 8 06:53 stop_all
-rwxr--r-- 1 demo-user demo-user 4.5K Apr 8 06:53 test_replication
-rwxr--r-- 1 demo-user demo-user 1.1K Apr 8 06:53 test_sb_all
-rwxr--r-- 1 demo-user demo-user 978 Apr 8 06:53 use_all

dbdeployer

[demo-user@mysql normal-repl]$ ./m
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.7.21-log MySQL Community Server (GPL)
...
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

master [localhost] {msandbox} ((none)) >

[demo-user@mysql normal-repl]$ ./s1
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.21-log MySQL Community Server (GPL)
...
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

slave1 [localhost] {msandbox} ((none)) >
- Show slave status
Basic Replication Troubleshooting
slave1 [localhost] {root} ((none)) > show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 16743
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 5141
Relay_Log_File: mysql-relay.000002
Relay_Log_Pos: 476
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 5141
Relay_Log_Space: 679
Seconds_Behind_Master: 0
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Master_Server_Id: 100
Master_UUID: 00016743-1111-1111-1111-111111111111
Master_Info_File: /home/vagrant/sandboxes/pl18/node1/data/master.info
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
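The fields that matter most in this output are Slave_IO_Running, Slave_SQL_Running and Seconds_Behind_Master. As a minimal sketch, the helper below condenses a `show slave status\G` dump into one line. The name slave_health is hypothetical (it is not part of MySQL or dbdeployer), and it assumes the usual `Key: value` layout on stdin.

```shell
# Hypothetical helper: summarize replication health from the text of
# `show slave status\G` read on stdin.
slave_health() {
  awk -F': ' '
    { key = $1; gsub(/^[ \t]+/, "", key) }      # strip the left padding
    key == "Slave_IO_Running"      { io = $2 }
    key == "Slave_SQL_Running"     { sql = $2 }
    key == "Seconds_Behind_Master" { lag = $2 }
    END {
      # Both threads must report Yes for replication to be healthy.
      state = (io == "Yes" && sql == "Yes") ? "OK" : "BROKEN"
      printf "%s io=%s sql=%s lag=%s\n", state, io, sql, lag
    }
  '
}
```

Possible usage, assuming the sandbox wrapper forwards its arguments to the mysql client: `./s1 -e 'show slave status\G' | slave_health`.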
- Show master status
- Show slave hosts
Basic Replication Troubleshooting
master [localhost] {root} ((none)) > show master status\G
*************************** 1. row ***************************
File: mysql-bin.000001
Position: 5141
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set:
1 row in set (0.00 sec)
master [localhost] {root} ((none)) > show slave hosts\G
*************************** 1. row ***************************
Server_id: 300
Host:
Port: 16745
Master_id: 100
Slave_UUID: 00016745-3333-3333-3333-333333333333
*************************** 2. row ***************************
Server_id: 200
Host:
Port: 16744
Master_id: 100
Slave_UUID: 00016744-2222-2222-2222-222222222222
2 rows in set (0.00 sec)
- Error log
- Replication catalog tables
Basic Replication Troubleshooting
2018-04-17T11:41:11.568378Z 19 [Note] 'CHANGE MASTER TO FOR CHANNEL '' executed'. Previous state master_host='127.0.0.1', master_port= 16743, master_log_file='mysql-bin.000001', master_log_pos= 5141, master_bind=''. New state master_host='127.0.0.1', master_port= 16743, master_log_file='mysql-bin.000001', master_log_pos= 4985, master_bind=''.
2018-04-17T11:41:17.104913Z 31 [Warning] Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
slave1 [localhost] {root} (mysql) > select * from slave_master_info\G
*************************** 1. row ***************************
Number_of_lines: 25
Master_log_name: mysql-bin.000003
Master_log_pos: 1507
Host: 127.0.0.1
User_name: rsandbox
User_password: rsandbox
Port: 16743
Connect_retry: 60
...
- Connectivity
  - Verify access
  - Check master configuration
Basic Replication Troubleshooting
slave1 [localhost] {root} ((none)) > show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Connecting to master
Error log:
2018-04-19T19:50:44.486470Z 8 [ERROR] Slave I/O for channel '': error connecting to master 'rsandbox@127.0.0.1:3306' - retry-time: 60 retries: 1, Error_code: 2003
Slave:
[demo-user@mysql node1]$ nc -vw 10 127.0.0.1 3306
nc: connect to 127.0.0.1 port 3306 (tcp) failed: Connection refused
Master:
show global variables like 'port';
- Connectivity (cont)
  - Fix the port
Basic Replication Troubleshooting
slave1 [localhost] {root} ((none)) > stop slave io_thread;
Query OK, 0 rows affected (0.00 sec)
slave1 [localhost] {root} ((none)) > change master to master_port=16743;
Query OK, 0 rows affected (0.00 sec)
slave1 [localhost] {root} ((none)) > start slave io_thread;
Query OK, 0 rows affected (0.00 sec)
- Wrong privileges/credentials
Basic Replication Troubleshooting
Error log
2018-04-19T20:48:35.674488Z 26 [ERROR] Slave I/O for channel '': error connecting to master 'rsandbox@127.0.0.1:16743' - retry-time: 60 retries: 1, Error_code: 1045
- Wrong privileges/credentials (cont)
Basic Replication Troubleshooting
[demo-user@mysql normal-repl]$ ./master/use -u rsandbox -prsandbox
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'rsandbox'@'localhost' (using password: YES)
master [localhost] {root} ((none)) > select user,host,authentication_string from mysql.user where user='rsandbox';
+----------+-------+-------------------------------------------+
| user     | host  | authentication_string                     |
+----------+-------+-------------------------------------------+
| rsandbox | 127.% | *B07EB15A2E7BD9620DAE47B194D5B9DBA14377AD |
+----------+-------+-------------------------------------------+
1 row in set (0.00 sec)
master [localhost] {root} ((none)) > select password('rsandbox');
+-------------------------------------------+
| password('rsandbox')                      |
+-------------------------------------------+
| *B07EB15A2E7BD9620DAE47B194D5B9DBA14377AD |
+-------------------------------------------+
1 row in set, 1 warning (0.00 sec)
- Wrong privileges/credentials (cont)
Basic Replication Troubleshooting
master [localhost] {root} ((none)) > show grants for rsandbox@'127.%';
+------------------------------------------------------+
| Grants for rsandbox@127.%                            |
+------------------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO 'rsandbox'@'127.%' |
+------------------------------------------------------+
1 row in set (0.00 sec)
master [localhost] {root} ((none)) > select user,host from mysql.user;
+---------------+-----------+
| user          | host      |
+---------------+-----------+
| msandbox      | 127.%     |
| msandbox_ro   | 127.%     |
| msandbox_rw   | 127.%     |
| rsandbox      | 127.%     |
|               | 127.0.0.1 |
| msandbox      | localhost |
...
- Wrong privileges/credentials (cont)
Basic Replication Troubleshooting
master [localhost] {root} ((none)) > drop user ''@'127.0.0.1';
Query OK, 0 rows affected (0.00 sec)
- Server id
Basic Replication Troubleshooting
2018-04-19T21:02:22.375910Z 30 [ERROR] Slave I/O for channel '': Got fatal error 1236 from master when reading data from binary log: 'Misconfigured master - master server_id is 0', Error_code: 1236
- Server id (cont)
Basic Replication Troubleshooting
master [localhost] {root} ((none)) > set global server_id=100;
Query OK, 0 rows affected (0.00 sec)
slave1 [localhost] {root} ((none)) > start slave io_thread;
Query OK, 0 rows affected (0.00 sec)
slave1 [localhost] {root} ((none)) > show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 127.0.0.1
Master_User: rsandbox
Master_Port: 16743
...
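The three I/O thread failures seen so far each show a distinct Error_code in the slave error log: 2003, 1045 and 1236. As a rough sketch, a lookup like the one below can point at where to look first. The function name classify_io_error and the hint texts are assumptions made for illustration; they only summarize the cases walked through above.

```shell
# Hypothetical helper: map an I/O thread error code to a first thing to check.
classify_io_error() {
  case "$1" in
    2003) echo "connectivity: check master host and port" ;;
    1045) echo "credentials: check replication user, password and host" ;;
    1236) echo "binlog: check master server_id and binlog file/position" ;;
    0)    echo "no I/O error reported" ;;
    *)    echo "unknown: read the slave error log" ;;
  esac
}
```

For example, `classify_io_error 1045` points at the credentials case, where the anonymous ''@'127.0.0.1' user was shadowing rsandbox.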
- Skipping events/Filtering replication
Basic Replication Troubleshooting
Last_SQL_Error: Could not execute Write_rows event on table pl18.test_table; Duplicate entry '5' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000002, end_log_pos 2748
- Skipping events/Filtering replication (cont)
  - Diagnose issue
  - Decide strategy
    - Data from master is good (recommended)
    - Data from slave is good (not recommended)
Basic Replication Troubleshooting
slave1 [localhost] {root} ((none)) > select * from pl18.test_table where ident=5;
+-------+--------------------------+---------------------+
| ident | text                     | timestamp           |
+-------+--------------------------+---------------------+
|     5 | Good data, do not remove | 2018-04-21 08:17:03 |
+-------+--------------------------+---------------------+
1 row in set (0.00 sec)
- Skipping events/Filtering replication (cont)
Basic Replication Troubleshooting
[demo-user@mysql node1]$ mysqlbinlog --base64-output=decode-rows --verbose --start-position 2704 --stop-position 2992 /home/demo-user/sandboxes/normal-repl/node1/data/mysql-relay.000002
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 2704
#180421 8:21:09 server id 0 end_log_pos 2556 CRC32 0x2601842e Anonymous_GTID last_committed=9 sequence_number=10 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= 'ANONYMOUS'/*!*/;
# at 2769
#180421 8:21:09 server id 0 end_log_pos 2628 CRC32 0x4b9b9f85 Query thread_id=14 exec_time=0 error_code=0
SET TIMESTAMP=1524298869/*!*/;
SET @@session.pseudo_thread_id=14/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
- Skipping events/Filtering replication (cont)
Basic Replication Troubleshooting
BEGIN
/*!*/;
# at 2841
#180421 8:21:09 server id 0 end_log_pos 2686 CRC32 0x7ccbff02 Table_map: `pl18`.`test_table` mapped to number 109
# at 2899
#180421 8:21:09 server id 0 end_log_pos 2748 CRC32 0x36e491e1 Write_rows: table id 109 flags: STMT_END_F
### INSERT INTO `pl18`.`test_table`
### SET
###   @1=5
###   @2='3VHQ8D28CH1X'
###   @3=1524298869
# at 2961
#180421 8:21:09 server id 0 end_log_pos 2779 CRC32 0x73cb18aa Xid = 82
COMMIT/*!*/;
SET @@SESSION.GTID_NEXT= 'AUTOMATIC' /* added by mysqlbinlog */ /*!*/;
DELIMITER ;
# End of log file
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/;
- Skipping events/Filtering replication (cont)
Basic Replication Troubleshooting
slave1 [localhost] {root} ((none)) > set global sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.00 sec)
slave1 [localhost] {root} ((none)) > start slave;
Query OK, 0 rows affected (0.01 sec)
- Skipping events/Filtering replication (cont)
Basic Replication Troubleshooting
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
In my.cnf
slave-skip-errors = <error_code>,<error_code>
--slave-skip-errors=1062,1053
--slave-skip-errors=all
--slave-skip-errors=ddl_exist_errors
- And now for something completely different:
Basic Replication Troubleshooting
Please execute the following script:
./replication_step_2.sh
- Relay log corruption
Basic Replication Troubleshooting
Last_Error: Relay log read failure: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
Relay_Log_File: mysql-relay.000002
- Relay log corruption (cont)
Basic Replication Troubleshooting
slave1 [localhost] {root} ((none)) > show relaylog events;
+--------------------+-----+----------------+-----------+-------------+---------------------------------------+
| Log_name           | Pos | Event_type     | Server_id | End_log_pos | Info                                  |
+--------------------+-----+----------------+-----------+-------------+---------------------------------------+
| mysql-relay.000001 |   4 | Format_desc    |       200 |         123 | Server ver: 5.7.21-log, Binlog ver: 4 |
| mysql-relay.000001 | 123 | Previous_gtids |       200 |         154 |                                       |
| mysql-relay.000001 | 154 | Rotate         |       200 |         203 | mysql-relay.000002;pos=4              |
+--------------------+-----+----------------+-----------+-------------+---------------------------------------+
3 rows in set (0.00 sec)
- Relay log corruption (cont)
Basic Replication Troubleshooting
slave1 [localhost] {root} ((none)) > change master to master_log_file='mysql-bin.000002', master_log_pos=595133;
Query OK, 0 rows affected (0.02 sec)
The value for master_log_file is Relay_Master_Log_File from show slave status
The value for master_log_pos is Exec_Master_Log_Pos from show slave status
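Picking the two values out of show slave status by hand is error-prone, so the statement can be generated instead. The sketch below is only an illustration: reposition_stmt is a hypothetical name, it expects the `show slave status\G` text on stdin, and it only prints the statement. Replication still has to be stopped (stop slave) before running the printed statement.

```shell
# Hypothetical helper: print a CHANGE MASTER statement that re-points the
# slave at the last position it executed (Relay_Master_Log_File and
# Exec_Master_Log_Pos), so the damaged relay logs are fetched again.
reposition_stmt() {
  awk -F': ' '
    { key = $1; gsub(/^[ \t]+/, "", key) }
    key == "Relay_Master_Log_File" { file = $2 }
    key == "Exec_Master_Log_Pos"   { pos = $2 }
    END {
      if (file == "" || pos == "") exit 1    # refuse to guess
      # \047 is a single quote
      printf "CHANGE MASTER TO MASTER_LOG_FILE=\047%s\047, MASTER_LOG_POS=%s;\n", file, pos
    }
  '
}
```

The function exits non-zero when either field is missing, so a truncated status dump does not produce a half-filled statement.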
- Change position/master
Basic Replication Troubleshooting
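[Diagram: current topology with master A and slaves B, C and D; target topology with master B and slaves C, D and A]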
- Change position/master (cont)
Basic Replication Troubleshooting
[Diagram: intermediate topology, A replicates to B and B replicates to C and D]
Make sure log_slave_updates is enabled on B.
- Change position/master (cont)
Basic Replication Troubleshooting
For each slave to reposition:
1. Stop replication in that slave.
2. Stop sql_thread in the (future) new master.
3. Retrieve the current binlog position in the future new master.
slave1 [localhost] {root} ((none)) > show master status;
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000001 |      454 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+
4. Retrieve the relative position in the future new master using show slave status. The relevant fields are (again) Relay_Master_Log_File and Exec_Master_Log_Pos. Now you can restart replication in the future new master.
5. Restart replication in the slave to move, but only up to the position retrieved in the previous step: start slave sql_thread until master_log_file=<log_file>, master_log_pos=<position>;
6. Once the slave has reached that position, issue a change master to master_host=<new_master>, master_log_file=<file_from_step3>, master_log_pos=<position_from_step3>;
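Steps 5 and 6 can be sketched as a small generator that prints the SQL to run on the slave being moved. This is an illustration under assumptions, not the presenters' script: move_slave_sql and its argument order are hypothetical, and it uses the plain `START SLAVE UNTIL` form, which starts both threads so the I/O thread can still fetch events missing from the relay log (the slide's form starts only the SQL thread).

```shell
# Hypothetical helper: print the statements for steps 5 and 6.
#   $1 $2  file and position from step 4 (where the future master stopped)
#   $3 $4  host and port of the future master
#   $5 $6  file and position from step 3 (show master status on the future master)
move_slave_sql() {
  cat <<EOF
START SLAVE UNTIL MASTER_LOG_FILE='$1', MASTER_LOG_POS=$2;
-- wait until SHOW SLAVE STATUS reports Exec_Master_Log_Pos = $2, then:
STOP SLAVE;
CHANGE MASTER TO MASTER_HOST='$3', MASTER_PORT=$4, MASTER_LOG_FILE='$5', MASTER_LOG_POS=$6;
START SLAVE;
EOF
}
```

Printing the statements rather than executing them leaves room to review each one, and to confirm the slave really stopped at the expected position, before changing its master.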
- Change position/master (cont)
Basic Replication Troubleshooting
Once all the slaves point to the future new master, we need to promote it to the master role:
1. Make the current master read_only to make sure no more changes are written. Retrieve its current position with show master status.
2. Once the future master has reached that position, retrieve its own position with show master status.
3. Make the new master stop replicating by issuing stop slave and then reset slave all. Disable read_only.
4. Execute a change master in the former master with all the required parameters.
- GTID replication
  - No explicit position
  - Each transaction has a unique identifier (universally unique)
    - Identifies the origin server across the whole replication chain
    - Identifies the transaction sequence (only committed transactions have a transaction sequence). No gaps in the sequence.
  - Slaves have a record of transactions executed and transactions missing.
  - No master file or position is required
- PROS
  - No more position/file needed
- CONS
  - GTIDs are not purged
  - Any transaction executed will have a GTID associated with it, and it may be replicated at any time in the future.
Basic Replication Troubleshooting
- GTID replication (cont)
Basic Replication Troubleshooting
./sandboxes/gtid-repl/node1/use -u root
Could not execute Write_rows event on table pl18.test_table; Duplicate entry '5' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.000004, end_log_pos 3410
- GTID replication (cont)
Basic Replication Troubleshooting
Skipping a replication event in GTID replication requires injecting an empty transaction into the replication flow.
Fixing the slave can also have side effects: the fix issues a transaction, which creates a new GTID originating on the slave. That GTID could be replicated to sub-slaves, or to other slaves if this server is later promoted to master.
- GTID replication (cont)
Basic Replication Troubleshooting
Skipping a replication event in GTID replication requires injecting an empty transaction into the replication flow.
Show slave status:
Executed_Gtid_Set: 00016747-1111-1111-1111-111111111111:1-28
- GTID replication (cont)
Basic Replication Troubleshooting
mysqlbinlog --base64-output=decode-rows --verbose --start-position 2495 /home/demo-user/sandboxes/gtid-repl/node1/data/mysql-relay.000012 | less
Get the start position from show slave status:
Relay_Log_File: mysql-relay.000012
Relay_Log_Pos: 2495
- GTID replication (cont)
Basic Replication Troubleshooting
[demo-user@mysql ~]$ mysqlbinlog --base64-output=decode-rows --verbose --start-position 2495 /home/demo-user/sandboxes/gtid-repl/node1/data/mysql-relay.000012 | head -100
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
DELIMITER /*!*/;
# at 2495
#180423 15:17:33 server id 100 end_log_pos 3218 CRC32 0x1c670834 GTID last_committed=12 sequence_number=13 rbr_only=yes
/*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/;
SET @@SESSION.GTID_NEXT= '00016747-1111-1111-1111-111111111111:29'/*!*/;
# at 2560
#180423 15:17:33 server id 100 end_log_pos 3290 CRC32 0xfcd836ef Query thread_id=10 exec_time=0 error_code=0
SET TIMESTAMP=1524496653/*!*/;
SET @@session.pseudo_thread_id=10/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=1, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1436549152/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8 *//*!*/;
SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=8/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
BEGIN
/*!*/;
# at 2632
#180423 15:17:33 server id 100 end_log_pos 3348 CRC32 0x8245d434 Table_map: `pl18`.`test_table` mapped to number 109
# at 2690
#180423 15:17:33 server id 100 end_log_pos 3410 CRC32 0x1a04c47d Write_rows: table id 109 flags: STMT_END_F
### INSERT INTO `pl18`.`test_table`
### SET
###   @1=5
###   @2='2SLM045U1S5K'
###   @3=1524496653
# at 2752
- GTID replication (cont)
Basic Replication Troubleshooting
STOP SLAVE;
SET GTID_NEXT= '00016747-1111-1111-1111-111111111111:29';
BEGIN;
COMMIT;
SET GTID_NEXT="AUTOMATIC";
START SLAVE;
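The GTID to skip does not have to be copied by hand from the mysqlbinlog output. As a sketch under assumptions, the two helpers below (first_gtid and empty_trx_sql are hypothetical names) take the first GTID_NEXT found in decoded relay log text and print the empty-transaction sequence for it. They only print SQL; confirm that the GTID really is the failing transaction before running it.

```shell
# Hypothetical helper: print the first GTID set via GTID_NEXT in decoded
# mysqlbinlog output read on stdin (ANONYMOUS and AUTOMATIC are ignored).
first_gtid() {
  sed -n "s/^SET @@SESSION.GTID_NEXT= *'\([0-9a-fA-F-]*:[0-9]*\)'.*/\1/p" | head -n 1
}

# Hypothetical helper: print the statements that commit an empty
# transaction under the given GTID, so the slave treats it as executed.
empty_trx_sql() {
  cat <<EOF
STOP SLAVE;
SET GTID_NEXT='$1';
BEGIN; COMMIT;
SET GTID_NEXT='AUTOMATIC';
START SLAVE;
EOF
}
```

Possible usage: `mysqlbinlog --base64-output=decode-rows --verbose --start-position 2495 <relay_log> | first_gtid`, then pass the result to empty_trx_sql.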
- GTID replication (cont)
Basic Replication Troubleshooting
When fixing data directly on the slave, run set sql_log_bin=FALSE in your session first, so that changes you do not want to replicate are not written to the slave's binary log.
- Diagnose replication inconsistencies.
Advanced Replication Troubleshooting
pt-table-checksum --replicate=percona.checksums --ignore-databases mysql,sys,percona h=127.0.0.1,u=msandbox,p=msandbox,P=16743 --recursion-method dsn=h=127.0.0.1,P=16743,D=percona,t=dsns --nocheck-binlog-format
- Diagnose replication inconsistencies (cont.)
Advanced Replication Troubleshooting
[demo-user@mysql node1]$ pt-table-checksum --replicate=percona.checksums --ignore-databases mysql,sys,percona h=127.0.0.1,u=msandbox,p=msandbox,P=16743 --recursion-method dsn=h=127.0.0.1,P=16743,D=percona,t=dsns --nocheck-binlog-format
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
04-23T16:06:58      0      1     4429       4       0   0.060 pl18.test_table
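With many tables the report gets long, and only rows with a non-zero DIFFS column need attention. A minimal sketch of a filter follows; tables_with_diffs is a hypothetical name, and it assumes the column order shown above, with DIFFS third and the table name last.

```shell
# Hypothetical helper: list tables whose DIFFS column is greater than zero
# in pt-table-checksum output read on stdin.
tables_with_diffs() {
  # Data rows start with a MM-DDThh:mm:ss timestamp; headers do not.
  awk '$1 ~ /^[0-9][0-9]-[0-9][0-9]T/ && $3 > 0 { print $NF }'
}
```

Possible usage: append `| tables_with_diffs` to the pt-table-checksum command above.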
- Diagnose replication inconsistencies (cont.)
Advanced Replication Troubleshooting
SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM percona.checksums
WHERE (
  master_cnt <> this_cnt
  OR master_crc <> this_crc
  OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db, tbl;
+------+------------+------------+--------+
| db   | tbl        | total_rows | chunks |
+------+------------+------------+--------+
| pl18 | test_table |       1000 |      1 |
+------+------------+------------+--------+
- Fix replication inconsistencies.
Advanced Replication Troubleshooting
[demo-user@mysql ~]$ pt-table-sync --dry-run --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox
[demo-user@mysql ~]$ pt-table-sync --print --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox
- Fix replication inconsistencies.
Advanced Replication Troubleshooting
[demo-user@mysql ~]$ pt-table-sync --dry-run --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox
# NOTE: --dry-run does not show if data needs to be synced because it
#       does not access, compare or sync data. --dry-run only shows
#       the work that would be done.
# Syncing via replication P=16744,h=127.0.0.1,p=...,u=msandbox in dry-run mode, without accessing or comparing data
# DELETE REPLACE INSERT UPDATE ALGORITHM START    END      EXIT DATABASE.TABLE
#      0       0      0      0 Chunk     16:23:58 16:23:58 0    pl18.test_table
[demo-user@mysql ~]$ pt-table-sync --print --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox
REPLACE INTO `pl18`.`test_table`(`ident`, `text`, `timestamp`) VALUES ('5', '3VHQ8D28CH1X', '2018-04-21 08:21:09') /*percona-toolkit src_db:pl18 src_tbl:test_table src_dsn:P=16743,h=127.0.0.1,p=...,u=msandbox dst_db:pl18 dst_tbl:test_table dst_dsn:P=16744,h=127.0.0.1,p=...,u=msandbox lock:1 transaction:1 changing_src:percona.checksums replicate:percona.checksums bidirectional:0 pid:15150 user:demo-user host:mysql*/;
- Fix replication inconsistencies.
Advanced Replication Troubleshooting
pt-table-sync --execute --replicate percona.checksums --sync-to-master h=127.0.0.1,P=16744,u=msandbox,p=msandbox
[demo-user@mysql ~]$ pt-table-checksum --replicate=percona.checksums --ignore-databases mysql,sys,percona h=127.0.0.1,u=msandbox,p=msandbox,P=16743 --recursion-method dsn=h=127.0.0.1,P=16743,D=percona,t=dsns --nocheck-binlog-format
Checking if all tables can be checksummed ...
Starting checksum ...
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
04-23T16:30:18      0      0     4429       4       0   0.063 pl18.test_table
- Make your life easier: orchestrator
  - https://github.com/github/orchestrator
  - Web interface, command line and Web API.
Advanced Replication Troubleshooting
More advanced topics and tools
AGENDA
● System bottlenecks
○ Verify OS metrics
○ Run diagnostics
● MySQL bottlenecks
○ MySQL Tools
○ External tools
● Configuration
○ Dynamic variables
○ Static variables
● Current system status
○ Load
○ Swapping
○ NUMA
○ I/O wait
● Trends
○ Memory usage
○ CPU usage
○ Disk usage
○ Network usage
Bottlenecks Explained
● Has anything changed recently?
● Application updates?
● Database updates?
■ Schema updates
■ Configuration updates
● Hardware failures or updates?
■ Disk failures
■ Temperature warnings
■ Memory errors
● OS changes
■ Patches or updates installed
■ New packages installed
Where to look at first?
Graphs? Graphs! Graphs!!!
● Has traffic increased?
● Has disk activity increased?
● Check table growth
● Check memory consumption
○ Swap usage
○ Memory leaks
○ Buffer Pool Size and overhead
Try to correlate events using graphs!
Finding issues over time
● ps (processlist)
● top / htop
● vmstat
● iostat
● lsof
● dmesg
● ifstat
● sar
● strace
● numactl
Diagnose OS
[root@mysql ~]# ps
  PID TTY          TIME CMD
28460 pts/0    00:00:00 sudo
28461 pts/0    00:00:00 su
28462 pts/0    00:00:00 bash
28556 pts/0    00:00:00 ps
[root@mysql ~]# ps -ef | grep mysql
mysql 17999 24982 0 Apr08 ? 00:01:49 /usr/libexec/mysql57/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql57/plugin --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
root 24982 1 0 Apr07 ? 00:00:00 /bin/sh /usr/libexec/mysql57/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=mysql
root 28558 28462 0 22:53 pts/0 00:00:00 grep --color=auto mysql
Diagnose OS - ps (processlist)
[root@mysql ~]# top
top - 22:57:43 up 6 days, 13:10, 1 user, load average: 0.00, 0.00, 0.00
Tasks: 81 total, 1 running, 80 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1017060k total, 855112k used, 161948k free, 139948k buffers
Swap: 0k total, 0k used, 0k free, 456684k cached
  PID USER PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
    1 root 20  0 19648 2492 2164 S  0.0  0.2 0:04.31 init
    2 root 20  0     0    0    0 S  0.0  0.0 0:00.00 kthreadd
● Press 1 to show each CPU core separately
● Press u to filter by a specific user (for example mysql)
● Press H to show individual threads
● Press < or > to move the sort column
Diagnose OS - top
Same functionality as top, but a little “fancier”
Diagnose OS - htop
[root@mysql ~]# vmstat 1 10
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff    cache si so    bi     bo   in   cs us sy id wa st
 2  0 181528 165412 163508 15768200  0  0   115    304    0    1  0  0 97  2  0
 0  3 181528 167072 163216 15768148  0  0 19072 363636 1013  544  3  1 75 21  0
 1  3 181528 173528 162864 15761096  0  0 24576 153132 1005 1533  4  1 70 25  0
 0  4 181528 178924 162592 15754912  0  0 21760 206336  974  719  4  1 67 28  0
 0  4 181528 168020 162476 15767500  0  0 24192 106496  740  420  4  0 75 21  0
 0  4 181528 154252 162476 15780900  0  0 12928 206848  711  478  2  0 75 23  0
 0  4 181528 173656 162204 15761588  0  0  5888 116844  518  384  1  0 71 27  0
 0  4 181528 165524 162204 15770332  0  0  8576 176128  542  382  2  0 75 23  0
 0  4 181528 157152 162204 15778872  0  0  8576  96256  415  269  2  0 75 23  0
 0  3 181528 173116 161952 15761288  0  0 10112  12032  466  355  2  0 75 23  0
Diagnose OS - vmstat
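In a capture like this the wa column (I/O wait) is the one to watch. As a minimal sketch, the helper below averages it over a vmstat run. The name avg_iowait is hypothetical, and it assumes the 17-column layout shown above, where wa is field 16. The first sample is skipped because vmstat reports averages since boot on its first line.

```shell
# Hypothetical helper: average the wa (I/O wait) column of vmstat output
# read on stdin, ignoring headers and the first since-boot sample.
avg_iowait() {
  awk '
    $1 ~ /^[0-9]+$/ {
      n++
      if (n == 1) next          # first sample: averages since boot
      sum += $16; count++
    }
    END {
      if (count > 0) {
        printf "%.1f\n", sum / count
      } else {
        print "n/a"
      }
    }
  '
}
```

Possible usage: `vmstat 1 10 | avg_iowait`.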
[root@mysql ~]# sar -P ALL 1 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql) 04/13/2018 _x86_64_ (1 CPU)
11:56:36 PM CPU  %user %nice %system %iowait %steal %idle
11:56:37 PM all  80.00  0.00   20.00    0.00   0.00  0.00
11:56:37 PM   0  80.00  0.00   20.00    0.00   0.00  0.00
11:56:37 PM CPU  %user %nice %system %iowait %steal %idle
11:56:38 PM all  82.00  0.00   18.00    0.00   0.00  0.00
11:56:38 PM   0  82.00  0.00   18.00    0.00   0.00  0.00
11:56:38 PM CPU  %user %nice %system %iowait %steal %idle
11:56:39 PM all  82.00  0.00   18.00    0.00   0.00  0.00
11:56:39 PM   0  82.00  0.00   18.00    0.00   0.00  0.00
Average:    CPU  %user %nice %system %iowait %steal %idle
Average:    all  81.33  0.00   18.67    0.00   0.00  0.00
Average:      0  81.33  0.00   18.67    0.00   0.00  0.00
Diagnose OS - sar (CPU)
[root@mysql ~]# sar -r 1 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql) 04/13/2018 _x86_64_ (1 CPU)
11:57:36 PM kbmemfree kbmemused %memused kbbuffers kbcached kbcommit %commit
11:57:37 PM     64232    952828    93.68    101408   483896   859436   84.50
11:57:38 PM     64232    952828    93.68    101408   483896   859436   84.50
11:57:39 PM     64232    952828    93.68    101408   483896   859436   84.50
Average:        64232    952828    93.68    101408   483896   859436   84.50
[root@mysql ~]# sar -S 1 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql) 04/13/2018 _x86_64_ (1 CPU)
11:57:46 PM kbswpfree kbswpused %swpused kbswpcad %swpcad
11:57:47 PM         0         0     0.00        0    0.00
11:57:48 PM         0         0     0.00        0    0.00
11:57:49 PM         0         0     0.00        0    0.00
Average:            0         0     0.00        0    0.00
Diagnose OS - sar (memory)
[root@mysql ~]# iostat -y -x 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql) 04/13/2018 _x86_64_ (1 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
         79.67  0.00   20.33    0.00   0.00  0.00
Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda      0.00   0.00 0.00 0.00   0.00   0.00     0.00     0.00  0.00  0.00  0.00
avg-cpu: %user %nice %system %iowait %steal %idle
         81.67  0.00   18.33    0.00   0.00  0.00
Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda      0.00   3.00 0.00 1.33   0.00  42.67    32.00     0.00  0.00  0.00  0.00
avg-cpu: %user %nice %system %iowait %steal %idle
         81.33  0.00   18.67    0.00   0.00  0.00
Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
xvda      0.00   2.00 0.00 3.33   0.00  53.33    16.00     0.00  0.00  0.00  0.00
● -y omit first report (stats since boot)
● -x extended stats
● 3 interval in seconds
Diagnose OS - iostat
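On a host with many devices it helps to print only the busy ones. The sketch below filters extended iostat output on the last column, %util. The name busy_devices is hypothetical and the default threshold of 80 is an arbitrary assumption rather than a recommendation from the slides.

```shell
# Hypothetical helper: print "device %util" for every device line whose
# %util (last column of `iostat -x`) is at or above the given threshold.
busy_devices() {
  awk -v limit="${1:-80}" '
    $1 == "avg-cpu:"                  { in_dev = 0; next }  # CPU section starts
    $1 == "Device:" || $1 == "Device" { in_dev = 1; next }  # device header
    NF == 0                           { in_dev = 0; next }  # blank line ends a section
    in_dev && ($NF + 0) >= (limit + 0) { print $1, $NF }
  '
}
```

Possible usage: `iostat -y -x 3 | busy_devices 90`.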
[root@mysql ~]# sar -d 1 3
Linux 4.9.81-35.56.amzn1.x86_64 (mysql) 04/13/2018 _x86_64_ (1 CPU)
11:59:53 PM DEV       tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
11:59:54 PM dev202-0 0.00     0.00     0.00     0.00     0.00  0.00  0.00  0.00
11:59:54 PM DEV       tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
11:59:55 PM dev202-0 0.00     0.00     0.00     0.00     0.00  0.00  0.00  0.00
11:59:55 PM DEV       tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
11:59:56 PM dev202-0 0.00     0.00     0.00     0.00     0.00  0.00  0.00  0.00
Average:    DEV       tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
Average:    dev202-0 0.00     0.00     0.00     0.00     0.00  0.00  0.00  0.00
Diagnose OS - sar (I/O)
[root@mysql ~]# lsof | grep mysqld
mysqld 17999 mysql cwd DIR 202,1 4096 393649 /var/lib/mysql
...
mysqld 17999 mysql mem REG 202,1 40640 2694 /lib64/libcrypt-2.17.so
...
mysqld 17999 mysql 2w REG 202,1 85019 394788 /var/log/mysqld.log
mysqld 17999 mysql 3uW REG 202,1 134217728 394960 /var/lib/mysql/ib_logfile0
mysqld 17999 mysql 8uW REG 202,1 134217728 394963 /var/lib/mysql/ib_logfile1
mysqld 17999 mysql 9uW REG 202,1 12582912 394959 /var/lib/mysql/ibdata1
mysqld 17999 mysql 10uW REG 202,1 12582912 395351 /var/lib/mysql/ibtmp1
mysqld 17999 mysql 11u REG 202,1 0 395350 /var/tmp/ibIZu8ES (deleted)
...
mysqld 17999 mysql 15u IPv6 643056 0t0 TCP *:mysql (LISTEN)
mysqld 17999 mysql 16u unix 0xffff88003ce0b000 0t0 643057 /var/lib/mysql/mysql.sock
...
mysqld 17999 mysql 23u REG 202,1 5120 395018 /var/lib/mysql/mysql/db.MYI
mysqld 17999 mysql 24u REG 202,1 1464 395019 /var/lib/mysql/mysql/db.MYD
...
mysqld 17999 mysql 45uW REG 202,1 10485760 395474 /var/lib/mysql/sbtest/sbtest1.ibd
mysqld 17999 mysql 48uW REG 202,1 10485760 395476 /var/lib/mysql/sbtest/sbtest2.ibd
mysqld 17999 mysql 49uW REG 202,1 10485760 395478 /var/lib/mysql/sbtest/sbtest3.ibd
mysqld 17999 mysql 50uW REG 202,1 10485760 395480 /var/lib/mysql/sbtest/sbtest4.ibd
Diagnose OS - lsof
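The lsof listing includes a file marked (deleted) that mysqld still holds open; such files keep consuming disk space until the process closes them. A minimal sketch for pulling those entries out of lsof output follows. The function name deleted_open_files is an assumption, and the field handling assumes paths without spaces.

```shell
# Print "<pid> <path>" for every open file that lsof reports as deleted.
# Reads lsof output on stdin; assumes the path itself contains no spaces.
deleted_open_files() {
    awk '$NF == "(deleted)" { print $2, $(NF - 1) }'
}
```

Possible usage: lsof -p "$(pidof mysqld)" | deleted_open_files.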
[root@mysql ~]# dmesg -T
[Sat Apr 7 09:46:48 2018] Linux version 4.9.81-35.56.amzn1.x86_64 (mockbuild@gobi-build-64010) (gcc version 7.2.1 20170915 (Red Hat 7.2.1-2) (GCC) ) #1 SMP Fri Feb 16 00:18:48 UTC 2018
[Sat Apr 7 09:46:48 2018] Command line: root=LABEL=/ console=tty1 console=ttyS0 selinux=0 nvme_core.io_timeout=4294967295
...
[Sat Apr 14 09:46:49 2018] Out of memory: Killed process 21000, UID 48, (httpd).
[Sat Apr 14 09:46:49 2018] mysqld invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Call Trace:
 [<ffffffff802c1b64>] out_of_memory+0x8b/0x203
 [<ffffffff8020fa5d>] __alloc_pages+0x27f/0x308
The -T option prints timestamps in a human-readable format (not supported on all operating systems).
Diagnose OS - dmesg
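On a long-running host the kernel log is noisy, so it helps to filter it down to the OOM killer messages. The sketch below matches the two message shapes shown in the dmesg output; the function name oom_events is an assumption, and other kernel versions may word these messages differently.

```shell
# Keep only kernel log lines that mention the OOM killer.
# Reads dmesg output on stdin; matching is case-insensitive.
oom_events() {
    grep -Ei 'out of memory|invoked oom-killer'
}
```

Possible usage: dmesg -T | oom_events. An empty result (grep exit status 1) means no matching lines were found in the current kernel ring buffer.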
[root@mysql ~]# ifstat eth0
#kernel
Interface RX Pkts/Rate TX Pkts/Rate RX Data/Rate TX Data/Rate
          RX Errs/Drop TX Errs/Drop RX Over/Rate TX Coll/Rate
eth0 14283 0 592 0 21408K 0 41372 0
     0 0 0 0 0 0 0 0

[root@mysql ~]# watch -n 1 'ifstat eth0'
Every 1.0s: ifstat eth0 Fri Apr 13 23:46:46 2018

#kernel
Interface RX Pkts/Rate TX Pkts/Rate RX Data/Rate TX Data/Rate
          RX Errs/Drop TX Errs/Drop RX Over/Rate TX Coll/Rate
eth0 6645 0 243 0 9960K 0 17018 0
     0 0 0 0 0 0 0 0
Diagnose OS - ifstat
Trace execution of an executable
Options:
● -e open: trace only a specific system call (here open())
● -e trace=open,read: trace only the listed system calls
● -o file.txt: save the execution trace to a file
● -p pid: attach strace to a running process ID

Example:
[root@mysql ~]# strace -p 30231
Process 30231 attached
clock_gettime(CLOCK_REALTIME, {1523664842, 698096903}) = 0
gettimeofday({1523664842, 698196}, NULL) = 0
recvfrom(53, "\n\0\0\0", 4, MSG_DONTWAIT, NULL, NULL) = 4
gettimeofday({1523664842, 698368}, NULL) = 0
Diagnose OS - strace
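A raw trace grows quickly, so a per-syscall count is often a more useful first look. The sketch below tallies system call names from a trace saved with -o; the function name syscall_counts is an assumption, and it expects lines that start with the call name, as in the example above (traces taken with -f carry a PID prefix and would need that stripped first).

```shell
# Count how often each system call appears in an strace trace read from stdin.
# Output: "<count> <syscall>", most frequent first, ties sorted by name.
syscall_counts() {
    awk '/^[a-z_0-9]+\(/ {
        name = $0
        sub(/\(.*/, "", name)   # keep only the text before the first "("
        count[name]++
    }
    END { for (name in count) print count[name], name }' | sort -k1,1nr -k2,2
}
```

Possible usage: strace -o file.txt -p pid, stop it after a few seconds, then syscall_counts < file.txt.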
Non-Uniform Memory Access
● Hardware architecture with multiple physical CPUs
● Memory speed depends on location relative to CPU
NUMA - what?
● MySQL allocates a lot of memory at startup (e.g. buffer pool initialization)
● Free memory on node 0 is exhausted while other nodes still have free memory
● This often leads to “swapping insanity”
● Solution: allocate memory “interleaved” across nodes
NUMA - why relevant?
The “NUMA-bible” by Jeremy Cole
● https://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/
● https://blog.jcole.us/2012/04/16/a-brief-update-on-numa-and-mysql/
Since MySQL 5.7.9: innodb_numa_interleave setting
● https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_numa_interleave
NUMA - interesting (must) reads
[root@mysql ~]# numactl --hardware
available: 2 nodes (0-1)
node 0 size: 32276 MB
node 0 free: 26856 MB
node 1 size: 32320 MB
node 1 free: 26897 MB
node distances:
node 0 1
 0: 10 21
 1: 21 10

[root@mysql ~]# cat /proc/`pidof mysqld`/numa_maps | head -5
558b3dc82000 default file=/usr/libexec/mysql57/mysqld mapped=3976 active=3953 N0=3976 kernelpagesize_kB=4
558b3f767000 default file=/usr/libexec/mysql57/mysqld anon=235 dirty=235 N0=235 kernelpagesize_kB=4
558b3f852000 default file=/usr/libexec/mysql57/mysqld anon=87 dirty=87 N0=87 kernelpagesize_kB=4
558b3f8fc000 default anon=154 dirty=154 N0=154 kernelpagesize_kB=4
558b3ff47000 default heap anon=6304 dirty=6304 N0=6304 kernelpagesize_kB=4
Diagnose OS - numactl
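The N0=, N1=, ... fields in numa_maps give the number of pages each mapping holds on each node, so summing them shows how evenly mysqld's memory is spread. The following is a minimal sketch of that summary; the function name numa_pages_per_node is an assumption, and it reports page counts only (it does not convert to bytes, since kernelpagesize_kB can differ between mappings).

```shell
# Sum the per-node page counts (N0=..., N1=..., ...) over all mappings.
# Reads a numa_maps file on stdin and prints "<node> <pages>" per node.
numa_pages_per_node() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^N[0-9]+=[0-9]+$/) {
                split($i, kv, "=")      # kv[1] = node label, kv[2] = page count
                pages[kv[1]] += kv[2]
            }
    }
    END { for (node in pages) print node, pages[node] }' | sort
}
```

Possible usage: numa_pages_per_node < /proc/$(pidof mysqld)/numa_maps. A large imbalance between nodes is a hint that the memory was not allocated interleaved.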
● MySQL client (mysql)
● mysqladmin
● mysqlbinlog
● Log files
  ● Error log (log_error)
  ● Slow query log (slow_query_log_file)
  ● General log (general_log_file)
MySQL bottlenecks - MySQL tools
● percona-toolkit
  ● pt-query-digest
  ● pt-upgrade
  ● pt-config-diff
  ● pt-stalk
  ● pt-pmp
  ● ...
● tcpdump
● innotop
MySQL bottlenecks - external tools
mysql> SHOW FULL PROCESSLIST;
+----+--------+-----------+--------+---------+------+-------------------+---------------------------------------------------------------------+
| Id | User   | Host      | db     | Command | Time | State             | Info                                                                |
+----+--------+-----------+--------+---------+------+-------------------+---------------------------------------------------------------------+
| 53 | sbtest | localhost | sbtest | Execute |    0 | statistics        | SELECT DISTINCT c FROM sbtest15 WHERE id BETWEEN ? AND ? ORDER BY c |
| 54 | sbtest | localhost | sbtest | Sleep   |    0 |                   | NULL                                                                |
| 55 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | SELECT c FROM sbtest12 WHERE id=?                                   |
| 56 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | SELECT c FROM sbtest21 WHERE id=?                                   |
| 58 | root   | localhost | NULL   | Query   |    0 | starting          | SHOW FULL PROCESSLIST                                               |
+----+--------+-----------+--------+---------+------+-------------------+---------------------------------------------------------------------+
5 rows in set (0.00 sec)

mysql> pager grep -v Sleep
PAGER set to 'grep -v Sleep'

mysql> SHOW FULL PROCESSLIST;
+----+--------+-----------+--------+---------+------+-------------------+-------------------------------------------------+
| Id | User   | Host      | db     | Command | Time | State             | Info                                            |
+----+--------+-----------+--------+---------+------+-------------------+-------------------------------------------------+
| 54 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | COMMIT                                          |
| 55 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | SELECT c FROM sbtest17 WHERE id BETWEEN ? AND ? |
| 56 | sbtest | localhost | sbtest | Execute |    0 | Sending to client | SELECT c FROM sbtest14 WHERE id=?               |
| 58 | root   | localhost | NULL   | Query   |    0 | starting          | SHOW FULL PROCESSLIST                           |
+----+--------+-----------+--------+---------+------+-------------------+-------------------------------------------------+
5 rows in set (0.00 sec)
MySQL bottlenecks - processlist
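The pager trick works interactively; from a script, the same idea can be applied to the tab-separated output the mysql client produces in batch mode. The sketch below keeps only non-sleeping threads that have been in their current state for at least a given number of seconds. The function name long_running_queries and the default of 10 seconds are assumptions; the column positions follow the SHOW FULL PROCESSLIST layout shown above.

```shell
# Filter batch-mode SHOW FULL PROCESSLIST output (tab-separated, with header).
# Prints "<Id> <Time> <Info>" (tab-separated) for every thread that is not
# sleeping and whose Time is at least the threshold (seconds, default 10).
long_running_queries() {
    awk -F'\t' -v t="${1:-10}" '
        NR > 1 && $5 != "Sleep" && $6 + 0 >= t { print $1 "\t" $6 "\t" $8 }
    '
}
```

Possible usage: mysql --batch -e 'SHOW FULL PROCESSLIST' | long_running_queries 10.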
mysql> SHOW ENGINE INNODB STATUS\G
=====================================
2018-04-14 01:04:25 0x7f707b41f700 INNODB MONITOR OUTPUT
=====================================
BACKGROUND THREAD
-----------------
SEMAPHORES
------------
TRANSACTIONS
------------
FILE I/O
--------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
LOG
---
BUFFER POOL AND MEMORY
----------------------
ROW OPERATIONS
--------------
END OF INNODB MONITOR OUTPUT
============================
1 row in set (0.00 sec)
mysql>
MySQL bottlenecks - innodb status
Data dictionary - contains metadata on tables
Example: get table sizes

mysql> SELECT table_schema, table_name, engine,
         data_length / 1024 / 1024 as data_in_MB,
         index_length / 1024 / 1024 as index_in_MB
       FROM information_schema.TABLES
       WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys');
+--------------+------------+--------+------------+-------------+
| table_schema | table_name | engine | data_in_MB | index_in_MB |
+--------------+------------+--------+------------+-------------+
| sbtest       | sbtest1    | InnoDB | 0.01562500 |  0.00000000 |
| sbtest       | sbtest2    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest3    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest4    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest5    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest6    | InnoDB | 2.51562500 |  0.15625000 |
| sbtest       | sbtest7    | InnoDB | 0.01562500 |  0.00000000 |
| sbtest       | sbtest8    | InnoDB | 0.01562500 |  0.00000000 |
+--------------+------------+--------+------------+-------------+
8 rows in set (0.00 sec)
MySQL bottlenecks - information_schema
performance_schema provides insight into MySQL / InnoDB internals.
sys schema is a set of views defined to make searching in P_S a little more convenient.

P_S:
select if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`) AS `user`,
  sum(`stmt`.`total`) AS `statements`,
  `sys`.`format_time`(sum(`stmt`.`total_latency`)) AS `statement_latency`,
  `sys`.`format_time`(ifnull((sum(`stmt`.`total_latency`) / nullif(sum(`stmt`.`total`),0)),0)) AS `statement_avg_latency`,
  sum(`stmt`.`full_scans`) AS `table_scans`,
  sum(`io`.`ios`) AS `file_ios`,
  `sys`.`format_time`(sum(`io`.`io_latency`)) AS `file_io_latency`,
  sum(`performance_schema`.`accounts`.`CURRENT_CONNECTIONS`) AS `current_connections`,
  sum(`performance_schema`.`accounts`.`TOTAL_CONNECTIONS`) AS `total_connections`,
  count(distinct `performance_schema`.`accounts`.`HOST`) AS `unique_hosts`,
  `sys`.`format_bytes`(sum(`mem`.`current_allocated`)) AS `current_memory`,
  `sys`.`format_bytes`(sum(`mem`.`total_allocated`)) AS `total_memory_allocated`
from (((`performance_schema`.`accounts`
  left join `sys`.`x$user_summary_by_statement_latency` `stmt` on((if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`) = `stmt`.`user`)))
  left join `sys`.`x$user_summary_by_file_io` `io` on((if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`) = `io`.`user`)))
  left join `sys`.`x$memory_by_user_by_current_bytes` `mem` on((if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`) = `mem`.`user`)))
group by if(isnull(`performance_schema`.`accounts`.`USER`),'background',`performance_schema`.`accounts`.`USER`)
order by sum(`stmt`.`total_latency`) desc
Sys: select * from sys.user_summary;
MySQL bottlenecks - performance_schema / sys
Tool to view and search binary log files

$ mysqlbinlog --base64-output=DECODE-ROWS -vvv mysql-bin.000003
/*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/;
...
CREATE TABLE sbtest1( ... ) /*! ENGINE = innodb */
/*!*/;
# at 566
#180414 18:41:33 server id 100 end_log_pos 638 CRC32 0x8a350686 Query thread_id=8 exec_time=0 error_code=0
SET TIMESTAMP=1523731293/*!*/;
BEGIN
/*!*/;
# at 638
#180414 18:41:33 server id 100 end_log_pos 695 CRC32 0x0b24f37f Table_map: `test`.`sbtest1` mapped to number 108
...
### INSERT INTO `test`.`sbtest1`
### SET
### @1=2716 /* INT meta=0 nullable=0 is_null=0 */
### @2=5007 /* INT meta=0 nullable=0 is_null=0 */
### @3='55695626677-52169758534-77347375130-44672760375-20882749287-44162878068-93868043135-83242682565-21261977354-27900241166' /* STRING(120) meta=65144 nullable=0 is_null=0 */
### @4='36579967600-35242135535-40368674184-39875850855-96100412304' /* STRING(60) meta=65084 nullable=0 is_null=0 */
# at 516259
#180414 18:41:33 server id 100 end_log_pos 516290 CRC32 0xdb1b3667 Xid = 27
COMMIT/*!*/;
# at 516290
MySQL bottlenecks - mysqlbinlog
● Designed to capture “slow” queries
● Define “slow” with long_query_time and min_examined_row_limit (a query is logged when it runs longer than long_query_time and examines at least min_examined_row_limit rows)
● Also: log_slow_admin_statements and log_queries_not_using_indexes
● Slow query log options:
  ● slow_query_log: turn it ON or OFF
  ● slow_query_log_file: target file
● Extra verbosity options on Percona Server and MariaDB
● For profiling purposes we often set long_query_time to 0 to log all queries that were executed.
MySQL bottlenecks - Slow query log
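For a short profiling window, the settings above can be applied at runtime. The sketch below only prints the statements so they can be reviewed before being piped to the mysql client; the function name slow_log_profile_sql and the default log path are assumptions for illustration. Note that a global change to long_query_time applies to connections opened after the change, because existing sessions keep their own session value.

```shell
# Print the statements that switch the slow query log into "log everything" mode.
# $1: target log file (hypothetical default, adjust to your datadir).
slow_log_profile_sql() {
    log_file="${1:-/var/lib/mysql/slow.log}"
    printf "SET GLOBAL slow_query_log_file = '%s';\n" "$log_file"
    printf 'SET GLOBAL long_query_time = 0;\n'
    printf 'SET GLOBAL slow_query_log = ON;\n'
}
```

Possible usage: slow_log_profile_sql /var/lib/mysql/profile.log | mysql. Remember to restore the previous long_query_time afterwards, since logging every query can grow the file quickly on a busy server.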
● Generic network analysis tool
● Captures network traffic at the TCP level
● Can be used to capture all MySQL traffic:

tcpdump -s 65535 -x -nn -q -tttt -i any -c 1000000 port 3306 > mysql.tcp.txt

● Warning: the output of this command is not usable for SSL-encrypted connections; the traffic has to be decrypted first
● ssldump is an alternative to overcome the SSL issue
MySQL bottlenecks - tcpdump
● Part of the percona-toolkit
● Can be used to analyse slow query logs, tcpdump logs, …
● Examples
  ● Using slow_query_log
    pt-query-digest slow.log
  ● Using tcpdump log
    pt-query-digest --type tcpdump mysql.tcp.txt
  ● Convert tcpdump log into slow_query_log
    pt-query-digest --output tcpdump.slow.log --no-report --type tcpdump mysql.tcp.txt
MySQL bottlenecks - pt-query-digest
When   Load  Cxns  QPS     Slow  Se/In/Up/De%  QCacheHit  KCacheHit  BpsIn    BpsOut
Now    0.00     9  9.68k      0  70/ 4/10/ 4       0.00%    100.00%  385.24k  19.01M
Total  0.00   151  427.48     0  70/ 4/ 9/ 4       0.00%     46.67%  22.85k   839.30k

Cmd      ID  State              User    Host       DB      Time   Query
Execute  13  Sending data       sbtest  localhost  sbtest  00:00  SELECT c FROM sbtest5 WHERE id BETWEEN ? AND ?
Execute  14  Sending to client  sbtest  localhost  sbtest  00:00  SELECT c FROM sbtest7 WHERE id BETWEEN ? AND ? ORDER BY c
Execute  15  Sending to client  sbtest  localhost  sbtest  00:00  UPDATE sbtest6 SET k=k+1 WHERE id=?
Execute  17  updating           sbtest  localhost  sbtest  00:00  UPDATE sbtest5 SET k=k+1 WHERE id=?
Execute  18  starting           sbtest  localhost  sbtest  00:00  COMMIT
Execute  19  updating           sbtest  localhost  sbtest  00:00  UPDATE sbtest5 SET k=k+1 WHERE id=?
Execute  20  Sending to client  sbtest  localhost  sbtest  00:00  SELECT c FROM sbtest2 WHERE id=?
MySQL bottlenecks - innotop
● innodb_buffer_pool_size
● innodb_io_capacity
● innodb_lock_wait_timeout
● query_cache_size
● table_open_cache
● …

Make sure you also write these changes to the my.cnf file so the value is preserved on restart. (MySQL 8.0 addresses this with SET PERSIST <variable> = <value>;)
MySQL configuration - dynamic variables
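After changing a dynamic variable at runtime, it is easy to forget the my.cnf part. The sketch below checks whether an option file already contains an uncommented line for a given variable. The function name cnf_has_setting is an assumption; it treats dashes and underscores in the option name as interchangeable (as MySQL does), and it only looks at the one file it is given, so !include and !includedir directives are not followed.

```shell
# Succeed (exit 0) if the option file contains an uncommented line for the variable.
# $1: option file to inspect, $2: variable name (dashes or underscores).
cnf_has_setting() {
    cnf_file="$1"
    # Build a pattern in which every "-" or "_" matches either character.
    name_re=$(printf '%s' "$2" | sed 's/[-_]/[-_]/g')
    grep -Eq "^[[:space:]]*${name_re}[[:space:]]*(=|\$)" "$cnf_file"
}
```

Possible usage: cnf_has_setting /etc/my.cnf innodb_io_capacity || echo "innodb_io_capacity is not persisted".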
● innodb_buffer_pool_instances
● open_files_limit
● skip_name_resolve
● tmpdir
● ...
MySQL configuration - static variables
THANK YOU We’re hiring!
https://www.pythian.com/careers/