Transcript
  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    1/20

    Dealing with schemachanges on large data

    volumesDanil Zburivsky

    MySQL DBA, Team Lead

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    2/20

    2

    2010/2011 Pythian

    Why Companies Trust Pythian Recognized Leader:

    Global industry-leader in remote database administration services and consulting for Oracle,Oracle Applications, MySQL and SQL Server

    Work with over 150 multinational companies such as Forbes.com, Fox Interactive media, andMDS Inc. to help manage their complex IT deployments

    Expertise:

    One of the worlds largest concentrations of dedicated, full-time DBA expertise.

    Global Reach & Scalability:

    24/7/365 global remote support for DBA and consulting, systems administration, specialprojects or emergency response

    2

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    3/20

    3

    2010/2011 Pythian

    Agenda

    Why schema changes are painful on largedata sizes?

    Using standby for schema migrations

    Shadow tables approach

    3

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    4/20

    4

    2010/2011 Pythian

    Schema changes are slow

    Most of the schema changes on InnoDB tablesrequire table rebuild:

    Add index

    Add new column

    Drop columnRename column

    Table is locked during rebuild

    Becomes a problem when table is 20G in size

    There are some improvements in InnoDB plugin,but they dont solve the problem in general

    4

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    5/20

    5

    2010/2011 Pythian

    Using standby for schema changes

    SET SQL_LOG_BIN = 0;

    Apply schema changes on standbyFailover application to standby

    5

    master-master

    Primary Standby

    On Standby:

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    6/20

    6

    2010/2011 Pythian

    Using standby for schema changes

    Works fine for adding indexes

    Not as good for adding new columns

    Doesnt work for renaming or droppingcolumns: replication will break

    6

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    7/20

    7 2010/2011 Pythian

    Shadow table approach

    Create new empty table with similar structure

    Apply schema changes on new table

    Copy data from original table to new table

    Synchronize using triggers

    Swap tables

    7

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    8/20

    8 2010/2011 Pythian

    Use case

    System with about 500G of InnoDB data

    Major application release affecting about 30 tables

    Largest one is 20G in size

    Database changes included: new columns,renaming columns, deleting columns, new indexesand complex data transformations

    Estimated time for applying all changes directly

    was 7 hoursUsing shadow tables database changes wereapplied in about 1 hour

    8

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    9/20

    9 2010/2011 Pythian

    The process

    CREATE TABLE t_new LIKE t_original;

    ALTER TABLE t_new ADD COLUMN AB VARCHAR (100);

    9

    CREATE TABLE `t_original` (

    `id` int(11) NOT NULL,`A` varchar(50) DEFAULT NULL,`B` varchar(50) DEFAULT NULL,PRIMARY KEY (`id`)

    ) ENGINE=InnoDB DEFAULT CHARSET=latin1

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    10/20

    10 2010/2011 Pythian

    Triggers to keep data in sync

    LOCK TABLE t_original WRITE;

    CREATE TRIGGER t_original_ai AFTER INSERTON t_original

    FOR EACH ROWREPLACE INTO t_new (id, A, B, AB) VALUES(NEW.id, NEW.A, NEW.B, CONCAT (A,',',B));

    10

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    11/20

    11 2010/2011 Pythian

    Triggers to keep data in syncCREATE TRIGGER t_original_ad AFTER DELETE ONt_original

    FOR EACH ROWDELETE FROM t_new WHERE id = OLD.id;

    CREATE TRIGGER t_original_au AFTER UPDATE ONt_original

    FOR EACH ROWUPDATE t_new SET id = NEW.id,

    A = NEW.A,

    B = NEW.B,

    AB = CONCAT (A,',',B)

    WHERE id = OLD.id;

    UNLOCK TABLES;

    11

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    12/20

    12 2010/2011 Pythian

    Copy data

    12

    t_original t_new

    MIN(id)

    MAX(id)

    INSERT IGNORE INTO t_new(....)SELECT ... FROM t_originalWHERE id>=? LIMIT N

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    13/20

    13 2010/2011 Pythian

    Copy data. Sample code

    $lastId=$minid;

    $sql=prepare($sql);

    while ($rv > 1)

    {

    $dbh->do(START TRANSACTION);

    $sth1->execute($lastId);

    13

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    14/20

    14 2010/2011 Pythian

    Copy data. Sample code

    $sth = $dbh->prepare("SELECT id FROM t_originalWHERE id >='$lastId' LIMIT 5000");

    $sth->execute();

    $rv = $sth->rows;

    while ((my $nextId) = $sth->fetchrow_array())

    {$lastId = $nextId;

    $totalrows = $totalrows + 1;

    }

    dbh->do(COMMIT);

    print "Print rows inserted $totalrows. Next id=$lastId\n";

    }

    14

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    15/20

    15 2010/2011 Pythian

    Basic checks

    SELECT COUNT(*)

    FROM t_newUNION

    SELECT COUNT(*)

    FROM t_original;

    SELECT MAX(id), MIN(id)

    FROM t_new

    UNION

    SELECT MAX(id), MIN(id)

    FROM t_original;

    15

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    16/20

    16 2010/2011 Pythian

    During the release

    RENAME TABLE t_original TO t_old;

    RENAME TABLE t_new TO t_original;DROP TRIGGERS;

    16

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    17/20

    17 2010/2011 Pythian

    Limitations

    Requires a unique key

    No existing triggers

    Need enough disk space for shadowtables

    Foreign keys

    17

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    18/20

    18 2010/2011 Pythian

    Foreign key issue

    If there is an existing FK on t_original:FOREIGN KEY (`fkId`) REFERENCES `t_original` (`id`)

    It will be changed after RENAME to:FOREIGN KEY (`fkId`) REFERENCES `t_old` (`id`)

    Solution is a hack:

    SET FOREIGN_KEY_CHECKS=0;DROP TABLE t_old;RENAME TABLE t_original TO t_old;RENAME TABLE t_old TO t_original;

    18

  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    19/20

    19 2010/2011 Pythian

    Existing solutions

    http://code.openark.org/blog/mysql/online-alter-table-now-available-in-openark-kit

    http://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932

    19

    http://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932http://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932http://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932http://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932http://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932http://www.facebook.com/notes/mysql-at-facebook/online-schema-change-for-mysql/430801045932http://code.openark.org/blog/mysql/online-alter-table-now-available-in-openark-kithttp://code.openark.org/blog/mysql/online-alter-table-now-available-in-openark-kithttp://code.openark.org/blog/mysql/online-alter-table-now-available-in-openark-kithttp://code.openark.org/blog/mysql/online-alter-table-now-available-in-openark-kithttp://code.openark.org/blog/mysql/online-alter-table-now-available-in-openark-kithttp://code.openark.org/blog/mysql/online-alter-table-now-available-in-openark-kit
  • 8/7/2019 Dealing With Schema Changes on Large Data Volumes

    20/20

    20 2010/2011 Pythian

    Q & A

    Thank you!

    [email protected]

    http://www.pythian.com/news/author/zburivsky/Twitter: @zburivsky

    20

    mailto:[email protected]:[email protected]

Recommended