Tungsten Replicator Guide


Tungsten Replicator Guide - Document issue 2.0.4 Tungsten version 2.0.4

Tungsten Replicator Guide
Tungsten version 2.0.4
This document was published on 01 December 2011.
www.continuent.com [http://www.continuent.com]
This document is written for Tungsten Replicator version 2.0.4.
Copyright © Continuent

The trademarks, logos, and service marks in this Document are the property of Continuent or other third parties. You are not permitted to use these Marks without the prior written consent of Continuent or such appropriate third party. Continuent, Tungsten, uni/cluster, m/cluster, p/cluster, uc/connector, and the Continuent logo are trademarks or registered trademarks of Continuent in the United States, France, Finland and other countries.

All Materials on this Document are (and shall continue to be) owned exclusively by Continuent or other respective third party owners and are protected under applicable copyrights, patents, trademarks, trade dress and/or other proprietary rights. Under no circumstances will you acquire any ownership rights or other interest in any Materials by or through your access or use of the Materials. All right, title and interest not expressly granted is reserved to Continuent.

All rights reserved.


Table of Contents

1. Introducing Tungsten Replicator
  1.1. Replication Concepts
    1.1.1. What is Database Replication?
    1.1.2. How Does Replication Work?
    1.1.3. Built-in Versus External Replication
    1.1.4. Log Versus Trigger-Based Replication
  1.2. Tungsten Replicator Architecture
  1.3. MySQL Replication
  1.4. PostgreSQL Warm Standby Replication
  1.5. Definitions of PostgreSQL's Warm Standby, Hot Standby, WAL shipping and Streaming Replication
  1.6. How Tungsten Replicator Implements PostgreSQL WAL Shipping
2. Tungsten Replicator Installation and Configuration
  2.1. Replicator Configuration For All Database Types
    2.1.1. Installation Prerequisites
  2.2. Installation and Configuration for MySQL
    2.2.1. Installation Prerequisites
    2.2.2. Setting Up a User Account
    2.2.3. Preparing the Database
    2.2.4. Installing and Configuring Tungsten Replicator
    2.2.5. Setting Up a Simple Master/Slave Configuration
    2.2.6. MySQL Database Housekeeping
  2.3. Installation and Configuration for PostgreSQL Warm Standby
    2.3.1. Installation Prerequisites
    2.3.2. Setting Up a User Account
    2.3.3. Preparing the Database
    2.3.4. Installing and Configuring Tungsten Replicator
    2.3.5. Setting Up a Simple Master/Slave Configuration
3. Basic Principles of Operation
  3.1. The Tungsten Replicator Process
    3.1.1. Overview
    3.1.2. Replication States
  3.2. Replicator Configuration
    3.2.1. The replicator.properties File
    3.2.2. Dynamic Properties
    3.2.3. Open Replicator Plugin
    3.2.4. Pipelines and Stages
    3.2.5. Stores
    3.2.6. Extractors
    3.2.7. Appliers
    3.2.8. Filters
  3.3. Replication Catalogs
    3.3.1. Tungsten Database Tables
    3.3.2. Purging the Transactional History Log
  3.4. Backup and Restore
    3.4.1. Overview of Backups and Backup Storage
    3.4.2. Backup Configuration
    3.4.3. Running Backup and Restore Commands


    3.4.4. Storage Organization and Management
    3.4.5. Creating a Backup When Starting the Replicator
  3.5. Master Failover
  3.6. Provisioning New Slaves
    3.6.1. Overview
    3.6.2. Procedure for Automatically Provisioning New Slaves
    3.6.3. Procedure for Manually Provisioning New Slaves
  3.7. Consistency Checking
    3.7.1. Overview
    3.7.2. Invoking Consistency Checks
    3.7.3. Configuration
  3.8. Replicator Monitoring and Management APIs
    3.8.1. JMX/MBean Interface Architecture
    3.8.2. Basic JMX/MBeans
    3.8.3. JMX Clients for Tungsten Replicator
4. Advanced Principles of Operation
  4.1. Specialized Pipeline Extensions
    4.1.1. Dummy Replication
    4.1.2. Direct Replication
    4.1.3. MySQL to PostgreSQL/Greenplum Replication
  4.2. THL Disk Storage Configuration and Management
  4.3. Replication Configuration for MySQL
    4.3.1. Migration between MySQL and Tungsten Replication
    4.3.2. MySQL Character Sets and Binary Data
    4.3.3. Enabling Relay Log Extraction
  4.4. Common Applications of Replication
    4.4.1. Using Database Replicas to Scale Reads
    4.4.2. Implementing Automated Failover
    4.4.3. Fast Database Upgrade and Migration
    4.4.4. Heterogeneous Replication
5. Replication Services
  5.1. Principles of Operation
  5.2. Configuration
  5.3. Replication Service Metadata
  5.4. Replication Service Event Logs
  5.5. Management
  5.6. JMX APIs
  5.7. Diagnostic Messages
6. Event Metadata and Sharding
  6.1. Principles of Operation
  6.2. Configuration
  6.3. Management
7. Multi-Master Replication
  7.1. Principles of Operation
    7.1.1. Local Master/Slave Operation
    7.1.2. Bi-Directional Replication
    7.1.3. Bi-Directional Replication with Slaves
  7.2. Configuration
  7.3. Management
8. Parallel Apply
  8.1. Principles of Operation


  8.2. Configuration
    8.2.1. Replication Service Configuration
    8.2.2. Choosing the Correct Number of Channels for Parallel Apply
    8.2.3. Configuring Partitioning Rules
    8.2.4. Explicit Partitioning Assignment
    8.2.5. Default Partition Assignment
    8.2.6. Critical Shards
  8.3. Management
9. Tuning and Troubleshooting Tungsten Replicator
  9.1. Recognizing and Handling Errors
  9.2. Tuning Replicator Memory
  9.3. Tuning Replicator Performance
    9.3.1. Apply-Side Event Caching
    9.3.2. Block Commit
    9.3.3. Master Connection Reset Period
  9.4. Running out of Disk Space in DBMS Logs
  9.5. Data Inconsistencies Between Master and Slave Databases
    9.5.1. Handling Log Consistency Check Failures
    9.5.2. Skipping a Failed SQL Update on the Slave
  9.6. Database Failure
    9.6.1. Repairing Failed Slaves
    9.6.2. Repairing a Failed Master
  9.7. Re-initializing Tungsten Replicator State
  9.8. PostgreSQL Troubleshooting
10. Command Reference Guide
  10.1. Running Tungsten Replicator from the Command Line Interface
  10.2. Running Tungsten Replicator as an Operating System Service
  10.3. Controlling a Running Tungsten Replicator Process
  10.4. Replicator THL Utility
    10.4.1. THL Utility Global Options
    10.4.2. THL Utility Commands
11. Extending the Tungsten Replicator System
  11.1. The ReplicatorPlugin Interface
  11.2. Replicator Plug-In Life Cycle
  11.3. Plug-In Setter Conventions
  11.4. Logging from Plug-Ins
  11.5. Advice on Writing Plug-Ins
A. Tungsten Replicator Catalogs
  A.1. consistency
  A.2. heartbeat
  A.3. history
  A.4. trep_commit_seqno
B. Tungsten Open Replicator Specification
  B.1. Tungsten Open Replicator Plugin Interface
  B.2. Tungsten Replicator Plugin Implementation
    B.2.1. Overview
    B.2.2. Configuration Properties
  B.3. Script Replication Plugin Implementation
    B.3.1. Overview
    B.3.2. Configuration Properties
    B.3.3. Script Syntax Reference


    B.3.4. Status Variables
    B.3.5. Capabilities
C. Tungsten Replicator Plug-ins
  C.1. Transaction History Log (THL) Storage
    C.1.1. THL JDBC Storage Plug-In
  C.2. Extractors
    C.2.1. MySQL Extractor
  C.3. Filters
    C.3.1. Case Mapping Filter
    C.3.2. Database Transform Filter
    C.3.3. Logger Filter
    C.3.4. MySQL Session Support Filter
    C.3.5. Time-Delay Filter
  C.4. Appliers
    C.4.1. MySQL Applier


Chapter 1. Introducing Tungsten Replicator

Tungsten Replicator provides master/slave replication. It has a pluggable architecture and supports multiple database management systems. The following chapters provide more detailed information on Tungsten Replicator.

1.1. Replication Concepts

The basic concepts of the Tungsten Replicator system are explained in this section.

1.1.1. What is Database Replication?

Database replication is a highly flexible technology for copying updates automatically between databases. The idea is that if you make a change to one database, the other database copies the update automatically. Replication occurs at the database level and does not require any special actions from client applications.

Propagating updates automatically is a simple idea, but it helps solve a surprisingly large number of problems. The figure below summarizes example solutions. The list that follows the figure explains each example in more detail, proceeding clockwise around the figure.

Figure 1.1. Replication Benefits

1. Availability. Keeping multiple copies of data is one of the most effective ways to avoid database availability problems. If one database fails, you can switch to another local copy or even to a copy located on another site.

2. Cross-site database operation. Applications like credit card processing use multiple open databases on different sites, so that there is always a database available for transactions. Replication can help transfer copies between distributed databases or send updates to a centrally located copy.

3. Scaling. Replicated copies are live databases, so you can use them to distribute read traffic. For example, you can run backups or reports on a replica without affecting other copies.

4. Upgrades. Replication allows users to upgrade a replica, which can then be switched over to the master copy. This is a classic technique to minimize downtime as well as provide a convenient back-out in the event of problems.


5. Heterogeneous database integration. It is quite common for data to be entered in one database type, such as PostgreSQL, and used in another, such as MySQL. Replication can copy data between databases and perform transformations necessary to ensure proper conversion.

6. Data warehouse loading. Replication applies updates in real-time, which is very useful as databases become too large to move using batch processes. Data warehouse loading is much easier with capabilities such as transforming data or copying updates to a central location.

7. Geographic distribution. Replication allows users to place two or more clusters in geographically separated locations to protect against site failure or site unreachability.

It is not surprising that database replication is considered essential technology to build and operate a wide variety of business-critical applications. Tungsten Replicator is designed to solve the problems described above as well as many others.
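The availability and scaling items in the list above can be illustrated with a minimal sketch of application-side routing. This is not part of Tungsten Replicator and does not use any Tungsten API. The class `ReplicaRouter`, its methods, and the host names `db1`, `db2`, and `db3` are hypothetical names chosen for illustration. The sketch assumes writes go to a single master, reads rotate over the replicas that are currently healthy, and reads fall back to the master when no replica is available.

```python
class ReplicaRouter:
    """Hypothetical router: writes go to the master, reads rotate over replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)
        self.down = set()      # hosts currently considered failed
        self._turn = 0         # round-robin counter for reads

    def mark_down(self, host):
        """Record that a host has failed so it is skipped from now on."""
        self.down.add(host)

    def host_for_write(self):
        if self.master in self.down:
            raise RuntimeError("master is down; a failover is required")
        return self.master

    def host_for_read(self):
        healthy = [h for h in self.replicas if h not in self.down]
        if not healthy:
            # No replica left: fall back to the master for reads.
            return self.host_for_write()
        host = healthy[self._turn % len(healthy)]
        self._turn += 1
        return host


router = ReplicaRouter("db1", ["db2", "db3"])
reads = [router.host_for_read() for _ in range(4)]   # alternates db2, db3

router.mark_down("db2")                # simulate a replica failure
after_failure = router.host_for_read() # only db3 remains for reads
print(reads, after_failure)
```

Because every replica holds a live copy of the data, losing one replica only shrinks the pool of read hosts. The application keeps working without any change to the data itself.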

1.1.2. How Does Replication Work?

Tungsten Replicator uses master/slave replication. In master/slave replication, updates are handled by one database server, known as the master, and propagated automatically to replicas, which are known as slaves. This is a very efficient way to make database copies and keep them up to date as they change.

Master/slave replication is based on a simple idea. Let us assume that two databases start with the same initial data, which we call a snapshot. We make changes on one database, recording them in order so that they can be replayed with exactly the same effect as the original changes. We call this a serialized order. If we replay the serialized order on the second database, we have master/slave replication.

Figure 1.2. Master/Slave Replication
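The snapshot-and-replay idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Tungsten Replicator stores or applies events: each "database" is a plain dictionary, and the function `apply_change` and the tuple format `(operation, key, value)` are hypothetical names invented for this illustration.

```python
import copy


def apply_change(db, change):
    """Apply one change, given as (operation, key, value), to a dict-based database."""
    op, key, value = change
    if op == "set":
        db[key] = value
    elif op == "delete":
        db.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")


# Both databases start from the same snapshot.
snapshot = {"alice": 100, "bob": 50}
master = copy.deepcopy(snapshot)
slave = copy.deepcopy(snapshot)

# Changes made on the master are recorded in the order they were applied.
# This list plays the role of the serialized order.
log = []
for change in [("set", "alice", 80), ("set", "carol", 20), ("delete", "bob", None)]:
    apply_change(master, change)
    log.append(change)

# Replaying the log in the same order on the slave reproduces the master's state.
for change in log:
    apply_change(slave, change)

print(master == slave)
```

The order of replay matters. If the slave applied the same changes in a different order, for example two updates to the same key swapped, it could end in a different state than the master.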

Master/slave replication is popular for a number of reasons. First, databases can generate and reload snapshots relatively efficiently using backup tools. Second, databases not only serialize data very quickly but also write it into a file-based log that can be read by external processes. Master/slave replication is therefore reasonably tractable to implement, even though the effort to do it well is not small.

Master/slave replication has a number of benefits for users. It runs very quickly, places few limitations on user SQL, and works well over high latency network connections typical of wide area networks (WAN). Also, as Tungsten Replicator demonstrates, this type of replication does not require any changes to the database server itself, which means that it works with off-the-shelf databases.

Master/slave replication also has some disadvantages. First, the master is a single point of failure. Special procedures are necessary to handle this and keep systems available. Second, slaves tend to lag the master. This is due to the fact that masters can typically process updates faster than they can be replicated and applied to slaves.

Tungsten Replicator is designed to minimize drawbacks of the approach, for example by handling master failover correctly or providing mechanisms to help boost speed of updates on replicas. It also has features like data filtering and transformation, which make it better able to handle problems like heterogeneous data integration for which master/slave replication is uniquely suited.
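To make the idea of data filtering and transformation concrete, here is a minimal sketch of a filter chain applied to a stream of replicated events. It is an illustration only and does not reproduce the Tungsten Replicator filter interface. The `Event` class, the functions `drop_schema`, `rename_schema`, and `run_filters`, and the schema names are all hypothetical. The assumption is that a filter either returns a (possibly modified) event or returns `None` to discard it.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical replicated event: a sequence number, a schema, and a statement."""
    seqno: int
    schema: str
    sql: str


# A filter returns a (possibly changed) event, or None to drop the event.
Filter = Callable[[Event], Optional[Event]]


def drop_schema(name: str) -> Filter:
    """Build a filter that discards every event belonging to the given schema."""
    def f(event: Event) -> Optional[Event]:
        return None if event.schema == name else event
    return f


def rename_schema(old: str, new: str) -> Filter:
    """Build a filter that rewrites the schema name on matching events."""
    def f(event: Event) -> Optional[Event]:
        return replace(event, schema=new) if event.schema == old else event
    return f


def run_filters(events: Iterable[Event], filters: List[Filter]) -> Iterator[Event]:
    """Pass each event through the filters in order; skip events that get dropped."""
    for event in events:
        current: Optional[Event] = event
        for f in filters:
            current = f(current)
            if current is None:
                break
        if current is not None:
            yield current


events = [
    Event(1, "sales", "INSERT INTO orders VALUES (1)"),
    Event(2, "scratch", "INSERT INTO tmp VALUES (1)"),
    Event(3, "sales", "UPDATE orders SET id = 2"),
]
out = list(run_filters(events, [drop_schema("scratch"), rename_schema("sales", "sales_archive")]))
print([(e.seqno, e.schema) for e in out])
```

Filters run between reading a change on the master and applying it on the slave, so the slave can hold a subset or a renamed version of the master's data while the master itself is unchanged.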

1.1.3. Built-in Versus External Replication

Replication technology is so important that most established databases offer it as a built-in feature. Built-in replication has the advantage that it tends to work well for replication between databases of the same type. However, there are common restrictions that prevent using it between database versions or across different operating systems. In addition, for commercial databases, replication tends to be a complex and expensive add-on.

Tungsten Replicator by contrast operates outside the database and can be viewed as external replication. This approach has a number of advantages. It handles problems like availability and scaling, imposes fewer limitations such as version or platform restrictions, and is ideally suited for replicating between databases of different types.

1.1.4. Log Versus Trigger-Based Replication

Log-based replication reads SQL updates from the database recovery log, which contains the serialized list of changes that the database uses for its own recovery in the event of a restart. Log-based replication has the lowest performance overhead of any replication method, handles a wider set of changes, and has the least management impact. It is, however, harder to implement because it requires reading log formats which are complex and rarely documented completely.

Trigger-based replication, on the other hand, installs triggers to capture table updates. This approach is often easier to implement but in most other respects is almost always a "second choice" for users. Triggers tend to slow the master, add to management complexity, and have limited ability to handle changes to the database schema itself, also known as DDL.

Tungsten Replicator uses log-based replication due to the greatly improved performance and flexibility. Not all log readers are available in open source; some are commercial extensions.

1.2. Tungsten Replicator Architecture

Tungsten Replicator is a process that runs on every host in the cluster and implements replication as described in the previous sections. The figure below depicts the replication architecture:

Figure 1.3. Replication Architecture

The components in the figure are:

• Master DBMS - The Database Management System (DBMS), which acts as the master for the replication system. The master role can change, and any DBMS can potentially be elected as the master for the replication.

• Slave DBMS - The slave DBMS receives replication events from the master DBMS and applies them. There can be any number of slaves in the replication system. Slaves are also commonly known as replicas.

• Replication Event Extractor - The replication event extractor extracts replication events from the master DBMS logs. Events are either SQL statements or rows from the replicated database.

• Transaction History Log - The transaction history log provides persistent storage for replication events and communications with other transaction history logs in the cluster.

• Replication Event Applier - The replication event applier applies the replication events into the slave DBMS.

• Node Manager - Node manager refers to the manager for Tungsten components running either on the slave or master node. Node manager connects to the Tungsten service manager at the upper level.

Tungsten Replicator architecture is very flexible and allows addition of new extractors and appliers. Addition of new databases is quite straightforward. It also allows users to implement creative new uses for replication, such as reading data from a database and applying it to an application rather than a database, or replicating from an application to a database. Extending Tungsten Replicator is discussed in Chapter 11, Extending the Tungsten Replicator System.
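The division of labor between extractor, transaction history log, and applier can be illustrated with a small Python model. This is only a sketch under stated assumptions: the class and function names (ReplicationEvent, TransactionHistoryLog, extract, apply) are hypothetical and are not part of the Tungsten Replicator code base, the log is held in memory rather than persisted, and events are simplified to strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReplicationEvent:
    seqno: int       # sequence number assigned at extraction (hypothetical field)
    source_id: str   # replicator that extracted the event
    payload: str     # SQL statement or row data, simplified to a string


@dataclass
class TransactionHistoryLog:
    """In-memory stand-in for the persistent transaction history log."""
    events: List[ReplicationEvent] = field(default_factory=list)

    def append(self, event: ReplicationEvent) -> None:
        self.events.append(event)

    def read_from(self, seqno: int) -> List[ReplicationEvent]:
        return [e for e in self.events if e.seqno >= seqno]


def extract(changes: List[str], source_id: str, thl: TransactionHistoryLog) -> None:
    """Extractor role: turn master log entries into numbered events."""
    next_seqno = len(thl.events)
    for change in changes:
        thl.append(ReplicationEvent(next_seqno, source_id, change))
        next_seqno += 1


def apply(thl: TransactionHistoryLog, start_seqno: int,
          execute: Callable[[str], None]) -> int:
    """Applier role: apply events in order, return the next seqno to apply."""
    position = start_seqno
    for event in thl.read_from(start_seqno):
        execute(event.payload)
        position = event.seqno + 1
    return position


master_thl = TransactionHistoryLog()
extract(["INSERT INTO t VALUES (1)", "UPDATE t SET a = 2"], "db1", master_thl)

applied: List[str] = []                      # stands in for the slave DBMS
next_position = apply(master_thl, 0, applied.append)
```

Because the applier only depends on an `execute` callable, the same loop could in principle feed an application instead of a database, which is the kind of extension the paragraph above describes.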

1.3. MySQL Replication

Tungsten Replicator can extract events and hence replicate from any MySQL 5.0 or later database that has binary logs (binlogs) enabled. Binary logs capture data for MySQL's native replication. Tungsten Replicator tails binlog files, parsing and extracting new events as the MySQL server writes them. These events are then stored in the Transaction History Log and propagated to other Tungsten Replicator instances.

The following diagram shows the Tungsten Replicator architecture for MySQL replication.

Figure 1.4. MySQL Replication Architecture

Tungsten Replicator has a number of advantages over native MySQL replication, including the following:

• Proper handling of master failover in the presence of multiple slaves. Tungsten Replicator puts global sequence numbers on all SQL requests, which allows a slave to be promoted to master and then distribute events easily to the remaining slaves.

• Replication from newer to older versions of MySQL, for example from MySQL version 5.0 to version 4.1.

• Handling flexible replication designs including one master to many slaves (fan-out), many masters to a single slave (fan-in) and circular replication between two or more masters.

• Ability to replicate to and from other database types, with transformation and filtering of replicated data.
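To see why global sequence numbers simplify failover, consider the following Python sketch. It is illustrative only: the guide does not describe the promotion policy, so choosing the most advanced slave as the new master is an assumption, and the function names and host names are hypothetical.

```python
from typing import Dict


def choose_new_master(last_applied: Dict[str, int]) -> str:
    """Pick the slave with the highest applied seqno (assumed policy).

    Hosts are sorted first so that ties are broken deterministically.
    """
    return max(sorted(last_applied), key=lambda host: last_applied[host])


def catch_up_plan(last_applied: Dict[str, int], new_master: str) -> Dict[str, range]:
    """For each remaining slave, list the seqnos it still needs from the new master."""
    head = last_applied[new_master]
    return {
        host: range(seqno + 1, head + 1)
        for host, seqno in last_applied.items()
        if host != new_master
    }


# Last applied global sequence number per slave at the time the master fails.
positions = {"db2": 9402, "db3": 9398, "db4": 9402}
new_master = choose_new_master(positions)
plan = catch_up_plan(positions, new_master)
```

Since every event carries the same number on every host, each remaining slave can simply ask the promoted host for the events after its own position, without comparing binlog file names or offsets.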

1.4. PostgreSQL Warm Standby Replication

Tungsten Replicator can manage replication between PostgreSQL databases based on warm standby, which copies Write Ahead Log (WAL) segments as they are written to one or more slaves. This feature is based on the Tungsten Open Replicator, which is a variation of Tungsten Replicator that manages replication mechanisms that are not based on Tungsten native replication.

Tungsten Replicator fully encapsulates all setup and management of the warm standby replication. Users can operate warm standby replication using the same administrative commands used for native Tungsten replication.

Under warm standby the master database operates normally and is open for SQL read and write operations. The standby database in contrast is in a state of permanent recovery that lasts until it switches to the master role. It may not be opened for reads.

PostgreSQL warm standby operation is an excellent choice for users who need a simple and easy-to-manage database availability solution. Tungsten Replicator ensures that the warm standby operates smoothly and without data loss in the event of a failover.

Note

The Open Replicator mechanism is described in Appendix B, Tungsten Open Replicator Specification.

1.5. Definitions of PostgreSQL's Warm Standby, Hot Standby, WAL Shipping and Streaming Replication

There are two different ways to copy (replicate) data from a PostgreSQL master to a slave:

1. WAL shipping - when PostgreSQL writes a new WAL file (Write Ahead Log, usually 16MB in size), the Tungsten master sends this file (with rsync) to each of the slaves. When a new WAL file is received on a slave, the pg_standby process applies it to the database. This way the slave periodically catches up with the master.

The drawback of this method is that the slave receives updates periodically, rather than continuously, and there is a gap while the master is generating a new WAL file that is not yet completely filled up (each WAL file contains many transactions).

2. Streaming Replication (introduced in PostgreSQL 9) - a TCP channel maintained by PostgreSQL, through which WAL (XLOG) records are continuously shipped from the master to the slaves, where they are instantly applied.

The data difference gap between slave and master is minimal in contrast to the WAL shipping approach.

Depending on the replication method chosen and whether the slaves are accessible for read requests, two standby modes are defined:

1. Warm Standby - the usual name for the combination of WAL shipping and a slave that is not accessible for reads (used only in case of a failover). This is a typical high availability (HA) setup used with PostgreSQL prior to version 9.

2. Hot Standby - the ability to connect to the slave and run queries while the PostgreSQL server is in continuous archive recovery. This allows slave servers to be used:

• with WAL shipping - for reporting and similar purposes,

• with Streaming Replication - for read scaling.

1.6. How Tungsten Replicator Implements PostgreSQL WAL Shipping

When ./configure finishes successfully on a cluster node, the PostgreSQL server instance is prepared to work in the role of master or slave. The following steps are taken depending on the role.

Install operation (implemented in the pg-wal-plugin):

1. postgresql.conf is modified to include the Tungsten-specific postgresql.tungsten.conf property file.

2. Master:

a. postgresql.tungsten.conf defines how PostgreSQL should archive newly generated WAL files (by using Tungsten's pg-wal-archive command).

b. The PostgreSQL server is restarted and the internal-use database tungsten is created.

3. Slave:

• The recovery.conf file is prepared, which defines how a slave recovers newly received WAL files (by using Tungsten's pg-wal-restore command).

If the option Auto-enable replicator at start-up was selected, the online operation is called automatically. If not, the replicator will need to be put online manually.

Online operation (implemented in the pg-wal-plugin):

1. Master:

a. A trigger file, which stops the archive recovery mode (if any), is created. This is required when changing roles from slave to master.

b. Master PostgreSQL is started, if it is not already running.

2. Slave:

a. The provisioning operation is called:

i. The PostgreSQL server is stopped.

ii. The PostgreSQL archive folder is emptied.

iii. Master's WAL file log is switched to a new one (if the current one is not empty).

iv. Master is informed that a backup is starting.

v. The contents of the master's PostgreSQL data folder are copied (with rsync) to the slave. The following files are excluded: postgresql.conf, pg_xlog/ and pg_log/.

vi. Master is informed that backup has ended.

vii. The prepared recovery.conf file is provided for slave's PostgreSQL server.

viii. The trigger file, which stops archive recovery mode, is deleted if it exists.

b. PostgreSQL server is started.
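The copy step of the provisioning operation can be sketched as follows. The guide only states that rsync is used and which files are excluded; the exact command line issued by the pg-wal-plugin is not shown, so the flags chosen here (-a and --exclude), the function name, and the example host and path are assumptions made for illustration. The sketch builds the command without running it.

```python
import shlex
from typing import List

# Files and folders the guide says are excluded from the base copy.
EXCLUDED = ["postgresql.conf", "pg_xlog/", "pg_log/"]


def rsync_command(master_host: str, data_dir: str) -> List[str]:
    """Build (but do not run) an rsync command that pulls the master's data folder."""
    path = data_dir.rstrip("/") + "/"        # trailing slash: copy folder contents
    command = ["rsync", "-a"]
    command += [f"--exclude={pattern}" for pattern in EXCLUDED]
    command += [f"{master_host}:{path}", path]
    return command


# Hypothetical host name and data folder, for illustration only.
print(shlex.join(rsync_command("db1", "/var/lib/pgsql/data")))
```

Excluding postgresql.conf keeps the slave's own configuration in place, and excluding pg_xlog/ avoids copying WAL segments that the slave will receive through the archive instead.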

After the master and at least one slave are in the online state, actual replication takes place:

1. Every time the master PostgreSQL server completely fills up a new WAL file, the pg-wal-archive command is called, as defined in the postgresql.tungsten.conf file.

2. The pg-wal-archive command copies this new WAL file to the outbox folders of the active slaves. Each slave has a separate outbox folder on the master under PostgreSQL archive location/outbox/slave transfer name/.

3. The pg-wal-archive command checks whether each active slave has a corresponding pg-wal-archive-send process running. If not, the command starts one.

4. The pg-wal-archive-send process runs in parallel. Its task is to actually send (with rsync) the not yet sent WAL files to the slave it serves. It keeps scanning the slave's outbox folder and sends the WAL files that are there to the PostgreSQL archive location on the slave. When a WAL file is successfully sent, it is deleted from the outbox. If the slave unexpectedly disappears, the files accumulate in the outbox folder until the slave comes back or is removed from the cluster.

At any given time, there should be exactly as many pg-wal-archive-send processes in the operating system as there are active slaves.

5. In the meantime, PostgreSQL calls the pg-wal-restore command on a slave as defined in the recovery.conf file. The pg-wal-restore command is a wrapper for the pg_standby command.

6. The pg_standby command monitors the PostgreSQL archive location folder for the next WAL file to appear.

7. When a new WAL file is received by the slave, the pg_standby command applies it to the PostgreSQL database, and increases the progress number for this Tungsten Replicator.
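The outbox mechanism in steps 2 to 4 can be modeled in a few lines of Python. This is a simplified sketch, not the implementation of pg-wal-archive or pg-wal-archive-send: the function names are hypothetical, a local file copy stands in for rsync, and temporary folders stand in for the archive locations on the master and slave.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def archive_wal(wal_file: Path, archive_dir: Path, slaves: List[str]) -> None:
    """Copy a completed WAL file into the outbox of every active slave."""
    for slave in slaves:
        outbox = archive_dir / "outbox" / slave
        outbox.mkdir(parents=True, exist_ok=True)
        shutil.copy2(wal_file, outbox / wal_file.name)


def send_outbox(archive_dir: Path, slave: str, slave_archive: Path) -> List[str]:
    """Deliver pending WAL files in name order; delete each one once delivered."""
    outbox = archive_dir / "outbox" / slave
    sent = []
    for wal in sorted(outbox.iterdir()):
        shutil.copy2(wal, slave_archive / wal.name)   # the guide uses rsync here
        wal.unlink()                                  # sent files leave the outbox
        sent.append(wal.name)
    return sent


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    master_archive = root / "master_archive"
    slave_archive = root / "slave_archive"
    master_archive.mkdir()
    slave_archive.mkdir()
    wal = root / "000000010000000000000001"
    wal.write_bytes(b"wal segment")

    archive_wal(wal, master_archive, ["db2", "db3"])
    delivered = send_outbox(master_archive, "db2", slave_archive)
    # db3 has no sender running in this sketch, so its file stays queued.
    pending_db3 = sorted(p.name for p in (master_archive / "outbox" / "db3").iterdir())
    remaining_db2 = list((master_archive / "outbox" / "db2").iterdir())
```

Keeping one outbox per slave means a slow or unreachable slave only delays its own queue; the other slaves continue to receive WAL files.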

Chapter 2. Tungsten Replicator Installation and Configuration

2.1. Replicator Configuration For All Database Types

This chapter includes installation and configuration steps that apply to all database types.

Note

The configure script distributed as part of the full Tungsten Replicator release handles configuration automatically. We strongly recommend users run this rather than performing manual configuration file updates. Please consult Tungsten Installation and Configuration Guide for Tungsten 2.0 for more information.

2.1.1. Installation Prerequisites

Tungsten Replicator is written in Java and requires Sun JDK 6 or above. Before installing the replicator, you should obtain and install the full JDK from http://java.sun.com.

When the JDK is correctly installed you should be able to run java -version from the command line and see output like the following:

$ java -version
java version "1.6.0_18"
OpenJDK Runtime Environment (IcedTea6 1.8) (6b18-1.8-4ubuntu3~9.10.2)
OpenJDK 64-Bit Server VM (build 16.0-b13, mixed mode)
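If you want to script this check, the version line can be parsed as in the following Python sketch. The function name is hypothetical and the check is not part of the Tungsten distribution. The sketch relies on the fact that releases before Java 9 report themselves as "1.N.x" while later releases report "N.x", and it works on the text of the version output rather than running java itself.

```python
import re


def java_major_version(version_output: str) -> int:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    first, second = int(match.group(1)), match.group(2)
    # Releases before Java 9 report "1.N.x"; later releases report "N.x".
    if first == 1 and second is not None:
        return int(second)
    return first


sample = 'java version "1.6.0_18"'
meets_requirement = java_major_version(sample) >= 6
```

Note that java -version writes to standard error, so a script that captures the output must read that stream.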

Warning

Red Hat Linux and distributions derived from it may include GNU Java, which has very different command line flags and behaves differently from Sun JDK versions. This version of Java can cause serious confusion if it somehow gets in the execution path. We recommend you remove it unless there is a compelling reason for it to be present on your hosts.

2.2. Installation and Configuration for MySQL

This chapter explains how to install and configure Tungsten Replicator for MySQL. It is assumed that each Tungsten Replicator instance runs on a separate database node.

Note

The configure script distributed as part of the full Tungsten installation handles configuration file setup automatically. We strongly recommend users run this rather than performing manual configuration file updates. Please consult Tungsten Installation and Configuration Guide for Tungsten 2.0 for more information.

2.2.1. Installation Prerequisites

Tungsten Replicator is certified for MySQL 5.0 and 5.1. MySQL must be installed on replicator hosts prior to bringing up the replicator.

Please refer to Section 2.1, “Replicator Configuration For All Database Types” for additional prerequisites required for Tungsten Replicator.

2.2.2. Setting Up a User Account

We recommend setting up a non-privileged host account to run Tungsten Replicator. This account is called continuent in the remainder of the documentation; however you may choose any name.

The continuent account must be able to read the MySQL binlog directory. On Linux and Solaris systems, you may accomplish this by adding the continuent account to the mysql group, which permits read access to MySQL database files.

In addition, the Tungsten Replicator release directory must be owned by the continuent account.

2.2.3. Preparing the Database

Prepare the database nodes as follows:

1. Install MySQL on the database nodes.

2. Edit your my.cnf configuration file to enable binlogging. A minimal my.cnf configuration file is shown below:

[mysqld]
# Master replication settings.
log-bin=mysql-bin
server-id=1
max_allowed_packet=48m

Warning

Tungsten has a number of required and recommended MySQL parameter settings. For a full description please consult Section 2.2.6.1, “MySQL Server Parameter Settings”.

3. Start (or restart) MySQL so that the new settings take effect.

4. Create a Tungsten Replicator database user and a corresponding database. By default this user is named tungsten. The following example shows a typical setup.

mysql> grant all on *.* to tungsten@'localhost' identified by 'secret' with grant option;
mysql> create database tungsten;

Note

The tungsten login replicates all SQL commands and hence potentially requires all privileges. For example, if you wish to replicate CREATE USER or GRANT commands, the account must have every privilege granted to these logins as well as the ability to grant them. Users can selectively restrict tungsten privileges but should bear in mind that in some cases this may cause replicated commands to fail.

2.2.4. Installing and Configuring Tungsten Replicator

To install Tungsten Replicator, proceed as follows:

1. Log in with the continuent account used to run the Tungsten Replicator. This will ensure that all files are owned by the correct login. You must also use this account to start the replicator.

Copy the distribution archive to the database nodes and unpack at the location of your choice. In the following code examples, we will use this location as the default directory.

On Linux, Solaris, MacOS X, and other Unix variants we recommend installing in directory /opt/continuent. On Windows, use for example the C:\Program Files directory.

Note

If you use Windows and cannot unpack the .zip distribution archive, try installing another file compression program, such as 7-zip. You can also use the jar program distributed with the Java JDK.

2. Configure Tungsten Replicator instances.

a. In the unpacked distribution, cd to the conf directory and copy file replicator.properties.mysql to replicator.properties. Here are sample commands for Linux and Solaris.

cd conf
cp replicator.properties.mysql replicator.properties
chmod 700 replicator.properties

Warning

The replicator.properties file contains passwords. To ensure security it should be owned by the continuent account and have restricted permissions.

b. Edit replicator.properties and set properties required by the replicator. The following list summarizes the main properties that need to change.

Tip

In general you should configure slave and master properties identically except where noted below. This makes your configurations easier to understand and also helps with operations like failover where replicators change roles.

• replicator.role must be set to master or slave depending on whether the replicator is extracting from a master database or applying on a slave.

• replicator.auto_enable may optionally be set to true. If so set, the replicator will automatically go into the online state and begin replicating data on startup.

• replicator.source_id must be a unique name for each Replicator. The host name is a good value if you have one Tungsten Replicator per host.

• The global properties replicator.global.db.user and replicator.global.db.password must be set to the login and password of the local MySQL database.

• For masters replicator.master.listen.uri must contain a properly formed URI with the routable name of the master host (not localhost) and a port on which the master listens for client connections.

• For slaves replicator.master.connect.uri must contain a URI that matches the replicator.master.listen.uri of the master to which the slave will connect.

• The extractor binlog directory replicator.extractor.mysql.binlog_dir must be the same name as defined with the log_bin option in the my.cnf MySQL configuration file. The sample value works for many Linux distributions. Similarly the binlog_file_pattern must match the prefix of MySQL binlog files.

• The built-in backup/restore capability is configured in replicator.properties. For more on backup configuration see Section 3.4.2, “Backup Configuration”.

Most other property values should not require change for standard MySQL installations. The replicator.properties file is well-commented and in most cases self-explanatory. For more information on replicator configuration, please refer to Section 3.2, “Replicator Configuration”.

Warning

Windows file names must either use forward slash characters (/) or double backslash characters (\\). Single backslash characters are interpreted as escape characters and will be removed leaving an invalid file name.

If you have followed these instructions so far and have MySQL installed, you should not need to make any other changes.
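The properties listed above can be summarized in a small Python sketch that produces matching settings for a master and a slave. The property names come from this section; everything else is an assumption made for illustration: the function name, the host names, the thl:// URI scheme, and port 2112 should be checked against the comments in your own replicator.properties.mysql file. The sketch also converts backslashes to forward slashes, following the Windows file name warning above.

```python
from typing import Dict


def replicator_properties(role: str, host: str, master_host: str,
                          db_user: str, db_password: str,
                          binlog_dir: str, port: int = 2112) -> Dict[str, str]:
    """Return the main replicator.properties settings for one host.

    The "thl://" scheme and the default port are assumptions, not taken
    from this guide.
    """
    if role not in ("master", "slave"):
        raise ValueError("role must be 'master' or 'slave'")
    return {
        "replicator.role": role,
        "replicator.auto_enable": "true",
        "replicator.source_id": host,                      # unique per replicator
        "replicator.global.db.user": db_user,
        "replicator.global.db.password": db_password,
        # Use the routable host name, never localhost.
        "replicator.master.listen.uri": f"thl://{host}:{port}/",
        "replicator.master.connect.uri": f"thl://{master_host}:{port}/",
        # Forward slashes are safe on Windows as well.
        "replicator.extractor.mysql.binlog_dir": binlog_dir.replace("\\", "/"),
    }


def render(properties: Dict[str, str]) -> str:
    """Format the settings as key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in properties.items())


master = replicator_properties("master", "db1", "db1", "tungsten", "secret",
                               "/var/lib/mysql")
slave = replicator_properties("slave", "db2", "db1", "tungsten", "secret",
                              "/var/lib/mysql")
print(render(slave))
```

Both hosts receive the same set of keys and differ only in role, source ID, and listen URI, which keeps the configurations easy to compare and simplifies a later role change.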

2.2.5. Setting Up a Simple Master/Slave Configuration

This chapter explains how to start Tungsten Replicator and set up a simple master/slave configuration.

Important

In Linux, Solaris, and Mac OS X, the login used to run Tungsten Replicator must have permissions to read MySQL binlog files. Add the Tungsten Replicator login to the mysql group, which will allow it to read but not write to the logs.

Tungsten Replicator is run and configured by using shell scripts residing in the bin directory.

In Linux, Solaris, and Mac OS X, use the scripts below:

• replicator is used to start Tungsten Replicator as a service. This is the preferred way to run the replicator process on Linux hosts.

• trepctl is used to control Tungsten Replicator.

In Windows, use the scripts below:

• trepstart.bat is used to start Tungsten Replicator.

• trepctl.bat is used to control Tungsten Replicator.

To start Tungsten Replicator in Linux, Solaris, or Mac OS X, proceed as follows:

1. Dump the master database and upload it to all slaves in the cluster. For example, issue the following command on the master:

mysqldump -uuser -ppassword -hmaster_host --all-databases > mydump.sql

On the slave:

mysql -uuser -ppassword -hslave_host < mydump.sql

Note

On Debian based distributions, you may have to copy the password value in /etc/mysql/debian.cnf from the master to the slave after taking a backup. Otherwise MySQL scripts will not work.

Tip

Slaves may also be provisioned using the built-in backup capability of Tungsten Replicator. See Section 3.6, “Provisioning New Slaves” for more information.

2. On master and all slaves, start the Tungsten Replicator process:

replicator start

trepctl online

To start Tungsten Replicator in Windows, proceed as follows:

1. On master and all slaves, start the Tungsten Replicator process:

trepstart

2. In a separate command window, start replication.

trepctl online

Note

If you set the replicator.auto_enable property to true, the replicator will go online automatically without needing to enter trepctl online. This is very handy when running the replicator as a service using replicator.

This is all it takes to start the Tungsten Replicator master and slaves. You should now have your master and slave Tungsten Replicators running, and you can check the replication by making some data changes in the master database and verifying that the changes are reflected in the slave databases.

The use of the trepctl/trepctl.bat command is documented in Chapter 10, Command Reference Guide. See also Section 10.2, “Running Tungsten Replicator as an Operating System Service”.

2.2.6. MySQL Database Housekeeping

Tungsten Replicator has certain requirements specific to MySQL that must be met for replication to function correctly. This section provides an overview of administrative settings and other information that will ensure smooth operation. For more information please refer to MySQL server documentation at http://www.mysql.com.

2.2.6.1. MySQL Server Parameter Settings

Tungsten Replicator has minimal dependencies on MySQL server parameters but does have a few required settings. Server parameters are normally set in my.cnf. Location of this file varies by OS platform and distribution; on Linux systems it is commonly located in /etc/my.cnf. Here is an example of the minimum my.cnf parameter settings for Tungsten Replicator.

[mysqld]
# Master replication settings.
log-bin=mysql-bin
server-id=1

# InnoDB parameter settings for Tungsten. Buffer pool size may be
# larger but should not be smaller for production deployments.
innodb_buffer_pool_size = 512M
default-table-type=InnoDB
innodb_flush_log_at_trx_commit=2
sync_binlog=1

# Recommended general settings. max_allowed_packet must be greater than
# the size of the largest transaction.
max_allowed_packet=48m

The following table summarizes recommended usage of the foregoing parameters.

Table 2.1. Minimum MySQL Server Parameters for Tungsten Replicator

Parameter Name: default-table-type
Usage Notes: Determines the default table type for CREATE TABLE commands. Tungsten commits the current slave position in a side table when applying data to slaves. If all table types are InnoDB, this ensures that Tungsten will always restart at the correct position following a database crash. Slave crash-safety cannot be guaranteed for MyISAM.
Recommended Setting: InnoDB is strongly recommended.

Parameter Name: innodb_buffer_pool_size
Usage Notes: Determines the memory allocated to the InnoDB buffer pool. If the value is too low, operations to clear the history table will fail.
Recommended Setting: 512M is the minimum recommended value for production deployments, even for those applications largely based on MyISAM tables. InnoDB-based applications should set this value as high as possible for available memory.

Parameter Name: innodb_flush_log_at_trx_commit
Usage Notes: See discussion in Section 2.2.6.5, “Avoiding Binlog Corruption”.

Parameter Name: log-bin
Usage Notes: Prefix of MySQL binlog files. Must match the value of replicator.extractor.mysql.binlog_file_pattern in replicator.properties.
Recommended Setting: Any value is OK

Parameter Name: max_allowed_packet
Usage Notes: Determines the largest transaction that can be stored in the Tungsten Replicator transaction history log. If using statement replication this is approximately the number of bytes in all statements of the largest transaction plus about 10%. If using row replication, it is approximately twice the total bytes of the row changes in the largest transaction. Insufficiently large values will cause replication to fail with a MySQL "Packet for query is too large" error.
Recommended Setting: 16m

Parameter Name: server-id
Usage Notes: Not directly used by Tungsten Replicator. However, we recommend using a different value per server in accordance with standard MySQL usage. This is especially helpful if you run mixed configurations with Tungsten and MySQL native replication.
Recommended Setting: Follow MySQL recommendations

Parameter Name: sync_binlog
Usage Notes: See discussion in Section 2.2.6.5, “Avoiding Binlog Corruption”.

Additional optional parameters are discussed in succeeding sections. For all other parameters, follow standard MySQL recommendations.
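The sizing rule for max_allowed_packet can be turned into a quick estimate. The following Python sketch applies the two approximations from the usage notes (statement bytes plus about 10%, or twice the row change bytes). The function name and the sample transaction sizes are hypothetical, and the result is only a lower bound under those approximations, not an exact requirement.

```python
def min_packet_bytes(largest_txn_bytes: int, row_replication: bool) -> int:
    """Estimate the smallest workable max_allowed_packet, in bytes."""
    if row_replication:
        return 2 * largest_txn_bytes                 # about twice the row changes
    # Statement replication: statement bytes plus about 10%, rounded up.
    return largest_txn_bytes + -(-largest_txn_bytes // 10)


MIB = 1024 * 1024
configured = 48 * MIB                                # max_allowed_packet=48m

statement_estimate = min_packet_bytes(10_000_000, row_replication=False)
row_estimate = min_packet_bytes(30 * MIB, row_replication=True)

statement_fits = statement_estimate <= configured
row_fits = row_estimate <= configured
```

Integer arithmetic is used deliberately so that the 10% margin is rounded up exactly instead of being subject to floating-point error.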

2.2.6.2. Statement versus Row Replication in MySQL 5.1

MySQL releases from version 3.23 to 5.0 used statement replication only. MySQL 5.1 offers a choice between statement replication and row replication, which affects how SQL changes are recorded in the binlog. In general row replication is more flexible. However, row replication is newer and in some cases generates much larger amounts of data in the binlog.

MySQL 5.1 and greater versions support row vs. statement replication using the binlog_format variable. The default value can be set for the server using the my.cnf file or directly using a SET SQL command. The following example illustrates how to enable row replication globally through SQL.

mysql> set @@global.binlog_format='ROW';

The following example shows how to enable statement replication for a single SQL session.

mysql> set @@session.binlog_format='STATEMENT';

Note

Row replication and statement replication have trade-offs and special requirements in order to work properly. In general, these conditions apply to Tungsten Replicator as well, since they affect how SQL changes are written to the binlog. You should read these carefully in the MySQL product documentation. Recent versions of MySQL default to 'mixed' mode, which uses both.

2.2.6.3. Handling SQL Mode Settings

MySQL permits variations in SQL syntax using the server sql_mode variable. These variations can lead to problems on slaves, which must match the master's SQL mode settings to apply updates correctly through JDBC, the Java database connectivity interface used by Tungsten Replicator.

Tungsten Replicator allows users to set JDBC options required to handle non-strict SQL mode settings using the replicator.applier.mysql.url_options setting in the sample replicator.properties.mysql file. These settings work for all MySQL servers against which Tungsten Replicator is certified and do not normally need to be changed. Be aware, however, that altering these options may lead to errors, as the slave may no longer be able to handle non-strict SQL updates from the master.

2.2.6.4. Truncating Binlogs

MySQL binlogs accumulate until deleted or otherwise archived by users. With MySQL native replication this means you must wait until all slaves have received events from binlog files, at which point they may be deleted. With Tungsten Replicator binlog management is much simpler. As soon as binlog file contents have been completely stored in the Tungsten Replicator Transaction History Log (THL) you may delete the file.

The THL stores the binlog position in the THL catalog table named trep_commit_seqno. You can query this table to find out the current position of the THL with respect to the binlogs as follows:

mysql> select seqno, eventid from trep_commit_seqno order by seqno desc limit 1;
+-------+-----------------+
| seqno | eventid         |
+-------+-----------------+
|  9402 | 000041:10428573 |
+-------+-----------------+
1 row in set (0.08 sec)

In this example, the THL has reached byte 10428573 of binlog file mysql-bin.000041. (Your prefix may differ; this is a standard form when running MySQL out of the box.) Files mysql-bin.000040 and before can be safely deleted.

Warning

Deleting a binlog that Tungsten Replicator is currently reading may cause unexpected failures or data inconsistencies.
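The rule for deciding which binlog files are safe to delete can be expressed in a short Python sketch. It assumes the eventid format shown in the query output (binlog file number, a colon, then a byte offset) and the default mysql-bin prefix; the function names are hypothetical. The sketch only computes a list of candidates and does not delete anything.

```python
from typing import List, Tuple


def parse_eventid(eventid: str) -> Tuple[str, int]:
    """Split an eventid such as '000041:10428573' into file number and offset."""
    file_number, offset = eventid.split(":")
    return file_number, int(offset)


def deletable_binlogs(binlog_files: List[str], eventid: str,
                      prefix: str = "mysql-bin") -> List[str]:
    """Return binlog files numbered strictly below the one the THL is reading."""
    file_number, _ = parse_eventid(eventid)
    current = int(file_number)
    result = []
    for name in sorted(binlog_files):
        base, _, suffix = name.rpartition(".")
        # Skip the index file and anything that is not a numbered binlog.
        if base == prefix and suffix.isdigit() and int(suffix) < current:
            result.append(name)
    return result


files = ["mysql-bin.000039", "mysql-bin.000040", "mysql-bin.000041", "mysql-bin.index"]
candidates = deletable_binlogs(files, "000041:10428573")
```

The file that the THL is currently reading is never included, in line with the warning above.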

2.2.6.5. Avoiding Binlog Corruption

MySQL binlogs are subject to problems in which the binlog content gets out of sync with the database content, typically following a crash. When this happens the binlog may contain events that were not committed in the database or the database may contain commits that are not represented by events in the binlog. Tungsten Replicator in this case may read the corrupt files leading to data inconsistencies in slaves.

Binlog corruption problems can be minimized by the following:

1. Use InnoDB rather than MyISAM and/or older releases. InnoDB performs a two-phase commit to the binlog that helps avoid inconsistencies.

2. Set innodb_flush_log_at_trx_commit=2 and sync_binlog=1 with battery-backed cache to allow InnoDB to operate quickly but with minimal risk of lost writes to disk.

There are also many excellent recommendations in the MySQL community concerning this problem.

2.3. Installation and Configuration for PostgreSQL Warm Standby

This chapter explains how to install and configure Tungsten Replicator for PostgreSQL using Warm Standby. It is assumed that each Tungsten Replicator instance runs on a separate database node.

Note

PostgreSQL replication management is a commercial feature. It is not supported in the community version.

2.3.1. Installation Prerequisites

Tungsten Replicator supports PostgreSQL versions 8.2 to 8.4. Before attempting to bring up the replicator, ensure that your host also has a copy of the pg_standby program, which is distributed in the PostgreSQL contrib package.

Please refer to Section 2.1, “Replicator Configuration For All Database Types” for additional prerequisites required for Tungsten Replicator.

2.3.2. Setting Up a User Account

Tungsten Replicator must run using the postgres account. You must be able to log in (or sudo) to this account, and all Tungsten Replicator files must be owned by this account.

The postgres account must be able to carry out passwordless SSH logins and SCP file transfers to other hosts belonging to the cluster.

Warning

Running Tungsten Replicator with a different account can alter file permissions in a way that will prevent PostgreSQL from starting. Do not use the root account!

2.3.3. Preparing the Database

Prepare the database nodes as follows:

1. Install PostgreSQL 8.4 (preferred) or an earlier release back to version 8.2

2. Ensure that the PostgreSQL server has a network listener installed. Edit postgresql.conf as shown below:

listen_addresses = '*' # what IP address(es) to listen on;
port = 5432 # database port

3. Configure the postgres account to allow login without passwords both locally and between hosts. You can do this first by configuring pg_hba.conf as shown in the following example. This uses ident for local logins and md5 passwords for remote logins.

# "local" is for Unix domain socket connections only
local all postgres ident

# TCP/IP connections on 172.16.238.0 subnet.
host all all 172.16.238.1/24 md5
host replication all 172.16.238.1/24 md5

Next, configure the .pgpass file of the postgres account to supply the password for the postgres account automatically.

*:*:*:postgres:secret

4. Add a database user that can perform DBMS backups and can be used for monitoring the database. By convention this account is named tungsten. Here are sample commands to create the user and its database.

postgres=# create USER tungsten superuser password 'secret';
CREATE ROLE
postgres=# create database tungsten owner tungsten;
CREATE DATABASE
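The .pgpass entry used in step 3 follows the PostgreSQL password file format host:port:database:user:password. As a rough sketch, the Python below formats such an entry and writes the file with owner-only permissions, which the PostgreSQL client library requires on Unix before it will read the file. The helper names pgpass_line and write_pgpass are hypothetical and not part of Tungsten Replicator.

```python
import os


def pgpass_line(host, port, database, user, password):
    """Format one .pgpass entry: host:port:database:user:password."""
    def escape(field):
        # Backslashes and colons inside a field must be backslash-escaped.
        return str(field).replace("\\", "\\\\").replace(":", "\\:")
    return ":".join(escape(f) for f in (host, port, database, user, password))


def write_pgpass(path, lines):
    """Write the entries so that only the file owner can read or write them."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    os.chmod(path, 0o600)  # also tightens a file that already existed


if __name__ == "__main__":
    # Prints the wildcard entry shown in step 3.
    print(pgpass_line("*", "*", "*", "postgres", "secret"))
```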

2.3.4. Installing and Configuring Tungsten Replicator

To install Tungsten Replicator with PostgreSQL warm standby, proceed as follows:

1. Log in with the postgres account. (You must also use this account to start the replicator.)

Copy the distribution archive to the database nodes and unpack at the location of your choice. In the following code examples, we will use this location as the default directory.

On Linux, Solaris, MacOS X, and other Unix variants we recommend installing in directory /opt/continuent.

2. Configure Tungsten Replicator instances.

a. In the unpacked distribution, cd to the conf directory and copy replicator.properties.postgresql to replicator.properties. Here are sample commands for Linux and Solaris.

cd conf
cp replicator.properties.postgresql replicator.properties
chmod 700 replicator.properties

Warning

The replicator.properties file and other configuration files contain passwords. To ensure security these must be owned by the postgres account and have restricted permissions.

b. Edit replicator.properties and set properties required by the replicator. The following list summarizes the main properties that need to change.

Tip

In general you should configure slave and master properties identically except where noted below. This makes your configurations easier to understand and also helps with operations like failover where replicators change roles.

• replicator.role must be set to master or slave depending on whether the replicator is extracting from a master database or applying on a slave.

• replicator.auto_enable may optionally be set to true. If so, the replicator will automatically go into the online state and begin replicating data on startup.

• replicator.source_id must be a unique name for each Replicator. The host name is a good value if you have one Tungsten Replicator per host.

• On a standby, replicator.master.connect.uri must be a URI of the form wal://masterhost/ where masterhost is the host name where the master database runs. For a master the URI can be wal://localhost/.

• A set of script-related values must also be configured. These values are shown in the example.

• The built-in backup/restore capability is configured in replicator.properties. For more on backup configuration see Section 3.4.2, “Backup Configuration”.

Most other property values should not require change for standard PostgreSQL Warm Standby installations. The replicator.properties file is well-commented and in most cases self-explanatory. For more information on replicator configuration, please refer to Section 3.2, “Replicator Configuration”.

c. Copy sample.postgresql-wal.properties to postgresql-wal.properties and fill in properties in the file. The following example shows correctly filled out values for a master named centos5a.

####################################
# SAMPLE.POSTGRESQL-WAL.PROPERTIES #
####################################
#
# This file contains properties for WAL-based PostgreSQL replication.

# PostgreSQL home directory.
postgresql.data=/var/lib/pgsql/data

# PostgreSQL configuration file.
postgresql.conf=/var/lib/pgsql/data/postgresql.conf

# Standard archive log directory.
postgresql.archive=/var/lib/pgsql/archive

# Master host name.
postgresql.master.host=

# Master database administrator user name and password.
postgresql.master.user=postgres
postgresql.master.password=

# Database role. Acceptable values are 'master' and 'standby'.
postgresql.role=master

# PostgreSQL boot command. The postgres user must be able to execute
# this command without a password. The script must accept standard
# stop/start/restart options.
postgresql.boot.script=/etc/init.d/postgresql

# Archive timeout. Maximum time before sending an unfilled WAL buffer to
# standby. This is your maximum data loss.
postgresql.archive_timeout=60

# Location of pg_standby executable.
postgresql.pg_standby=pg_standby

# Location of pg_standby trigger file.
postgresql.pg_standby.trigger=/tmp/pgsql.trigger

# Command prefix used for root commands. 'sudo' is most common for
# non-privileged accounts.
postgresql.root.prefix=sudo
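Before running the installation it can help to sanity-check the filled-in file. The Python sketch below is not part of Tungsten Replicator: the function names are hypothetical, the parser handles only plain key=value lines and # comments, and two of the checks (a standby needs a master host, the archive timeout is a positive integer) are assumptions rather than documented rules. Only the role check comes directly from the comment in the sample file.

```python
def load_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_wal_properties(props):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    role = props.get("postgresql.role", "")
    if role not in ("master", "standby"):
        problems.append("postgresql.role must be 'master' or 'standby'")
    # Assumption: a standby needs to know where its master runs.
    if role == "standby" and not props.get("postgresql.master.host"):
        problems.append("postgresql.master.host is empty on a standby")
    # Assumption: the archive timeout is a positive whole number.
    timeout = props.get("postgresql.archive_timeout", "")
    if not timeout.isdigit() or int(timeout) == 0:
        problems.append("postgresql.archive_timeout must be a positive integer")
    return problems


if __name__ == "__main__":
    sample = "postgresql.role=master\npostgresql.archive_timeout=60\n"
    print(check_wal_properties(load_properties(sample)))
```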

2.3.5. Setting Up a Simple Master/Slave Configuration

This section explains how to start Tungsten Replicator and set up a simple master/slave configuration.

Tungsten Replicator is run and configured by using programs residing in the bin directory.

1. On the master host run the warm standby installation procedure to configure the PostgreSQL server and write Tungsten Replicator configuration files.

bin/pg-wal-plugin -c conf/postgresql-wal.properties -o install

Note

The Tungsten configure program runs this part of installation automatically.

Warning

The installation procedure restarts the PostgreSQL server.

2. Start the master replicator and bring it online if auto-enabling is not set.

replicator start
trepctl online

3. On the standby host, run the warm standby installation procedure to configure the PostgreSQL server and write Tungsten Replicator configuration files.

bin/pg-wal-plugin -c conf/postgresql-wal.properties -o install

Note

The Tungsten configure program runs this part of installation automatically.

4. Start the standby replicator and bring it online if auto-enabling is not set.

replicator start
trepctl online

Note

The online operation provisions the standby from the master using rsync. Going online may therefore take a long time for large databases.

Note

If you set the replicator.auto_enable property to true, the replicator will start automatically without needing to enter trepctl online. This is very handy when running the replicator as a service using replicator.

Use of the trepctl command is documented in Chapter 10, Command Reference Guide. See also Section 10.2, “Running Tungsten Replicator as an Operating System Service”.
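The per-host sequence above (install, start, then online unless auto-enabling is set) can be captured as data. The Python sketch below only builds the command lists taken from the steps in this section and runs nothing; the function name bring_up_commands and its parameters are hypothetical.

```python
def bring_up_commands(auto_enable=False, config="conf/postgresql-wal.properties"):
    """Return the commands for one host, in the order given in this section."""
    commands = [
        # Warm standby installation: configures PostgreSQL and writes config files.
        ["bin/pg-wal-plugin", "-c", config, "-o", "install"],
        ["replicator", "start"],
    ]
    # With replicator.auto_enable=true the replicator goes online by itself.
    if not auto_enable:
        commands.append(["trepctl", "online"])
    return commands


if __name__ == "__main__":
    for command in bring_up_commands():
        print(" ".join(command))
```

Run the sequence on the master first and then on the standby, for example by passing each list to subprocess.run with check=True from the installation directory.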

Chapter 3. Basic Principles of Operation

The basic operating principles of the Tungsten Replicator are explained in this chapter. Understanding these principles will allow you to set up and manage individual replicator instances.

3.1. The Tungsten Replicator Process

3.1.1. Overview

A Tungsten Replicator instance is an operating system process that manages extracting or applying SQL updates. You can configure replicator behavior through the replicator.properties configuration file, which Tungsten Replicator reads automatically on start-up.

Tungsten Replicator supports two general replication mechanisms: native replication, in which the replicator process reads and applies logs directly, and managed replication, in which it manages another replication method. The component responsible for the overall replication mechanism is called an Open Replicator Plugin. The overall replicator process model (including states and administrative APIs) is identical regardless of the plugin type.

In most installations, the replicator functions in one of two roles: as a master or as a slave. The master extracts SQL changes from a log and stores them for distribution as replication events. The slave receives replication events and applies them to a target, which is usually but not always a SQL database.

It is also possible for a replicator to have other roles. Tungsten native replication allows users to define arbitrary roles using pipelines, which are configurable flows of replication events. Non-native replication like Tungsten PostgreSQL replication supports only the master and slave roles.

Note

Slaves can also function as masters to other slaves. For example, you can set up a configuration in which a slave points to another slave, which in turn points to the real master. The slave in the middle is known as a relay slave.

3.1.2. Replication States

Tungsten Replicator has a simple operational model based on clearly defined states and transitions between them. Replicator states and commands to change them are summarized below.

• START. Tungsten Replicator processes automatically enter this state on start-up, read the replicator.properties file, and then go into the OFFLINE state.

• OFFLINE. In this state, the Tungsten Replicator is idle and does not process replication events or access databases. The Tungsten Replicator enters this state after successful configuration or following a successful offline command. Users can issue an online command to start replication. The Tungsten Replicator may also enter this state automatically following a fatal replication error.

OFFLINE has two sub-states, which are listed below.

• OFFLINE:ERROR. This state indicates that Tungsten Replicator is off-line following an error of some kind. The error message that caused this is easily accessible and is preserved until a successful administrative command such as offline causes it to exit the error state.

• OFFLINE:NORMAL. This state indicates that Tungsten Replicator is off-line following a successful administrative operation such as offline or following normal start-up.

• GOING-ONLINE. In this state, the Tungsten Replicator is preparing to begin replication. It has fully allocated resources and accesses databases.

GOING-ONLINE has two sub-states, which are listed below.

• GOING-ONLINE:RESTORING. This state indicates that Tungsten Replicator is performing a database restore.

• GOING-ONLINE:SYNCHRONIZING. The Tungsten Replicator enters this state whenever it is catching up with a master. This normally occurs when a user issues an online command. It also occurs in response to internal events such as when a slave loses its connection to the master. The Tungsten Replicator transitions automatically to the ONLINE state when it detects that it is synchronized with the master and ready to begin applying events.

• ONLINE. In this state the replicator is processing replication events. It accesses databases as necessary to perform its assigned role. The replicator also provides status information showing the current replication position, such as the current log sequence number being processed.

Replicators can leave the ONLINE state in three main ways. First, a user may issue an offline command to put the replicator into the GOING-OFFLINE state. Second, a fatal error may occur, which puts the replicator into the OFFLINE:ERROR state. Third, a replicator in a slave role may lose synchronization with its master. This puts the replicator into the GOING-ONLINE:SYNCHRONIZING state.

• GOING-OFFLINE. In this state, the Tungsten Replicator is releasing resources and shutting down replication. Assuming shutdown occurs normally, the replicator transitions automatically into the OFFLINE:NORMAL state.

Note

The online command sends the replicator into the GOING-ONLINE:SYNCHRONIZING state before bringing it online. It may take some time before this state change is fully complete. Most other state changes are complete when the command returns. You can use the trepctl wait command to wait for the replicator to go fully online.
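The states and transitions described in this section can be summarized as a small transition table. The Python sketch below is a reading of the text above, not Tungsten Replicator code: the state names come from this section, but the event labels (configured, synchronized, sync_lost, shutdown_complete) are invented for illustration, and the GOING-ONLINE:RESTORING sub-state is left out because the text does not say what triggers it.

```python
# (current state, event) -> next state, following the descriptions above.
TRANSITIONS = {
    ("START", "configured"): "OFFLINE:NORMAL",
    ("OFFLINE:NORMAL", "online"): "GOING-ONLINE:SYNCHRONIZING",
    ("OFFLINE:ERROR", "online"): "GOING-ONLINE:SYNCHRONIZING",
    ("OFFLINE:ERROR", "offline"): "OFFLINE:NORMAL",
    ("GOING-ONLINE:SYNCHRONIZING", "synchronized"): "ONLINE",
    ("ONLINE", "offline"): "GOING-OFFLINE",
    ("ONLINE", "error"): "OFFLINE:ERROR",
    ("ONLINE", "sync_lost"): "GOING-ONLINE:SYNCHRONIZING",
    ("GOING-OFFLINE", "shutdown_complete"): "OFFLINE:NORMAL",
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state}") from None


def run(events, state="START"):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = next_state(state, event)
    return state


if __name__ == "__main__":
    print(run(["configured", "online", "synchronized"]))
```

Keeping the transitions in one table makes it easy to see, for example, that ONLINE is only ever reached through GOING-ONLINE:SYNCHRONIZING.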

3.2. Replicator Configuration

3.2.1. The replicator.properties File

The replicator.properties file contains static configuration information for Tungsten Replicator. By static we mean that the properties are read once and do not change again unless the file changes and is reread when the replicator restarts or receives a configure command while in the OFFLINE state. The replicator.properties file is located by default in directory conf and must be properly configured for the replicator process to run.

Configuration parameters have a well-defined form that allows for global parameters that apply to the replicator as a whole as well as parameters that are specific to individual plug-ins. The rules are as follows.

• All replication properties start with the prefix replicator. Property names consist of multiple parts separated by periods.

• Global replicator properties have two parts. They are values that apply to the replicator process as a whole. Here are two examples.

replicator.source_id=master1
replicator.schema=tungsten

• The replicator configuration defines a number of plugins that have logical names and properties. Within the configuration, plugins are referenced by their logical names, as shown in the following example, which references plugins named thl-local, mysql, and mysqlsessions.

replicator.stage.thl-to-dbms=com.continuent.tungsten.\
replicator.pipeline.SingleThreadStageTask
replicator.stage.thl-to-dbms.extractor=thl-local
replicator.stage.thl-to-dbms.applier=mysql
replicator.stage.thl-to-dbms.filters=mysqlsessions

• There is a 3-part property name for each plug-in that defines the Java class that implements the plug-in. This name has the following syntax:

replicator.[plug-in type].[logical name]=[java class]

Here are some example plug-in class definitions.

replicator.extractor.mysql= \
com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractor

replicator.filter.dbswitcher= \
com.continuent.tungsten.replicator.filter.DbNameSwizzler
replicator.applier.mysql= \
com.continuent.tungsten.replicator.applier.MySQLApplier

Tungsten Replicator plug-in classes must be in the class path. Replicator configuration will fail if a plug-in class cannot be found and instantiated.

• There is a 4-part property name for each configuration property on the plug-in itself. Each of these defines a plug-in property value that will be set when the plug-in is instantiated. The names of these parameters have the following syntax:

replicator.[plug-in type].[plug-in name].[property]=[value]

Here are some example plug-in property definitions.

replicator.applier.mysql.host=localhost
replicator.applier.mysql.user=tungsten
replicator.applier.mysql.password=secret

Plug-in properties must exist on the plug-in implementation class. Replicator configuration will fail if any property cannot be found and set on the plug-in to which it applies.

• The replicator.properties file allows properties to be used as variables that set other property values.The following example shows definition and use of such variables.

# Database connection information.
replicator.global.db.user=tungsten
replicator.global.db.password=secret
...
replicator.store.thl.user=${replicator.global.db.user}
replicator.store.thl.password=${replicator.global.db.password}

The sample replicator.properties files provided in the Tungsten distribution use such variables to set widely duplicated values like database logins and passwords from a single location.
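To make the file rules concrete, the following Python sketch reads property text, joins lines that end with a backslash, and expands ${...} references to other properties. It illustrates the semantics described above and is not the parser Tungsten Replicator uses (that is Java code); the function names are hypothetical, and a reference to an undefined property simply raises KeyError.

```python
import re

VARIABLE = re.compile(r"\$\{([^}]+)\}")


def parse_properties(text):
    """Parse key=value lines, joining lines that end with a backslash."""
    props, pending = {}, ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("\\"):
            pending += line[:-1].strip()  # keep the fragment, drop the backslash
            continue
        line, pending = pending + line, ""
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def expand(props):
    """Replace each ${name} with the (expanded) value of property `name`."""
    def resolve(key, seen=()):
        if key in seen:
            raise ValueError(f"circular reference involving {key}")
        return VARIABLE.sub(lambda m: resolve(m.group(1), seen + (key,)), props[key])
    return {key: resolve(key) for key in props}


if __name__ == "__main__":
    text = (
        "replicator.global.db.user=tungsten\n"
        "replicator.store.thl.user=${replicator.global.db.user}\n"
    )
    print(expand(parse_properties(text))["replicator.store.thl.user"])
```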

3.2.2. Dynamic Properties

Tungsten Replicator allows the properties that control the replicator role to be set dynamically while the replicator process is running. Tungsten Replicator stores such dynamic properties in file conf/dynamic.properties. When starting or processing a configure command, the replicator process first reads static properties in replicator.properties followed by dynamic properties. In this way, dynamic values override the original static values.

Dynamic properties to control the replicator role are set using the trepctl setrole command, as illustrated in the following example. Note that Tungsten Replicator must be in the OFFLINE state for changes to be accepted.

trepctl offline
trepctl setrole -role slave value thl://guppy/
trepctl online

Dynamic properties are cleared using the trepctl clear command. This command resets all properties to their static values and deletes the dynamic.properties file. Like the setrole command it may only run when the replicator process is in the OFFLINE state.

trepctl offline
trepctl clear
trepctl online

You may also clear dynamic properties by removing the dynamic.properties file and restarting the replicator process.
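The override order described above (static properties first, dynamic properties second) amounts to a dictionary merge. A minimal Python sketch follows. It assumes, for illustration only, that replicator.role and replicator.master.connect.uri are the properties held in dynamic.properties; the function name is hypothetical.

```python
def effective_properties(static_props, dynamic_props):
    """Dynamic values are read last, so they override static values."""
    merged = dict(static_props)
    merged.update(dynamic_props)
    return merged


if __name__ == "__main__":
    static_props = {"replicator.role": "master", "replicator.schema": "tungsten"}
    # Assumed content of conf/dynamic.properties after a setrole command.
    dynamic_props = {
        "replicator.role": "slave",
        "replicator.master.connect.uri": "thl://guppy/",
    }
    print(effective_properties(static_props, dynamic_props)["replicator.role"])
    # After trepctl clear the dynamic set is empty and static values apply again.
    print(effective_properties(static_props, {})["replicator.role"])
```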

3.2.3. Open Replicator Plugin

The Open Replicator Plugin controls the replication mechanism. There is currently a native Tungsten plugin that implements log reading and event application and a script plugin that manages other replication types. One and only one of these must be selected using the following property syntax.

# Available OpenReplicator providers
replicator.plugin.tungsten=com.continuent.tungsten.replicator.management.\
tungsten.TungstenPlugin
replicator.plugin.script=com.continuent.tungsten.replicator.management.\
script.ScriptPlugin

# Chosen OpenReplicator provider
replicator.plugin=tungsten

# Script provider's root dir and configuration file. Fill these
# if ScriptPlugin is chosen replicator provider
#
#replicator.script.root_dir=@SCRIPTROOT@
#replicator.script.conf_file=@SCRIPTCONF@
#replicator.script.processor=@SCRIPTFILE@

The script plugin requires additional properties to define the root directory for scripts and other information for correct operation. For more information on the script plugin and Open Replicator plugins in general see Appendix B, Tungsten Open Replicator Specification.

Note

All other plugin components used in the replicator.properties file and described elsewhere in this document apply only to the native Tungsten plugin. For example, PostgreSQL Warm Standby replication does not use other plugin components.

3.2.4. Pipelines and Stages

Pipelines are replication processing flows consisting of one or more stages that work on events. Events flow from one stage to the next. Each stage is scheduled independently and can run on a separate core of a multi-core/multi-user host. This architecture parallelizes replication processing in a simple and intuitive way.

Pipelines have symmetric interfaces that allow stages and their constituent parts to be recombined easily. For example, stages use the same interfaces to extract events whether extracting from a database log, pulling events from the Tungsten Replicator log, or reading events from another replicator on the network. This symmetry allows reuse of existing components in many different types of replication flows. Pipelines allow significant performance improvements by spreading CPU-intensive processing across stages running on different CPUs as well as enabling block commit to reduce I/O.

There is a pipeline definition for each role supported by the replicator. By convention replicators have a master and slave pipeline definition, but it is possible to define many other kinds of pipelines. Specialized pipelines are discussed in more detail in Section 4.1, “Specialized Pipeline Extensions”.

Stages consist at a minimum of an extractor, which fetches events, and an applier, which applies the events. The stage processing flow consists of a loop that fetches events as quickly as possible from the extractor and hands them over to the applier. The extractor and applier components are interfaces and correspond to any operation that reads events and writes events respectively.

Stages may also define one or more filters, which can inspect, change, or drop events entirely before they are applied. Filters enable a wide range of operations on events, from suppressing undesirable SQL to implementing time delays to transforming updates. Tungsten uses filters internally to implement operations like identifying sessions and recognizing heartbeat events.

In addition to stages, pipelines may have one or more stores. Stores provide storage of replicated events before they are applied and may persist between replicator restarts or be in-memory only. Stores have extractor and applier interfaces that allow them to serve as start and end-points of stages.
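The stage loop, filters, and a store between stages can be sketched in a few lines of Python. This illustrates the concepts above and is not Tungsten's Java implementation: run_stage, run_pipeline, and the END marker are hypothetical names, events are plain strings, and a filter drops an event by returning None.

```python
import queue
import threading

END = object()  # marks the end of the event stream in this sketch


def run_stage(extract, apply, filters=()):
    """Stage loop: fetch an event, pass it through filters, hand it to the applier."""
    while True:
        event = extract()
        if event is END:
            apply(END)  # propagate end-of-stream to whatever comes next
            return
        for event_filter in filters:
            event = event_filter(event)
            if event is None:  # a filter dropped the event
                break
        if event is not None:
            apply(event)


def run_pipeline(source_events, filters=()):
    """Run two stages concurrently with an in-memory store between them."""
    store = queue.Queue(maxsize=100)
    source = iter(list(source_events) + [END])
    applied = []

    def sink(event):
        if event is not END:
            applied.append(event)

    first = threading.Thread(target=run_stage, args=(lambda: next(source), store.put))
    second = threading.Thread(target=run_stage, args=(store.get, sink, filters))
    first.start()
    second.start()
    first.join()
    second.join()
    return applied


if __name__ == "__main__":
    drop_comments = lambda event: None if event.startswith("--") else event
    print(run_pipeline(["INSERT 1", "-- noise", "INSERT 2"], [drop_comments]))
```

The store exposes put (used as an applier by the first stage) and get (used as an extractor by the second), which mirrors the way stores serve as end and start points of stages.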

The following example shows how to define pipelines in the replicator.properties file. The first property lists available pipelines. The properties thereafter define the slave pipeline, which consists of two stages. The first stage extracts events from a master transaction history log (THL) and applies them to the local transaction history log. The second stage retrieves events from the local log and applies them to the slave database. At run-time these two stages run concurrently, so that event application on slaves is not blocked by extraction from the remote THL and vice versa.

# Generic pipelines.
replicator.pipelines=master,slave,direct

...
# Slave pipeline has two stages: extract from remote THL to local THL;
# extract from local THL and apply to DBMS.
replicator.pipeline.slave=remote-to-thl,thl-to-dbms
replicator.pipeline.slave.stores=thl

replicator.stage.remote-to-thl=com.continuent.tungsten.replicator.pipeline.\
SingleThreadStageTask
replicator.stage.remote-to-thl.extractor=thl-remote
replicator.stage.remote-to-thl.applier=thl-local

replicator.stage.thl-to-dbms=com.continuent.tungsten.replicator.pipeline.\
SingleThreadStageTask
replicator.stage.thl-to-dbms.extractor=thl-local
replicator.stage.thl-to-dbms.applier=mysql
replicator.stage.thl-to-dbms.filters=mysqlsessions

Replicator roles and pipeline names must correspond exactly. The replicator.role property selects the pipeline that will load and run as shown below. This property must specify the name of a pipeline defined elsewhere in replicator.properties.

replicator.role=master

Stage definitions have specialized properties that need to be set to ensure proper operation. For more information on these properties and examples of use, see the replicator.properties file.

• autoSync - Specialized property used by certain pipelines to allow them to go fully online without an intervening GOING-ONLINE:SYNCHRONIZING state.

• blockCommitRowCount - Maximum number of events to apply at once. This parameter is critical for high performance replication as it allows multiple events to be written persistently on slaves. Effective use typically requires multiple stages with intervening queue stores. See Section 3.2.5.2, “Queue Stores”.

• syncTHLWithExtractor - Controls whether to synchronize the sequence number of the last event applied with the position of the THL store. This is set to false for slave but otherwise events.
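The effect of blockCommitRowCount can be pictured as grouping the event stream into blocks that are each applied and committed together. The Python sketch below shows only that grouping, under the assumption that a final, shorter block is still committed when the stream ends. The function name commit_blocks is hypothetical, and the real replicator may use additional rules for when a block is closed.

```python
from itertools import islice


def commit_blocks(events, block_commit_row_count):
    """Yield lists of at most block_commit_row_count events, in order."""
    if block_commit_row_count < 1:
        raise ValueError("block_commit_row_count must be at least 1")
    iterator = iter(events)
    while True:
        block = list(islice(iterator, block_commit_row_count))
        if not block:
            return
        yield block  # the applier would write these events in one commit


if __name__ == "__main__":
    for block in commit_blocks(range(5), 2):
        print(block)
```

With a count of 1 every event is committed on its own; larger counts trade fewer commits against more work lost if a block fails.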

3.2.5. Stores

Stores hold events in order to transfer them between stages. The store component interface is generic and can include implementations that store events in memory or write them persistently.

3.2.5.1. Transaction History Logs

The most commonly used store is the Transaction History Log, or THL. The THL maintains a persistent list of replication events in serial order, which means that if you apply them to a slave database it will result in the same contents as the master.

THL contents are stored in catalog tables. The catalog location and other relevant THL information is defined in replicator.properties as shown in the following example.

replicator.store.thl=com.continuent.tungsten.replicator.thl.THL
replicator.store.thl.storage=com.continuent.tungsten.replicator.thl.JdbcTHLStorage
replicator.store.thl.url=jdbc:mysql://localhost/tungsten?createDatabaseIfNotExist=true
replicator.store.thl.user=${replicator.global.db.user}
replicator.store.thl.password=${replicator.global.db.password}

The THL storage plug-in is generic and works the same way for all supported database types. There are two storage implementations for the THL: DBMS and disk storage. The implementation class is set as a parameter on the THL configuration in replicator.properties.

Note

Disk THL storage is an enterprise feature. It is not available in community releases of Tungsten Replicator.

DBMS storage persists replication events in the history table. The implementation class is com.continuent.tungsten.replicator.thl.JdbcTHLStorage as seen in the preceding example. Each event is characterized by a sequence number (column seqno), which increases with each new transaction that is added to the THL. Sequence number values are identical across all THL copies, which means that if a master and slave have the same maximum sequence number then the slave's THL is fully up to date with the master.

Disk storage persists replication events in binary files on disk. The history table is not used. The implementation class is com.continuent.tungsten.enterprise.replicator.thl.DiskTHLStorage. Disk storage is optimized for speed and supports very fast read/write operations. In function it is identical to DBMS storage though there are a number of specialized options for configuration. These are treated in detail in Section 4.2, “THL Disk Storage Configuration and Management”.

Warning

The database or schema name for the THL and other catalog tables is given by the replicator.schema property. This name must match the name of the database in the replicator.store.thl.url property.

Note

As of Tungsten Replicator 2.0 the DBMS log is deprecated.

Like other store implementations the THL includes extractor and applier interfaces that are used to integrate with stages. The THL implementation includes a server with a network listener that permits slaves to connect to the THL in order to fetch events across the network. This is the basis for master/slave event transfer. The THL provides the following adapters.

• THLStorageAdapter - Extracts from and applies events to a local THL.

• RemoteTHLExtractor - Extracts events from a THL server listening on a network port. The default port is 2112. This adapter cannot be used as an applier.

3.2.5.2. Queue Stores

Queue stores hold events in memory between stages. They serve as a mechanism to pass events between con-secutive stages. The following example shows the definition of a queue store named queue.

# In-memory storage to buffer events between stages.replicator.store.queue=com.continuent.tungsten.\replicator.storage.InMemoryQueueStorereplicator.store.queue.maxSize=100

The maxSize property defines the size of the queue. The queue accepts values up to this size and blocks thereafter. The size therefore establishes the size of the buffer between subsequent stages. The value is usually the same as the blockCommitRowCount on stages that extract from the store, so that the queue matches the block commit size.
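The blocking behavior described above can be illustrated with a short sketch. This is not Tungsten code: Python's standard queue.Queue stands in for the in-memory queue store, and the names MAX_SIZE, upstream_stage, downstream_stage, and run_pipeline, as well as the end-of-stream sentinel, are assumptions made purely for illustration.

```python
import queue
import threading

MAX_SIZE = 100  # plays the role of replicator.store.queue.maxSize (illustrative only)


def upstream_stage(store, events):
    """Apply events to the store; put() blocks while the store is full."""
    for event in events:
        store.put(event)
    store.put(None)  # sentinel marking end of stream (an assumption of this sketch)


def downstream_stage(store, applied):
    """Extract events from the store until the sentinel arrives."""
    while True:
        event = store.get()
        if event is None:
            break
        applied.append(event)


def run_pipeline(event_count, max_size=MAX_SIZE):
    """Pass event_count events through a bounded store between two stages."""
    store = queue.Queue(maxsize=max_size)
    applied = []
    producer = threading.Thread(target=upstream_stage, args=(store, range(event_count)))
    consumer = threading.Thread(target=downstream_stage, args=(store, applied))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return applied
```

Because the store is bounded, a fast upstream stage can never run more than max_size events ahead of the stage that consumes them.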

Queue stores have a single adapter that works both as extractor and applier.

• InMemoryQueueAdapter - Extracts from and applies events to a queue store.

Tip

Queue stores are critical for achieving high performance, especially when using disk THL storage. The default replicator.properties uses multiple stages with intervening queue stores for both master and slave pipelines.

3.2.6. Extractors

Each stage must have a configured extractor that fetches events for processing. For example, extractors read events from database logs to start replication. In addition, there are extractors to read events locally from the THL as well as to fetch them from across a network.

Class name and properties for the extractor are configured according to the rules for plug-in configuration provided in Section 3.2.1, “The replicator.properties File”. The following example shows extractor property definition for the MySQL extractor.

replicator.extractor.mysql=\
com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractor
replicator.extractor.mysql.binlog_dir=/var/lib/mysql
replicator.extractor.mysql.binlog_file_pattern=mysql-bin
replicator.extractor.mysql.host=localhost
replicator.extractor.mysql.port=3306
replicator.extractor.mysql.user=${replicator.global.db.user}
replicator.extractor.mysql.password=${replicator.global.db.password}
replicator.extractor.mysql.parseStatements=true

Extractor properties vary by extractor type. Extractors are documented in Section C.2, “Extractors”.


3.2.7. Appliers

Each stage must have a configured applier that disposes of events at the end of stage processing. There is a generic applier that applies events to slave databases. However, there are also appliers to store events in the THL as well as other storage implementations.

Class name and properties for the applier are configured according to the rules for plug-in configuration provided in Section 3.2.1, “The replicator.properties File”. The following example shows property definition for the MySQL applier. As with all plug-ins, unused properties may be omitted. They will assume default values.

# MySQL applier properties.
replicator.applier.mysql=\
com.continuent.tungsten.replicator.applier.MySQLApplier
replicator.applier.mysql.host=localhost
replicator.applier.mysql.port=3306
replicator.applier.mysql.url_options=?jdbcCompliantTruncation=false
replicator.applier.mysql.user=${replicator.global.db.user}
replicator.applier.mysql.password=${replicator.global.db.password}

Applier properties vary by applier type. Appliers are documented in Section C.4, “Appliers”.

3.2.8. Filters

Each stage may have zero or more filters. Filters can drop or transform SQL events, which is very handy for a wide variety of replication use cases.

Class name and properties for each filter are set according to the rules for plug-in configuration provided in Section 3.2.1, “The replicator.properties File”. Note that all filters use the same replicator.filter prefix regardless of their actual role.

# Logging filter.
replicator.filter.logger= \
com.continuent.tungsten.replicator.filter.LoggingFilter

# Database transform filter.
replicator.filter.dbtransform= \
com.continuent.tungsten.replicator.filter.DatabaseTransformFilter
replicator.filter.dbtransform.from_regex=foo
replicator.filter.dbtransform.to_regex=bar

Filter properties vary by filter type. Filters are documented fully in Section C.3, “Filters”.

3.3. Replication Catalogs

3.3.1. Tungsten Database Tables

Replication catalogs are database tables that the Tungsten Replicator uses to keep track of replication events and manage the replication process. Catalogs are normally stored in the same database server for which the Tungsten Replicator is handling replication, as shown in the following diagram.


Figure 3.1. Replication Catalogs

Replication catalogs include tables for storing replicated events, consistency checks, and any other information required by the Tungsten Replicator. The Tungsten Replicator creates them automatically at start-up time.

The catalog database is the database that contains the catalog tables. Catalog tables may be stored in any database. By convention they are stored in the tungsten_service database, where the service name is defined in the replicator properties file that defines the service. The catalog database should be different from the database used to store application data.

The catalog database contents must be coordinated with the contents of tables that are currently being replicated or replication will either miss SQL updates or try to apply them twice. When transferring a snapshot to provision a new Tungsten Replicator instance, the catalog tables are normally included.

The Tungsten Replicator catalog tables are described fully in Appendix A, Tungsten Replicator Catalogs.

Tip

For databases other than MySQL you may need to create the catalog database beforehand when first setting up the master. Once the database is correctly set up it can be replicated to slaves along with application databases.

3.3.2. Purging the Transactional History Log

The Tungsten Replicator does not automatically purge records from the history catalog table, which comprises the THL when using DBMS logging. This table therefore grows without bound unless manually truncated.

Note

Tungsten Enterprise includes a procedure to purge THL tables automatically. Look in directory tungsten-replicator/samples/scripts/cronjobs. This procedure should always be deployed for production deployments.

When purging THL records, it is important to avoid deleting rows corresponding to SQL events that have yet to be replicated to one or more slaves. This applies obviously to the current master database. However, it also applies to slaves used for failover. It is important to avoid a situation where a slave is promoted to master but does not have events required by other slaves that happen to be lagging behind.

The following steps describe a safe procedure for checking and truncating THL records.

1. Find the current THL high water mark on each slave database. The high water mark is the highest event sequence number in each THL. Log in to each slave and issue the following thl command:

$ thl info

2. Select the lowest high water mark value. For example, suppose there are two slaves and one has a high water mark of 6461 and the other has a high water mark of 7730. 6461 is the lowest value so we select this.

3. Truncate the master and slave THL tables to eliminate all records with sequence numbers less than the current low value. You can do this as follows using the thl utility.

$ bin/thl purge -high 6460
WARNING: The purge command will break replication if you delete all events or delete events that have not reached all slaves.
Are you sure you wish to delete 101 events [y/N]? y
Deleting events where SEQ# <=6460
Deleted events: 4424

You can also perform this operation directly using SQL as shown below.

DELETE from history WHERE seqno < 6461;

Warning

Do not delete all records in the THL or replication restart may fail. You should always leave at least one record. MySQL users should also beware of locking problems with InnoDB. See Section 2.2.6, “MySQL Database Housekeeping” for more information.
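The arithmetic in steps 2 and 3 can be sketched as follows. This is an illustrative helper, not part of Tungsten Replicator: the function name purge_bound and the host names are hypothetical, and the high water marks are assumed to have been collected by hand from each slave.

```python
def purge_bound(slave_high_water_marks):
    """Return the highest seqno that can be purged (the value for `thl purge -high`).

    slave_high_water_marks maps a slave name to the highest seqno in its THL.
    """
    if not slave_high_water_marks:
        raise ValueError("need the high water mark of at least one slave")
    lowest = min(slave_high_water_marks.values())
    # Keep the record at `lowest` itself so that the THL is never emptied.
    return lowest - 1


marks = {"slave1": 6461, "slave2": 7730}  # hypothetical host names
bound = purge_bound(marks)
print(f"bin/thl purge -high {bound}")
print(f"DELETE FROM history WHERE seqno < {bound + 1};")
```

With the values from the example above, the sketch reproduces the purge bound of 6460 and the equivalent SQL condition seqno < 6461.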

3.4. Backup and Restore

3.4.1. Overview of Backups and Backup Storage

Tungsten Replicator provides a built-in facility to back up and restore databases using simple commands. The backup facility uses two specialized types of plugins.

Backup agents implement the backup and restore operations using commands that are specific for a particular database type. The result of a backup is a file that can be restored. A restore operation takes a backup file and restores it to the database. There may be multiple backup agents for a particular file type.


Storage agents manage backup files: storing, retrieving, and purging files that are no longer needed. There is a small number of storage agents that serve the needs of many different kinds of backups. The default storage mechanism is file-based storage on a shared disk that is visible to all hosts using Tungsten Replicator.

Backups are an optional feature of Tungsten Replicator and do not need to be configured if you do not need them. If you do not enable at least one backup agent and at least one storage agent, the backup and restore commands will be disabled.

3.4.2. Backup Configuration

Backup configuration is controlled by replicator.properties. Backup and storage agent plug-ins follow the same general configuration pattern described in Section 3.2.1, “The replicator.properties File”. Here are the main steps to observe when setting configuration values.

1. List one or more backup agents using the replicator.backup.agents property and configure the corresponding agent properties. Backup agent settings are specific to the type of database and backup/restore mechanism.

2. Select a default backup agent using the replicator.backup.default property. This agent is used if you do not specify one on the backup command.

3. In like fashion select one or more storage agents using the replicator.storage.agents property, configure properties, and select a default using the replicator.storage.default property. All storage agents have a retention property. This determines the maximum number of backups retained until old backups are deleted.

Tip

When testing replication (for example before deploying into production) it is often useful to define two storage agents that use different locations. The default agent is used for normal backup and restore. The other agent can hold a base backup to restore system state quickly at the beginning of tests.

Note

Backup properties are not dynamic. If you make changes, you must reread replicator.properties either by using trepctl configure or by restarting the process.

3.4.3. Running Backup and Restore Commands

Before running a backup Tungsten Replicator must be in the OFFLINE state. Depending on the backup agent used, the database may also need to be fully quiesced with no active transactions.

The trepctl command has options to run backup and restore tasks. The following example runs a backup task using the default backup agent, stores it using the default storage agent, and prints the URI when the backup completes.

$ trepctl backup
Backup completed successfully; \
URI=storage://file-system/store-0000000013.properties
State: OFFLINE:NORMAL


The following example selects specific backup and storage agents and waits up to 15 minutes (900 seconds) before returning. If the backup is not done before the timeout expires the URI is not printed. Instead, you must check the log to find the backup URI.

$ trepctl backup -backup mysqldump -storage fs -limit 900
Backup is pending; check log for status
State: OFFLINE:BACKUP

To restore data, use the trepctl restore command. As with backups, Tungsten Replicator must be in the OFFLINE state to run a restore command.

The following example runs a restore task using the latest backup stored with the default storage agent. Tungsten Replicator automatically determines the backup agent to use by reading the backup metadata and calling the correct agent.

$ trepctl restore
Restore completed successfully
State: OFFLINE:NORMAL

Similarly, the following example restores a specific backup by specifying its URI. We also wait up to 20 minutes before returning. As with backups, if the restore is not done before the timeout expires you will not know if the restore succeeded. Instead, you must check the replicator log.

$ trepctl restore -uri storage://file-system/store-0000000013.properties -limit 1200
Restore is pending; check log for status
State: OFFLINE:RESTORE

Tip

Both backup and restore operations return Tungsten Replicator to the OFFLINE:NORMAL state if they succeed. In the event of an error Tungsten Replicator goes into the OFFLINE:ERROR state, which means that failures are easy to detect.

3.4.4. Storage Organization and Management

Storage agents store three types of information. The first is the backup files themselves. The second is metadata describing each file, for example its size, the date on which it was generated, and the backup agent that produced it. Finally, there is information used to manage the store itself, such as the index number to assign the next backup.

Each storage agent may have a slightly different form of organization. The default file system storage agent provided by Tungsten Replicator stores backups and metadata on the file system. The following listing shows typical storage contents.

storage.index
store-0000000012-mysqldump-6450242716624177955sql
store-0000000012.properties
store-0000000013-mysqldump-8628772972175094324sql
store-0000000013.properties
store-0000000014-mysqldump-6633515753745331096sql
store-0000000014.properties

The storage.index file has the index used to generate new backup numbers. Each backup consists of the backup file itself plus a properties file with metadata. You can look at but should not edit property files.

You can purge backups by deleting the backup file and the matching properties file. You can re-initialize storage fully by removing all files including storage.index.
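As a rough illustration of manual purging, the sketch below pairs each backup file with its properties file and lists the files that belong to backups older than the newest few. It only computes file names and deletes nothing. The helper names group_backups and files_to_purge are hypothetical, and the assumption that the index is the ten-digit number after "store-" is inferred from the listing above rather than taken from the storage agent implementation.

```python
import re
from collections import defaultdict

# Assumption: the backup index is the 10-digit number following "store-".
STORE_FILE = re.compile(r"^store-(\d{10})")


def group_backups(filenames):
    """Map each backup index to the files that belong to that backup."""
    groups = defaultdict(list)
    for name in filenames:
        match = STORE_FILE.match(name)
        if match:  # storage.index and unrelated files are skipped
            groups[int(match.group(1))].append(name)
    return dict(groups)


def files_to_purge(filenames, retention):
    """Return the files of all backups older than the newest `retention` backups."""
    if retention < 1:
        raise ValueError("retention must be at least 1")
    groups = group_backups(filenames)
    old_indexes = sorted(groups)[:-retention]
    return sorted(name for index in old_indexes for name in groups[index])


listing = [
    "storage.index",
    "store-0000000012-mysqldump-6450242716624177955sql",
    "store-0000000012.properties",
    "store-0000000013-mysqldump-8628772972175094324sql",
    "store-0000000013.properties",
    "store-0000000014-mysqldump-6633515753745331096sql",
    "store-0000000014.properties",
]
for name in files_to_purge(listing, retention=2):
    print(name)
```

Applied to the listing above with a retention of 2, the sketch selects only the two files of backup 12. Treating the backup file and its properties file as a unit avoids leaving orphaned metadata behind.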

Tip

You can increase the file storage availability by backing it up regularly to another file system using tools like rsync.

3.4.5. Creating a Backup When Starting the Replicator

You can configure the replicator to automatically back up the database when the replicator is started and before it is taken online. In order to enable this you must configure your cluster to support backups and add a line to the dynamic.properties file.

replicator.auto_backup=true

The next time that the replicator is started, it will run a backup using your default backup method as configured in replicator.properties. The dynamic.properties file will then be updated to turn off the auto_backup flag, which ensures that the dataserver is not backed up again when the replicator is restarted.

Note

You must wait for the backup files to be written to the backup directory before provisioning any new slaves with the backup.

3.5. Master Failover

Master failover is the process of switching from an existing master to a new one. Failover is not automatic. Instead, users must execute a series of commands to configure a new master and point slaves to the master. The following procedure describes how to perform a planned failover.

Note

Commands shown in this section correspond to Unix conventions. Windows commands are analogous but use the Windows scripts.

1. Quiesce master database applications so that there are no transactions in progress. Use flush to ensure the master database state is synchronized with the log and collect the last sequence number in the transaction history log. Wait for the slave that is being promoted to master to catch up to that sequence number.


$ trepctl -host prodmast1 flush
Master log is synchronized with database at sequence number: 376
State: ONLINE:MASTER
$ trepctl -host prodslave4 wait -applied 376

2. Send master and slave to the OFFLINE state.

trepctl -host prodmast1 offline
trepctl -host prodslave4 offline

3. Configure the old slave to be a master by setting its remote THL URI to point to the same host.

trepctl -host prodslave4 setrole -role master
trepctl -host prodslave4 online

Note

For instructions on bringing the old master back online, see Section 9.6.2, “Repairing a Failed Master”.

4. On the remaining slaves, run the commands below.

trepctl setrole -role slave value thl://prodslave4/
trepctl online

Warning

If you are switching masters due to a master failure, do not enable the old master as a slave, as this can lead to data inconsistency errors due to transactions on the master that were not replicated to any slave. See Section 9.6.2, “Repairing a Failed Master” for more information. A future release of Tungsten Replicator will provide mechanisms to identify automatically when a master can safely be reused as a slave.
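When scripting step 1 of the procedure, the sequence number has to be carried from the flush output into the wait command. The following sketch shows one way to do that. It assumes the flush output has exactly the format shown in the example above; the helper names parse_flush_seqno and wait_command are hypothetical, and the sketch only builds the command line rather than running trepctl.

```python
import re

# Assumption: the flush output ends its first line with "sequence number: <n>".
FLUSH_PATTERN = re.compile(r"sequence number:\s*(\d+)")


def parse_flush_seqno(flush_output):
    """Extract the sequence number reported by `trepctl flush`."""
    match = FLUSH_PATTERN.search(flush_output)
    if match is None:
        raise ValueError("no sequence number found in flush output")
    return int(match.group(1))


def wait_command(slave_host, seqno):
    """Build the argument list that waits for a slave to apply seqno."""
    return ["trepctl", "-host", slave_host, "wait", "-applied", str(seqno)]


output = (
    "Master log is synchronized with database at sequence number: 376\n"
    "State: ONLINE:MASTER\n"
)
seqno = parse_flush_seqno(output)
print(" ".join(wait_command("prodslave4", seqno)))
```

Failing loudly when no sequence number is found is deliberate: continuing the failover with a guessed value could promote a slave that has not caught up.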

3.6. Provisioning New Slaves

3.6.1. Overview

You can add new slaves to a replication configuration at any time. The provisioning procedure has the following steps.

1. Install and configure. Install software and configure the replicator.properties file. Ensure that the database to which you are replicating is running and has proper accounts set up.

2. Synchronize slave state. Load the slave with an initial copy of data that includes Tungsten Replicator catalogs. The initial copy must be transactionally consistent. This means that the data in the copy must match the sequence numbers recorded in the catalog tables, so that when replication starts it will start applying events at a point that correctly matches current data. The copy must be new enough so that the master can start replication at the slave's current position in the THL.

3. Enable replication. Issue commands to configure the slave and bring it into the ONLINE state.


3.6.2. Procedure for Automatically Provisioning New Slaves

You can configure the replicator to automatically provision the database from the most recent backup when the replicator is started. In order to enable this you must configure your cluster to support backups and add a line to the dynamic.properties file.

replicator.auto_provision=true

The next time that the replicator is started, it will run a restore using your default backup method as configured in replicator.properties. The dynamic.properties file will then be updated to turn off the auto_provision flag, which ensures that the dataserver is not restored again when the replicator is restarted.

If you have not configured backup methods or if a backup is not available, the replicator will go into an error state. If you have backups configured but no backup was available, you can attempt to restore again once you have created a backup using the master or another slave.

3.6.3. Procedure for Manually Provisioning New Slaves

The first time you provision a slave it will be necessary to stop the master completely in order to create a consistent data dump. However, once you have at least one slave available you can use a slave for provisioning instead. We call this kind of slave a donor slave. Donor slaves eliminate the need to stop the master.

The following diagram illustrates use of a donor slave to provision another slave without stopping the master.

Figure 3.2. Provisioning a New Slave from a Donor

Synchronizing data requires database-specific dump and load commands. The following procedure is generic and assumes that you have backup properly configured as described in Section 3.4, “Backup and Restore”. You can also substitute a backup mechanism of your own. The host names are donor and recipient respectively.


1. Command the donor slave to the OFFLINE state, back it up, and bring it back online again. If you have properly configured backups this looks like the following:

trepctl -host donor offline
trepctl -host donor backup
trepctl -host donor online

If backups are not configured, you would substitute appropriate database dump commands for trepctl backup. Once this step is complete the donor is not needed further.

Warning

For this procedure to work properly in all cases you must ensure that the backup is transactionally consistent, that is, that there are no writes going to the database. This is not normally an issue with slaves but requires quiescing applications if the donor is a master.

2. Provision the new slave by starting and loading the backup. Again if backups are configured this is a relatively simple procedure as shown below.

trepstart
trepctl -host recipient restore
trepctl -host recipient online

If backups are not configured, you would substitute appropriate database restore commands. These must run before you start the replicator and bring it online.

3.7. Consistency Checking

3.7.1. Overview

Tungsten Replicator offers a built-in consistency checking facility that makes it easy to run a consistency check on part or all of a table. The consistency check runs a checksum on the table on the master and then repeats the same checksum on the slave. If the checksum fails, it can generate either a warning or an error that sends the replicator to the OFFLINE:ERROR state.

Table consistency checks work by generating a special consistency check event that is replicated between the master and slave. When the event arrives on the slave Tungsten Replicator recomputes the checksum and compares the results with the previous values for the master. The check executes in serial order with other SQL updates, which means that the slaves should be in the same state as the master when the event runs. As a result, it is not necessary to stop the slave; also the comparison can be incremental.

Warning

Tables must have a primary key defined for consistency checks to work. Tungsten Replicator uses the primary key to order rows in order to generate consistent results on successive invocations. Checksums on tables without a primary key will fail.
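The role of the primary key can be illustrated with a small sketch. This is not the checksum algorithm used by Tungsten Replicator: the function table_checksum, the in-memory row tuples, and the use of repr() to serialize rows are all assumptions chosen to show why a stable row order is needed for two copies of a table to produce the same MD5 value.

```python
import hashlib


def table_checksum(rows, key_index=0, offset=0, limit=None):
    """MD5 over rows sorted by primary key; offset and limit select a slice."""
    ordered = sorted(rows, key=lambda row: row[key_index])
    end = None if limit is None else offset + limit
    digest = hashlib.md5()
    for row in ordered[offset:end]:
        # repr() is an illustrative serialization, not the replicator's format.
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


master_rows = [(1, "alice"), (2, "bob"), (3, "carol")]
slave_rows = [(3, "carol"), (1, "alice"), (2, "bob")]  # same data, different physical order
diverged_rows = [(1, "alice"), (2, "bobby"), (3, "carol")]

print(table_checksum(master_rows) == table_checksum(slave_rows))
print(table_checksum(master_rows) == table_checksum(diverged_rows))
```

Sorting by key before hashing makes the result independent of physical row order, so the first comparison matches and only a real data difference, as in the second comparison, changes the checksum.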

3.7.2. Invoking Consistency Checks

All consistency checks must be initiated from the host serving as the master.


To run a consistency check over a table, use the trepctl command as shown in the following example. In this case we are checking consistency on table orders in database sample.

trepctl check sample.orders

To check a part of the table, use the -limit flag. The limit parameters are the offset, which is the row on which to start, and the limit, which is the number of rows to check. The following example shows a check that starts on the 100th row and proceeds for 100 rows. (Row numbering starts at 0, not 1.)

trepctl check sample.orders -limit 99,100
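Because row numbering starts at 0, the offset is always one less than the ordinal position of the first row to check. The helper below, whose name limit_argument is hypothetical, sketches that conversion; it only formats the argument and does not invoke trepctl.

```python
def limit_argument(start_row, row_count):
    """Build the -limit value from a 1-based starting row and a row count."""
    if start_row < 1 or row_count < 1:
        raise ValueError("start_row and row_count must be positive")
    offset = start_row - 1  # row numbering starts at 0, not 1
    return f"{offset},{row_count}"


print("trepctl check sample.orders -limit " + limit_argument(100, 100))
```

For the 100th row and 100 rows this yields the same 99,100 value used in the example above.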

There is also a -method flag, but for the time being the only value that is accepted is 'md5'. This flag is therefore typically omitted.

3.7.3. Configuration

Tungsten Replicator provides properties to control the behavior of consistency checks. The following example shows these properties.

# How to react on consistency check failure. Possible values are 'stop' or 'warn'.
replicator.applier.consistency_policy=stop

# Should consistency check be sensitive to column names and/or types? Settings
# on a slave must be identical to master's. Values are 'true' or 'false'.
replicator.applier.consistency_column_names=true
replicator.applier.consistency_column_types=true

The consistency_policy controls the slave response to a failed checksum. The slave can either fail or print a warning when a consistency check fails. The following example illustrates configuration.

# How to react on consistency check failure. Possible values are 'stop' or
# 'warn'.
replicator.applier.consistency_policy=stop

There are also two additional parameters that are helpful for consistency checking across different database types.

• consistency_column_names - If true, the consistency check ignores differences in the column name case. For example, a column named "address" and a column named ADDRESS will be considered equivalent.

• consistency_column_types - If true, the consistency check ignores differences in column types. For example, this would allow integer and long values to compare.

3.8. Replicator Monitoring and Management APIs

Tungsten Replicator implements capable management and monitoring interfaces through JMX. JMX is a standard management API that allows Java processes to expose monitoring data, management commands, and notifications of state changes to external clients. For more information on JMX itself look at the Sun documentation, for example http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/.


3.8.1. JMX/MBean Interface Architecture

Tungsten Replicator is designed to be a relatively simple network service. It does not attempt to make decisions about database state or failover, nor does it implement overall management procedures. Instead, Tungsten Replicator exposes JMX MBeans, which serve as an interface for other management tools. These tools use information from the MBean interfaces as well as operations to implement higher-level management procedures.

The following diagram depicts the management architecture.

Figure 3.3. High-Level Management Architecture

Tungsten Replicator MBean interfaces expose only standard Java types. Clients do not need to include replicator libraries to invoke and use the replicator MBeans. This means that generic management tools like jconsole can easily connect to and manage Tungsten Replicator.

3.8.2. Basic JMX/MBeans

Tungsten Replicator separates management and monitoring statistics into separate MBean interfaces.

Tip

Tungsten Replicator JMX interfaces often change, though we make every effort to ensure upward compatibility for clients. The Javadoc pages in binary builds or the Java interfaces in source code are the final reference for MBean interface behavior.

3.8.2.1. ReplicatorManagerMBean - Replicator Management

The ReplicatorManagerMBean is the principal management interface for Tungsten Replicator. Management operations like going online, rereading configuration, and stopping the replicator use this interface.

• Attributes - Values include the current state of the replicator (for example, OFFLINE:NORMAL) and pending error conditions, if any.

• Operations - Operations correspond to all management operations that affect or query Tungsten Replicator state.


• Notifications - Notifications include state changes and error conditions.

For detailed documentation of ReplicatorManagerMBean interfaces, refer to the Javadoc page for interface com.continuent.tungsten.replicator.ReplicatorManagerMBean.

3.8.2.2. ReplicatorMonitorMBean - Replicator Monitoring

The ReplicatorMonitorMBean exposes replicator statistics. Management operations that examine Tungsten Replicator performance and throughput use this interface.

• Attributes - Values include key statistics and monitoring data for extraction, receipt, and application of SQL events. Note that replicators in the MASTER state will update extraction statistics, whereas replicators in the SLAVE state will update event receipt and application statistics. To get a full picture of statistics you need to look at both master and slave replicators.

• Operations - ReplicatorMonitorMBean has a single operation to reset monitoring counters.

• Notifications - ReplicatorMonitorMBean does not currently generate notifications.

For detailed documentation of ReplicatorMonitorMBean interfaces, refer to the Javadoc page for interface com.continuent.tungsten.replicator.conf.ReplicatorMonitorMBean.

3.8.3. JMX Clients for Tungsten Replicator

Any client that can manipulate JMX interfaces can manage and monitor Tungsten Replicator.

3.8.3.1. Tungsten Replicator trepctl Client

The trepctl client distributed with the replicator is a JMX client. It uses JMX interfaces to obtain replicator data and to implement management operations. If you can see an attribute or perform an operation through trepctl you can do the same using JMX directly.

3.8.3.2. Java jconsole Client

The Sun JDK distributes jconsole, a very flexible JMX client. All Tungsten Replicator attributes, operations, and notifications are accessible through jconsole. The jconsole also shows extensive Java Virtual Machine monitoring data, which makes it an invaluable tool for monitoring.

The easiest way to connect with jconsole is to start jconsole on the same host as the replicator. You should see the Tungsten Replicator listed in the local connections when jconsole starts up. You can also connect to remote replicator processes by specifying the host name and port.

3.8.3.3. Custom JMX Client

It is relatively easy to write a custom JMX management agent. Consult the Sun JMX documentation for advice on writing your own client. When you start to write the code itself, look at class com.continuent.tungsten.replicator.ReplicatorManagerCtrl in the Tungsten Replicator source code.

Note

Tungsten Replicator depends on utility code from the Tungsten Commons project to connect to JMX. This code is downloadable from the Continuent Community Site.


Chapter 4. Advanced Principles of Operation

This chapter provides guidance on approaches for using Tungsten Replicator as part of production solutions. The techniques described here assume familiarity with Tungsten Replicator as described in Section 3.1, “The Tungsten Replicator Process”.

4.1. Specialized Pipeline Extensions

Tungsten Replicator is configured with default master and slave pipelines as discussed in Section 3.2.4, “Pipelines and Stages”. However, pipelines enable a number of useful extensions and special applications beyond simple master/slave replication. The following subsections provide additional tips on pipeline usage.

4.1.1. Dummy Replication

Dummy replication extracts events from the queue using a single-stage pipeline and immediately discards them. It is a common test configuration to ensure Tungsten Replicator can properly extract events from database logs.

Tip

Dummy replication is non-intrusive and can run against production database servers. It is highly recommended as an initial test to ensure there are no issues with starting Tungsten processes and extracting events from database logs.

A pre-configured dummy replication pipeline is provided as part of the standard MySQL configuration. Follow the steps shown below to enable dummy replication.

Note

You can also select dummy replication pipelines as a replication service role during normal installation.

1. Installation. Install Tungsten normally on the host containing a master database that is receiving updates. MySQL binlogs must be enabled. Do not start services.

2. Post-configuration setup. Edit the replicator configuration in tungsten-replicator/conf/replicator.properties. Change the replicator role to dummy as shown in the following example. Also, auto-enable the replicator to ensure that it goes online automatically.

# Replicator role. Uncomment one of the choices of master or slave.
# There is no default for this value--it must be set or the replicator
# will not go online.
replicator.role=dummy
...
# Replicator auto-enable. If true, replicator automatically goes online
# at start-up time.
replicator.auto_enable=true

3. Start-up. Start the replicator process. It will automatically go online and begin extraction.

tungsten-replicator/bin/replicator start

4. Monitoring. Ensure that replication is working by checking the status of the replicator using trepctl status. The sequence number should increment with each transaction on the database, reported latency should be at or close to 0, and there should be no errors in the replicator log.

tungsten-replicator/bin/trepctl status

If you encounter errors, you should correct them and restart the replicator process. Each time the replicator goes online the sequence number will restart at 0. Repeat this process until the replicator runs without errors.
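The monitoring check in step 4 can be scripted. The following is a minimal sketch, assuming that trepctl status prints one "name : value" pair per line and that the relevant fields are called appliedLastSeqno, appliedLatency, and state. Those field names, the ONLINE state value, and the sample text are assumptions made for illustration, so compare them with the output of your own installation before relying on them.

```python
def parse_status(text):
    """Collect 'name : value' lines from status output into a dict."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        # Skip banner lines and anything that is not a single-word field name.
        if sep and name and " " not in name:
            fields[name] = value.strip()
    return fields


def looks_healthy(fields, max_latency=1.0):
    """Return True if the replicator is online and latency is near zero."""
    # Field names are assumptions; adjust them to match your trepctl output.
    online = fields.get("state") == "ONLINE"
    latency = float(fields.get("appliedLatency", "inf"))
    return online and latency <= max_latency


# Hypothetical sample output, for illustration only.
SAMPLE = """\
appliedLastSeqno : 42
appliedLatency   : 0.3
state            : ONLINE
"""

status = parse_status(SAMPLE)
print(status["appliedLastSeqno"], looks_healthy(status))
```

Running the check twice and comparing the two sequence numbers is a simple way to confirm that the number increments as transactions arrive.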

Tip

One variation on dummy replication for MySQL is to enable relay logs on the extractor, which allows the replicator to run completely unintrusively on a separate host from the master database. See Section 4.3.3, “Enabling Relay Log Extraction” for more information.

4.1.2. Direct Replication

Direct replication extracts events from the master using a two-stage pipeline with an intervening queue and applies them directly to a slave. Direct replication is useful for extracting and applying a specific set of events from a master to a slave without intervening logging.

Tip

You can use a direct replication pipeline to implement a restartable upgrade. You can also use direct replication to recover "trapped" data (that is, not replicated to one or more slaves) due to master failure.

A pre-configured direct replication pipeline is provided as part of the standard MySQL configuration. Follow the steps shown below to enable direct replication and use it to replicate a specific set of events between services.

1. Installation. Install Tungsten normally on the host containing a master database that is receiving updates. MySQL binlogs must be enabled. Do not start services.

Note

In direct replication there is only one replicator which both extracts and applies events. You can also select direct replication pipelines as a replication service role during normal installation.

2. Post-configuration setup. Edit the replicator configuration in tungsten-replicator/conf/replicator.properties. Change the replicator role to direct as shown in the following example. Also, turn off the replication auto-enable feature. This is necessary to start the replicator at a particular start point.

# Replicator role. Uncomment one of the choices of master or slave.
# There is no default for this value--it must be set or the replicator
# will not go online.
replicator.role=direct
...
# Replicator auto-enable. If true, replicator automatically goes online
# at start-up time.
replicator.auto_enable=false

3. Start-up. Choose a starting point for replication and bring the replicator online. If the master is idle (that is, not receiving updates) you can just start at the current point in the database log. However, with direct replication it is common to pick a specific starting point as shown in the following example.

tungsten-replicator/bin/replicator start
tungsten-replicator/bin/trepctl online -from-event mysql-bin.004095:511332596

This will replicate everything in the database log from the given native event ID. You can also replicate between a specific range of events using the following trepctl online -event syntax, which allows you to replicate up to a particular event.

tungsten-replicator/bin/trepctl online -from-event mysql-bin.004095:511332596 \
    -event mysql-bin.004095:512983221

4. Monitoring. Ensure that the direct pipeline is working by checking the status of the replicator using trepctl status. The sequence number should increment with each transaction on the database, reported latency should be at or close to 0, and there should be no errors in the replicator log.

tungsten-replicator/bin/trepctl status

If you encounter errors, you should correct them and restart the replicator process. With direct replication Tungsten Replicator remembers the restart point and will start up again from where the error occurred.

Tip

As with dummy replication on MySQL you can enable relay logs on the extractor. See Section 4.3.3, “Enabling Relay Log Extraction” for more information.

Warning

Direct replication only works if databases are properly synchronized before starting. Failure to synchronize master and slave databases can result in replicator crashes or data corruption.
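A mistyped event ID in the start-up step is easy to make and hard to spot. The following is a minimal sketch of a helper that validates native MySQL event IDs and assembles the trepctl online argument list described in step 3. The function name, the regular expression, and the default trepctl path are illustrative assumptions. The option names are taken from the examples in this section, written with an ASCII hyphen.

```python
import re

# Native MySQL event ID: binlog file name, a colon, then the byte offset.
EVENT_ID = re.compile(r"^[\w.-]+\.\d+:\d+$")


def online_command(from_event, to_event=None,
                   trepctl="tungsten-replicator/bin/trepctl"):
    """Return the argument list for 'trepctl online' over an event range."""
    for event in (from_event, to_event):
        if event is not None and not EVENT_ID.match(event):
            raise ValueError("not a binlog event ID: %r" % event)
    args = [trepctl, "online", "-from-event", from_event]
    if to_event is not None:
        # Upper bound of the range, using the option shown in step 3.
        args += ["-event", to_event]
    return args


print(" ".join(online_command("mysql-bin.004095:511332596",
                              "mysql-bin.004095:512983221")))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems.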

4.1.3. MySQL to PostgreSQL/Greenplum Replication

This section describes how to configure heterogeneous replication from MySQL to PostgreSQL or from MySQL to Greenplum (a variant of PostgreSQL designed for data warehousing and analytics).

Note

Replication to PostgreSQL/Greenplum is a Tungsten Enterprise feature. It is not fully available in the community releases of Tungsten Replicator.

Before you start, ensure that you have read and understood the contents in:

• Section 3.2, “Replicator Configuration”

• Section C.1, “Transaction History Log (THL) Storage”

• Section C.2.1, “MySQL Extractor”

To configure this heterogeneous replication, proceed as follows.

Note

The configuration of a Greenplum slave has a few differences from the configuration of a PostgreSQL slave. These differences are described inline below.

1. Use row replication for DML.

For heterogeneous replication to succeed, enable and use only row replication on the MySQL master. With statement replication, DML statements are saved as textual SQL strings, which exposes the large dialect differences between the MySQL and PostgreSQL DBMS types.

Furthermore, row replication records the structure of the changes that DML statements make to rows, allowing Tungsten Replicator to extract these structures and apply them to the PostgreSQL/Greenplum server as generic ones.

Important

Replicating DDL

DDL statements are database type specific. Successful replication of your application specific MySQL DDL statements to PostgreSQL/Greenplum requires extensive customization of the pgddl.js filter.

Even if row replication is enabled, DDL statements are saved as textual SQL strings. Generic translation of the MySQL DDL dialect to the PostgreSQL/Greenplum DDL dialect is a vast and complex task. Therefore, if your application only generates DDL during the first setup and upgrades, we recommend not replicating DDL at all. Just prepare the slave with the same schema structure manually and replicate the DML statements into it.

If the slave schema is not consistent with the master schema, you will receive errors when an incoming event tries to access a table or a column which does not exist:

com.continuent.tungsten.replicator.applier.ApplierException:\
org.postgresql.util.PSQLException: ERROR: column "customertype" does not exist
com.continuent.tungsten.replicator.applier.ApplierException:\
org.postgresql.util.PSQLException: ERROR: relation "customers" does not exist

Nevertheless, sometimes application generated DDL statements are enumerable. If you know the nature of your application, you can consider adding custom DDL translators. For more information, see the pgddl.js filter in step 4, “Prepare the filters”, below.

2. Prepare the JDBC driver.

For a PostgreSQL slave, ensure that the JDBC properties are set as follows:

replicator.resourceJdbcUrl=jdbc:postgresql://pghost:5432/test/
replicator.resourceJdbcDriver=org.postgresql.Driver
replicator.resourceVendor=postgresql

For a Greenplum slave, the resourceVendor must be changed as follows:

replicator.resourceJdbcUrl=jdbc:postgresql://gphost:5432/test/
replicator.resourceJdbcDriver=org.postgresql.Driver
replicator.resourceVendor=greenplum

3. Prepare the applier.

Both PostgreSQL and Greenplum use the same PostgreSQLApplier. See below for an example of an applier executing writes into the test database:

# PostgreSQL applier.
replicator.applier.postgresql=com.continuent.tungsten.\
replicator.applier.PostgreSQLApplier
replicator.applier.postgresql.url=jdbc:postgresql://pghost:5432/test/
replicator.applier.postgresql.user=${replicator.global.db.user}
replicator.applier.postgresql.password=${replicator.global.db.password}

It is important to specify the correct database in the url parameter. Databases and schemas in MySQL are essentially the same, while in PostgreSQL/Greenplum one database may contain many schemas. Thus, on the slave, there must be one database prepared where all the writes will be applied.

Note

The database user must have all the privileges to write to the tables that are touched by the replication stream, including the tungsten schema. In other words, the database user must have enough permissions to cope with anything that comes in from the MySQL master. Thus, we recommend using the postgres user (for PostgreSQL) and gpadmin (for Greenplum) for testing purposes. This makes the setup task easier.

4. Prepare the filters.

If DDL statements are going to be replicated, the filter below must be enabled and extended on a case by case basis:

replicator.filter.pgddl=com.continuent.tungsten.\
replicator.filter.JavaScriptFilter
replicator.filter.pgddl.script=${replicator.home.dir}\
/samples/scripts/javascript-advanced/pgddl.js

For replicating into the Greenplum server, the UPDATE events must be transformed to only update the columns in a row that were actually changed. (By default, the MySQL binary log saves updates to all the columns, including keys, regardless of whether they were changed.) The filter below handles this:

replicator.filter.optimizeupdates=com.continuent.tungsten.\
enterprise.replicator.filter.OptimizeUpdatesFilter

This filter is required. Otherwise the applier tries to update one of the columns that are part of the Distribution Key, which is not allowed in Greenplum. If OptimizeUpdatesFilter is not enabled, you will receive the following error on the very first incoming UPDATE event:

Cannot parallelize an UPDATE statement that updates the distribution columns

Note

You can receive this error even if OptimizeUpdatesFilter is enabled. This would indicate a statement which tries to UPDATE the Distribution Key’s column. In this case:

a. Fix the application in a way that it does not update the Distribution Key's column, or

b. Add a dummy column to the problematic table and use that column as a new Distribution Key. This is a workaround solution and you must understand all its possible effects.

You can also use the dbselector.js filter if you must only replicate one specific MySQL database (schema):

replicator.filter.dbselector=com.continuent.tungsten.\
replicator.filter.JavaScriptFilter
replicator.filter.dbselector.script=${replicator.home.dir}\
/samples/extensions/javascript/dbselector.js
replicator.filter.dbselector.db=dbtoreplicate

Do not forget to enable these filters for the specific pipeline, which is actually applying events to the DBMS. For example:

replicator.stage.q-to-postgresql.filters=pgddl,optimizeupdates

5. Prepare the THL.

The following configuration is an example of how to prepare a DBMS based THL on the PostgreSQL/Greenplum slave:

replicator.store.thl=com.continuent.tungsten.replicator.thl.THL
replicator.store.thl.storage=com.continuent.tungsten.\
replicator.thl.JdbcTHLStorage
replicator.store.thl.url=jdbc:postgresql://greenplum1.lab:5432/test/
replicator.store.thl.user=${replicator.global.db.user}
replicator.store.thl.password=${replicator.global.db.password}

Again, the correct database specification in the url parameter is important.

6. Prepare the tungsten schema for Greenplum.

Currently, the tungsten schema must be manually modified for Greenplum after it is created. First, ensure that the replicator does not go online automatically after it is started:

replicator.auto_enable=false

Next, after the first start, run the prepare-greenplum-slave script, which will modify the tungsten schema. You must specify the database, which contains the tungsten schema, as the first parameter. For example:

./cluster-home/bin/prepare-greenplum-slave test

This must only be done once after the initial setup.

7. You can now put Tungsten Replicator online with the following command:

./trepctl online

Tip

One way of setting up the MySQL to PostgreSQL/Greenplum replication is to use relay logs on the extractor. This allows a single replicator instance on the slave PostgreSQL/Greenplum host to cover the whole process. For more information, see Section 4.3.3, “Enabling Relay Log Extraction”.

However, if you use relay log replication, the built-in consistency checks will be unavailable, as there is no Replicator on the master. For more information, see Section 3.7, “Consistency Checking”.
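The UPDATE transformation described in step 4 can be illustrated with a short sketch. This is not the OptimizeUpdatesFilter implementation, which is not shown in this guide. It is a hypothetical Python model of the idea, assuming a row change is available as "before" and "after" dictionaries of column values. The function and column names are invented for illustration.

```python
def changed_columns(before, after, key_columns):
    """Return only the columns of an UPDATE whose values actually changed."""
    changes = {name: value for name, value in after.items()
               if before.get(name) != value}
    touched_keys = sorted(set(changes) & set(key_columns))
    if touched_keys:
        # A genuine change to a distribution key column cannot be optimized
        # away; Greenplum would reject it as described in the note in step 4.
        raise ValueError("update changes distribution key: %s" % touched_keys)
    return changes


# The binary log records every column, including the unchanged key "id".
before = {"id": 7, "name": "Ann", "city": "Oslo"}
after = {"id": 7, "name": "Ann", "city": "Bergen"}
print(changed_columns(before, after, key_columns=["id"]))
```

In this model only the city column survives, so the resulting UPDATE no longer touches the key column.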

4.2. THL Disk Storage Configuration and Management

Tungsten disk logs have a number of configuration parameters. This section covers the most important settings and management procedures.

Note

Disk THL storage is an enterprise feature. It is not available in community releases of Tungsten Replicator.

1. To adjust disk log settings, edit the replicator.properties file and locate the section that contains settings for the Transaction History Log (THL). By convention, properties for the THL have the prefix replicator.store.thl.

2. Enable disk logs by setting the THL implementation class and storage implementation as shown below.

# Set the THL storage implementation class. For disk logs use commercial class
# com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHL.
replicator.store.thl=com.continuent.tungsten.enterprise.replicator.thl.EnterpriseTHL

# Set the storage accessor. If you use disk logging this must be commercial
# class com.continuent.tungsten.enterprise.replicator.thl.DiskTHLStorage.
replicator.store.thl.storage=com.continuent.tungsten.enterprise.replicator.thl.\
DiskTHLStorage

3. Set the log location using the log_dir property. This must be an existing, writable directory. You can also set the size of individual log files in bytes; if you do not, it defaults to 1 GB.

# Uncomment the following properties to control disk log storage location
# and size of files. These are default values.
replicator.store.thl.log_dir=/opt/rhodges/logs
replicator.store.thl.log_file_size=1000000000

Warning

Ensure you have sufficient storage for disk logs. If you are using MySQL, note that Tungsten logs as a rule take roughly twice the space of MySQL binlogs due to additional information included in each transaction.

4. Set the log retention using the log_file_retention property. This is critical to manage storage and avoid running out of disk space.

# To drop log files after a certain period, set the retention to an interval
# which is number{d|h|m|s}, where the letters stand for days, hours, minutes,
# or seconds respectively. If unset logs are retained indefinitely.
replicator.store.thl.log_file_retention=3d

5. Adjust any remaining disk log and THL parameters. Currently the disk serialization protocol is the only additional parameter and should be left as is. However, to take full advantage of disk log performance, you should set the THL buffer_size to a value greater than 1 so that multiple events will be transferred at once. Note that this setting may require additional memory to be assigned to the Java VM.

# Maximum number of events to transfer at once. Higher values are better
# but as with queue store large sizes require more memory.
replicator.thl.protocol.buffer_size=10

6. Restart the replicator so that the configuration values are reread. You can also use trepctl configure.
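The file size, retention, and storage guidance in steps 3 and 4 can be combined into a rough capacity estimate. The sketch below parses the number{d|h|m|s} interval syntax and applies the rule of thumb that Tungsten logs take roughly twice the space of MySQL binlogs. The function names and the daily binlog volume are hypothetical, and the factor of 2 is only the approximation quoted in the warning above, so treat the result as a starting point for sizing.

```python
import math
import re

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def retention_seconds(interval):
    """Convert a retention interval such as '3d' or '12h' to seconds."""
    match = re.fullmatch(r"(\d+)([dhms])", interval.strip())
    if match is None:
        raise ValueError("expected number{d|h|m|s}, got %r" % interval)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def estimate_log_files(binlog_bytes_per_day, retention,
                       log_file_size=1000000000, thl_factor=2.0):
    """Estimate how many THL log files accumulate within the retention period."""
    days = retention_seconds(retention) / 86400
    thl_bytes = binlog_bytes_per_day * days * thl_factor
    return math.ceil(thl_bytes / log_file_size)


# Hypothetical workload: 5 GB of binlogs per day, retained for three days.
print(retention_seconds("3d"))
print(estimate_log_files(5000000000, "3d"))
```

With the default 1 GB file size, the second number is also the approximate storage requirement in gigabytes.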

4.3. Replication Configuration for MySQL

Tungsten Replicator has a number of specialized features for MySQL. In this section we document the most important issues when using Tungsten together with MySQL.

4.3.1. Migration between MySQL and Tungsten Replication

Tungsten Replicator is designed to enable seamless migration from existing MySQL replication configurations to Tungsten and back again without losing data or requiring application outages. This procedure allows low-impact conversions with an easy backout option in the event of problems.

Note

It is not necessary to use MySQL Replication before starting with Tungsten. You can provision Tungsten from scratch as described in Chapter 2, Tungsten Replicator Installation and Configuration.

4.3.1.1. Migrating a MySQL Master/Slave Pair from Native Replication to Tungsten

Before converting from native MySQL replication you must set up replication as described in applicable MySQL documentation. Once replication is running, follow the procedure shown below.

1. Tungsten installation. Untar the file in a directory of your choice and run the configure utility. Observe the following pointers.

• Do not start services automatically.

• Do not auto-enable the replicator.

These settings prevent the replicator from starting automatically and allow you to control the exact start point of replication.

2. Health check. Ensure MySQL replication is operating correctly and fully caught up. To do this, log in to the slave database and issue a SHOW SLAVE STATUS. Look for errors and check the number of seconds behind the master.

Warning

Migrating to Tungsten when databases are inconsistent may result in replicator errors or even data corruption. If there are problems, correct them before continuing.

3. Stop MySQL replication. Log in to the slave database and issue a STOP SLAVE command.

Note

If you use MySQL statement replication, there is a chance that your application uses temporary tables. You should carefully check for open temporary tables on the slave before stopping replication. Here is the full procedure.

mysql> stop slave sql_thread;
Query OK, 0 rows affected (0.00 sec)

mysql> show status like '%temp%';
+------------------------+-------+
| Variable_name          | Value |
+------------------------+-------+
| Slave_open_temp_tables | 0     |
+------------------------+-------+
1 row in set (0.00 sec)

mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)

Warning

If the value of Slave_open_temp_tables is greater than 0, you should restart MySQL replication and try again to ensure no temporary tables are open. Migrating while temporary tables are open can lead to statements failing due to missing tables. This may require slaves to be reloaded in order to fix.

4. Note slave replication position. Log in to the slave database and issue a SHOW SLAVE STATUS command. Note the master log file and offset values as shown in the following example.

mysql> SHOW SLAVE STATUS\G
*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: 10.3.2.117
                Master_User: repl
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: mysql-bin.000095
        Read_Master_Log_Pos: 415359990
...

5. Start Tungsten master. Start the Tungsten replicator process on the master host so that it starts reading master logs at the current slave location.

tungsten-replicator/bin/replicator start
tungsten-replicator/bin/trepctl online -from-event mysql-bin.000095:415359990
tungsten-replicator/bin/trepctl status
tungsten-replicator/bin/trepctl heartbeat

The -from-event value is the log file followed by the byte offset you obtained from SHOW SLAVE STATUS, separated by a colon. The trepctl status output should show that the master is processing events and does not have errors. The trepctl heartbeat command pushes an event through the system to the slave.

Warning

If the master shows errors, stop and correct them now. If necessary you can re-enable MySQL replication by issuing START SLAVE on the slave.

6. Start Tungsten slave. Start the Tungsten replicator on the slave host. The slave will fetch the master logs from the beginning and therefore does not need a specific start point.

tungsten-replicator/bin/replicator start
tungsten-replicator/bin/trepctl online
tungsten-replicator/bin/trepctl status

The trepctl status command should show that the slave is receiving and applying events.

Warning

If the slave shows errors, stop and correct them now. If necessary revert back to MySQL replication using the procedure described in Section 4.3.1.3, “Reverting from Tungsten to MySQL Native Replication”.

7. Disable MySQL replication. Once Tungsten replication is running properly you should disable MySQL replication completely to prevent it from restarting if you reboot the MySQL server. You can do this by logging into the slave database and issuing the following CHANGE MASTER command.

mysql> CHANGE MASTER TO master_host = '';

8. Enable other Tungsten services. If you are using Tungsten Manager to run a database cluster, you can start the manager now. We do not recommend starting the manager until you are sure that migration is successful as it may lead to a spurious failover.
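The start position used in step 5 is assembled from two fields of the SHOW SLAVE STATUS output noted in step 4. The following minimal sketch extracts Master_Log_File and Read_Master_Log_Pos from saved vertical-format output and joins them with a colon. The function name is hypothetical, and the sample text reuses the values from the example in step 4.

```python
def from_event(slave_status_text):
    """Build the start event ID from vertical-format SHOW SLAVE STATUS output."""
    fields = {}
    for line in slave_status_text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    # Log file name and byte offset, separated by a colon.
    return "%s:%s" % (fields["Master_Log_File"], fields["Read_Master_Log_Pos"])


SAMPLE = """\
*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: 10.3.2.117
            Master_Log_File: mysql-bin.000095
        Read_Master_Log_Pos: 415359990
"""

print(from_event(SAMPLE))
```

Generating the value this way avoids transcription errors in the file number or offset when you type the trepctl online command.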

4.3.1.2. Migrating Additional Slaves from Native Replication to Tungsten

Tungsten Replicator currently lacks the ability to tell a new slave to start at a particular sequence number in the master log. You can however set the value manually by updating the trep_commit_seqno table with the appropriate value. Here is the procedure to migrate additional slaves.

1. Dump the Tungsten database from a current slave and reload into the new slave. This provides base metadata that will be used to set the replication start position. On MySQL you can do this with the following commands:

mysqldump --single-transaction -utungsten -p -h$MASTER tungsten > tungsten.dmp
mysql -utungsten -p -h$SLAVE < tungsten.dmp

2. Stop the slave and find the current slave position using SHOW SLAVE STATUS as described in Section 4.3.1, “Migration between MySQL and Tungsten Replication”.

3. Use the thl list utility to locate the sequence number associated with the MySQL replication position.

4. Update the tungsten.trep_commit_seqno table to set the starting sequence number to the value from the previous step.

mysql> UPDATE tungsten.trep_commit_seqno SET seqno=352335;
Query OK, 0 rows affected (0.09 sec)
Rows matched: 1 Changed: 0 Warnings: 0

5. Start Tungsten slave. Start the Tungsten replicator on the slave host.

tungsten-replicator/bin/replicator start
tungsten-replicator/bin/trepctl online

4.3.1.3. Reverting from Tungsten to MySQL Native Replication

You can revert from Tungsten replication back to MySQL replication at any time. Execute the procedure shown below on each slave. Slaves can be reverted in any order.

1. Stop Tungsten replicator. Stop the Tungsten Manager if it is also running.

tungsten-replicator/bin/replicator stop
tungsten-manager/bin/manager stop

2. Note slave replication position. Select the current position reached by Tungsten from the tungsten.trep_commit_seqno table as shown in the following example.

mysql> SELECT eventid FROM tungsten.trep_commit_seqno;
+--------------------------------+
| eventid                        |
+--------------------------------+
| 000107:0000000952493724;322965 |
+--------------------------------+
1 row in set (0.00 sec)

The first number is the binlog file number. The second number following the colon is the byte offset in that file. The number following the semi-colon is an internal session ID and should be ignored.

3. Configure the MySQL slave. Use the values from the previous step to issue a CHANGE MASTER command as shown in the example below. Note that you must specify host and login credentials as well as the binlog position.

CHANGE MASTER TO master_host = 'myhost', master_user = 'repl', master_password = 'mypasswd', master_log_file = 'mysql-bin.000107', master_log_pos = 952493724;

4. Start MySQL replication. Log in to the slave database and issue a START SLAVE command.
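The translation from the stored eventid in step 2 to the CHANGE MASTER statement in step 3 follows directly from the field layout described above. The sketch below is one way to automate it. The function names are hypothetical, the binlog base name mysql-bin is an assumption taken from the examples in this guide, and the code does not escape quotes in the credentials, so it is only suitable for generating a statement that you review before running.

```python
def parse_eventid(eventid, binlog_prefix="mysql-bin"):
    """Split a Tungsten event ID into (binlog file name, byte offset)."""
    position = eventid.split(";", 1)[0]          # drop the session ID
    file_number, offset = position.split(":", 1)
    # int() removes the zero padding from the offset.
    return "%s.%s" % (binlog_prefix, file_number), int(offset)


def change_master_sql(eventid, host, user, password):
    """Format a CHANGE MASTER statement for the position in eventid."""
    log_file, log_pos = parse_eventid(eventid)
    return ("CHANGE MASTER TO master_host = '%s', master_user = '%s', "
            "master_password = '%s', master_log_file = '%s', "
            "master_log_pos = %d;" % (host, user, password, log_file, log_pos))


print(change_master_sql("000107:0000000952493724;322965",
                        "myhost", "repl", "mypasswd"))
```

For the sample eventid, the generated statement points at file mysql-bin.000107 and offset 952493724, matching the example in step 3.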

4.3.2. MySQL Character Sets and Binary Data

MySQL allows SQL statements and row data to contain mixed character sets as well as embedded binary data. Tungsten Replicator requires special configuration to avoid corrupting data in these circumstances.

4.3.2.1. Direct Binary Transfer for Homogeneous MySQL Replication

When replicating between MySQL DBMS instances, Tungsten should be configured to use binary string transfer. This uses a special protocol to move string data as bytes without intervening translation and prevents data corruption.

To use binary transfer of strings, you must enable the useBytesForStrings property on the MySQL extractor definition in replicator.properties, as shown in the following example:

# Use bytes to transfer strings. This should be set to true when using MySQL
# row replication and the table or column character set differs from the
# character set in use on the server.
replicator.extractor.mysql.useBytesForStrings=true

When binary transfer is enabled, you must use the -charset option on the thl utility to view log data containing binary strings. The following example shows this option in use.

thl list -seqno 11 -charset utf8

Warning

Binary transfer is always recommended when replicating data between MySQL DBMS instances.

4.3.2.2. Heterogeneous Binary Transfer

When replicating between MySQL and other DBMS types, string data must be transferred using non-binary methods or the values will not be properly applied. In this case you must ensure that you use character sets carefully. This section provides advice on non-binary string transfer.

In non-binary string transfer, Tungsten automatically converts all string data to Unicode for internal processing. This enables consistent behavior when replicating across platforms, simplifies replicator code paths (thus reducing the number of bugs), and makes it very easy to implement flexible data filtering.

Unicode string translation is therefore highly beneficial for users but makes it critically important that applications handle character sets in a consistent way. Proper character set handling is particularly important for MySQL applications, because statement-based replication is especially sensitive to embedded character sets. Here is a short summary of best practices to avoid problems.

Note

These recommendations only apply to MySQL. They do not affect clusters created using PostgreSQL Warm Standby Replication. The discussion of introducers only applies when using statement replication, not row replication.

• Use the UTF-8 character set consistently for MySQL data, including not just table definitions but also client settings for applications that insert data into the master. This ensures the greatest flexibility when moving data between servers and avoids the chance of data loss or corruption due to missing characters in code set translations.

• Always use introducers for binary data embedded in SQL statements. MySQL introducers prevent data corruption on insertion, which is a benefit even in the absence of replication. Tungsten Replicator automatically translates such strings into a safe hexadecimal format that is guaranteed to replicate correctly regardless of character set differences. Here is an example of correct string syntax using introducers.

INSERT INTO media_info VALUES(1, 3.5, _binary'...non-character data...')

MySQL permits use of un-introduced binary strings (i.e., quotes with embedded characters). This practice is unsafe for multi-byte character sets like UTF-8 and can lead to silent data corruption both on masters and during replication. SQL statements like the following are not recommended.

INSERT INTO media_info VALUES(1, 3.5, '...non-character data...')

It is not always possible to change applications that use such binary strings. If this is the case, you should observe the following conventions to ensure safe replication.

• Use single-byte character sets like ISO-8859-1 (latin1) consistently through your applications. This includes client connections as well as tables.

• Ensure that the Tungsten applier URL in tungsten.properties sets the characterEncoding property to the name of your single-byte character set as shown in the following example:

replicator.applier.mysql=com.continuent.tungsten.replicator.applier.MySQLApplier
replicator.applier.mysql.host=centos5a
replicator.applier.mysql.port=3306
replicator.applier.mysql.url_options=?jdbcCompliantTruncation=false&\
 zeroDateTimeBehavior=convertToNull&tinyInt1isBit=false&\
 allowMultiQueries=true&yearIsDateType=false&\
 characterEncoding=latin1
replicator.applier.mysql.user=tungsten
replicator.applier.mysql.password=secret

• If you mix different character sets in statements, you should always use character introducers for strings other than the default character set. The following example shows a string introducer for embedding ISO-8859-1 characters in UTF-8 statements.

INSERT INTO location VALUES(1, _latin1'In a Café')

Warning: Failure to observe conventions for handling character sets and binary data can make silent data corruption of strings during replication very probable. Always test replication carefully using table consistency checks before deploying into a production setting.

There are other character set handling practices that do not normally affect replication but should be avoided anyway as a matter of good application design. Putting binary data into text fields or using binary strings to fool MySQL into suppressing character set translations are not recommended. Avoiding these is beneficial to all applications and improves chances that data will be correctly stored and displayed.
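The following Python sketch illustrates why raw bytes are fragile when the reader assumes a different character set, and why a hexadecimal representation is safe to transport. It is an illustration only: the X'...' form is MySQL's hexadecimal literal syntax, and the guide does not show the exact format Tungsten Replicator produces internally. The function names are hypothetical.

```python
# Illustration only: function names are hypothetical and this is not
# Tungsten Replicator code.

def to_hex_literal(data: bytes) -> str:
    """Render bytes as a MySQL-style hexadecimal literal (X'...')."""
    return "X'" + data.hex().upper() + "'"


def from_hex_literal(literal: str) -> bytes:
    """Recover the original bytes from an X'...' literal."""
    return bytes.fromhex(literal[2:-1])


text = "In a Café"
latin1_bytes = text.encode("latin-1")  # é is the single byte 0xE9
utf8_bytes = text.encode("utf-8")      # é is the two bytes 0xC3 0xA9

# Reading latin1 bytes as if they were UTF-8 fails outright here; with
# lenient decoding the character would instead be silently replaced.
try:
    latin1_bytes.decode("utf-8")
    misread_ok = True
except UnicodeDecodeError:
    misread_ok = False

# Hexadecimal text is pure ASCII, so it survives character set conversion.
literal = to_hex_literal(latin1_bytes)
```

Because the hexadecimal form contains only ASCII characters, decoding it on the receiving side yields exactly the bytes that were sent, whatever character set either connection uses.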

4.3.3. Enabling Relay Log Extraction

Tungsten Replicator by default reads master database logs from local disk, which means that the replicator must be on the same host as the database. This can create undesirable load on the master as the replicator consumes CPU and performs I/O on the same storage devices used by MySQL.

To reduce load on masters, the MySQL extractor provides the option of downloading logs through the normal MySQL client port in a manner similar to MySQL native replication. Such logs are known as relay logs. To enable relay logs, follow the procedure shown below.

1. Edit the replicator.properties file and locate the section that contains settings for the MySQL extractor. By convention the properties for the extractor have the prefix replicator.extractor.mysql.

2. Ensure the replicator settings are correct for the host from which you will be downloading binary log data. Tungsten Replicator uses the login and password provided as parameters to the extractor to download binlog data. This login must have the SUPER as well as REPLICATION SLAVE privileges. If you are running off-board (that is, on another host), ensure that the JDBC URL for the extractor points to the correct host. Examples of these properties are shown below.

replicator.extractor.mysql.binlog_file_pattern=mysql2-bin
replicator.extractor.mysql.host=centos5a
replicator.extractor.mysql.port=3306
replicator.extractor.mysql.user=tungsten
replicator.extractor.mysql.password=s3cr3t!!

3. Enable relay log replication by setting the useRelayLogs property to true.

# When using relay logs we download from the master into binlog_dir. This
# is used for off-board replication.
replicator.extractor.mysql.useRelayLogs=true

4. Select a location for the logs to be stored. The replicator will read from this location.

# When you turn on relay logs, you must define a location for them. This
# overrides the normal log directory.
replicator.extractor.mysql.relayLogDir=/opt/tungsten/relay-logs

5. Adjust any remaining relay log parameters. The default values for these are appropriate for most installations and do not need to be changed.

6. Restart the replicator so that the configuration values are reread. You can also use trepctl configure.
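Before restarting, it can help to sanity-check the edited file. The Python sketch below is a hypothetical helper, not part of Tungsten Replicator: it parses simple key=value lines and checks that the extractor properties named in the procedure above are present. It assumes one property per line and does not handle backslash line continuations.

```python
# Hypothetical pre-restart check; not a Tungsten Replicator tool.
PREFIX = "replicator.extractor.mysql."


def parse_properties(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_relay_settings(props: dict) -> list:
    """Return a list of problems found in the relay log settings."""
    problems = []
    for name in ("host", "port", "user", "password"):
        if not props.get(PREFIX + name):
            problems.append("missing " + PREFIX + name)
    if props.get(PREFIX + "useRelayLogs", "false").lower() == "true":
        if not props.get(PREFIX + "relayLogDir"):
            problems.append("useRelayLogs=true but relayLogDir is not set")
    return problems


SAMPLE = """
# Values taken from the examples in this section.
replicator.extractor.mysql.host=centos5a
replicator.extractor.mysql.port=3306
replicator.extractor.mysql.user=tungsten
replicator.extractor.mysql.password=s3cr3t!!
replicator.extractor.mysql.useRelayLogs=true
"""
sample_props = parse_properties(SAMPLE)
```

With the sample above, the check reports that relayLogDir is missing, which corresponds to step 4 of the procedure.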

4.4. Common Applications of Replication

Replication is a very powerful tool for solving a number of important problems related to database availability, workload scaling, and transfer of data. This section provides guidance on using Tungsten Replicator to solve these and other problems.

4.4.1. Using Database Replicas to Scale Reads

Many sites take advantage of database replicas to scale application performance and throughput. The basic idea is to move read-only operations to replica copies. There are two types of processing you should always consider running on slaves:

• Backups – Backup is typically a very resource-intensive operation and should wherever possible run on a slave. Tungsten Replicator can integrate with and operate a variety of backup procedures, as documented in Section 3.4, “Backup and Restore”.

• Large Reports and Queries – Reports that do not depend on completely up-to-date data should always run on slaves.

The biggest scaling benefits, however, come from dynamic load balancing of application queries onto slave databases. One common approach is to add an extra slave connection to your applications, which means going through application code and deciding whether each query can go to a slave instead of the master. This approach is sometimes called “slave-enabling.”

Tungsten offers a better approach to scaling reads that avoids laborious and error-prone application changes. Tungsten supplies both Java application libraries (Tungsten SQL Router) as well as high-speed proxies (Tungsten Connector) that connect your applications automatically to master and slave hosts. Both of these products provide load balancing policies that can direct reads automatically to slaves in a way that minimizes or completely eliminates any changes to application logic. Please consult the respective guides for each product to understand the options they provide for efficient read scaling.
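To make the routing decision concrete, here is a minimal Python sketch of the kind of per-query choice that "slave-enabling" involves. It is not the Tungsten SQL Router or Tungsten Connector policy; the class, the host names, and the crude statement classification are all assumptions made for illustration.

```python
import itertools

# Assumption: statements starting with these keywords are treated as reads.
READ_PREFIXES = ("select", "show")


def is_read_only(sql: str) -> bool:
    """Crude check: SELECT/SHOW count as reads unless they take locks."""
    s = sql.strip().lower()
    return s.startswith(READ_PREFIXES) and "for update" not in s


class Router:
    """Hypothetical router: writes go to the master, reads rotate over slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql, in_transaction=False):
        # Anything inside a transaction stays on the master so that it
        # sees its own uncommitted changes.
        if self._slaves is None or in_transaction or not is_read_only(sql):
            return self.master
        return next(self._slaves)


router = Router("master-host", ["slave1", "slave2"])
```

Even this small sketch shows why the approach is error-prone when done by hand: every query site has to be classified correctly, and a read that depends on very recent writes may need to stay on the master.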

4.4.2. Implementing Automated Failover

Automatic failover is a standard technique to ensure database high availability. Automation allows you to elect a new master whenever your current master fails without having to wait for a human to make a decision. It is key to ensuring that databases remain highly available at all times.

Database failover is normally controlled by an external program known as a "cluster manager." The cluster manager is responsible for deciding when it is time to do a failover and carrying out the failover procedure. For a variety of reasons Tungsten Replicator cannot make this decision for itself, as deciding which of several slaves to promote and when to do so is a non-trivial problem that requires distributed algorithms to solve. Instead, Tungsten Replicator provides network interfaces and commands to make failover and replication reconfiguration as easy as possible.

The simplest way to implement failover is to use Tungsten itself. Tungsten Manager detects failure conditions and provides automated, rule-based management procedures that not only switch the master but take care of ensuring that application SQL requests are properly rerouted in sites that use the Tungsten SQL Router or Tungsten Connector. To understand and use the failover capabilities, please refer to the Tungsten Concepts and Administration Guide.

Note: Automated failover is a commercial feature. Tungsten Community software provides efficient management commands to initiate failover manually but does not include automated rules.

Warning: Whatever solution you choose for handling automated failover, be sure to test it thoroughly and regularly. This is the only way to ensure that failover will work correctly when you need it.

4.4.3. Fast Database Upgrade and Migration

Tungsten Replicator has a number of features that make it well-suited for performing database server upgrades as well as application migrations. The basic idea is to upgrade a slave database that is in the OFFLINE state, let it catch up with any missed updates, and then promote it to master. This approach solves a number of difficult problems for DBAs.

• If the upgrade fails for any reason, you can just discard the slave without any effect on the master.

• Once the slave is fully upgraded and catches up with missed updates, the upgrade outage time is reduced to the time to do a failover.

• You can make the old master a slave, which means that you can keep it around as a fallback in case the newly upgraded database has problems. In this case, you just fail back to the old master.

Tungsten Replicator has a simple procedure for provisioning slaves and performing clean failover. It can also replicate from newer to older database versions, such as MySQL 5.0 to 4.1. Finally, filters allow users to replicate back even to databases that have schema changes. You can construct filters to drop new columns or even entire updates that will not work when replicating back to an old version.

The setup for upgrade is almost identical to automatic failover, except that a cluster manager is not required.

Figure 4.1. Fast Upgrade and Migration

The basic upgrade procedure is described below.

1. Set up a master/slave pair and ensure replication is working correctly.

2. Take the slave OFFLINE and perform the upgrade. If the upgrade fails, discard the slave and start over.

3. Bring the slave back online and allow it to catch up with the master. Once the slave is in the SLAVE state it is fully caught up.

4. Fail over from the existing master to the slave.

5. Bring the old master up as a slave.

You can upgrade the old master at leisure or simply discard it if you do not need it after the upgrade.

If there are also schema changes you may have to insert filters on the slave to alter or discard SQL events that go back to the old master. Depending on the extent of the changes, it may not be practical to replicate back to the master.
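The logic of such a filter can be sketched in a few lines. The Python below is a conceptual illustration only: real Tungsten filters are replicator plug-ins (see Chapter 11), and the dictionary shapes, function name, and sample schema used here are hypothetical.

```python
# Conceptual sketch; the data shapes and names are hypothetical and do
# not correspond to Tungsten Replicator's filter API.

def drop_new_columns(row_change: dict, old_columns: dict):
    """Limit a row change to the columns that exist in the old schema.

    row_change:  {"table": str, "values": {column: value}}
    old_columns: {table: set of column names known to the old schema}

    Returns None when the table is unknown to the old schema, meaning
    the whole update should be discarded.
    """
    table = row_change["table"]
    if table not in old_columns:
        return None
    kept = {
        col: val
        for col, val in row_change["values"].items()
        if col in old_columns[table]
    }
    return {"table": table, "values": kept}


# Hypothetical old schema: the upgrade added user.nickname and an audit table.
OLD_SCHEMA = {"user": {"id", "name"}}
```

A change to user keeps only id and name, while a change to the new audit table is dropped entirely. Changes that cannot be expressed this simply are a sign that replicating back to the old master may not be practical.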

The simple way to handle switch-over of client applications is to use Tungsten Connector or SQL Router combined with the Tungsten Manager to issue management commands. Tungsten Manager offers a range of commands to help manage availability and reconfiguration of clusters with minimal effort. These commands allow you to implement fully seamless upgrades that occur without application outages. Refer to the Tungsten Concepts and Administration Guide for information on managing maintenance and upgrade procedures.

Tip: Avoid mixing too many things in a single upgrade. It is generally better to proceed by single steps that change only one thing at a time.

Warning: Always test upgrades carefully on real data! Using a slave for the upgrade allows you to test repeatedly on production data. Also, ensure you have a complete backout procedure in case the upgrade fails. Upgrading using a slave as described here covers most database backout issues but does not cover restoring pre-upgrade application code.

4.4.4. Heterogeneous Replication

Tungsten Replicator supports replication between different database types as well as between databases and non-database entities like applications or even flat files. This section provides you with some ideas about how to set up different heterogeneous replication use cases.

4.4.4.1. Replication between Different Database Types

Tungsten Replicator can replicate between different database types, since SQL events are essentially generic after they have been extracted. To replicate between different databases, set up a replicator for each server as if you were replicating between databases of the same type. The "from" database must run in the master role, while the "to" database acts as a slave.

For example, you can set up replication between MySQL and PostgreSQL as follows. Install and configure Tungsten Replicator for the PostgreSQL database and run it in the master role. Install and configure Tungsten Replicator for the MySQL database and run it in the slave role.

When replicating between different database types you must be careful what is being replicated. SQL INSERT, UPDATE, and DELETE statements tend to be quite portable. So, for example, you can replicate from a MySQL 5.0 instance using statement replication to a PostgreSQL instance. Beware, however, that SQL functions as well as binary data types tend to be relatively non-portable. Also, DDL statements beyond the simplest CREATE TABLE expressions are rarely portable at all.

SQL portability issues can be solved in at least two ways.

• Use row replication. Row replication moves data in a generic form that avoids SQL dialect dependencies.

• Use filters. Implement filters to drop or transform SQL that causes problems. Filter implementation is described in Chapter 11, Extending the Tungsten Replicator System.

Note: The current version of the replicator has some temporary limits that affect how easily it can replicate between different database types. The most significant of these is that there is no fully generic JDBC applier that works with any database type. Appliers are currently specific to one database. This limit and others will be removed shortly.

See Section 4.1.3, “MySQL to PostgreSQL/Greenplum Replication” for one specific implementation.

4.4.4.2. Replication between Databases and Non-Databases

Replication to and from non-databases is not supported in the off-the-shelf replicator. However, such replication is quite easy to implement for anyone with a reasonable understanding of Java and the willingness to write a replicator plug-in, as described in Chapter 11, Extending the Tungsten Replicator System.

A simple example of non-database replication is to morph database changes into XML documents, one per update. To do this you would implement an applier plug-in that takes SQL updates and converts them into your preferred XML format. This applier just needs to be able to read the SQL event data structures and generate XML tags.

Tip: If you use row replication, the applier will be much easier to write. If using statement replication, you may need to parse SQL text, which can be a non-trivial undertaking.

When converting SQL events into XML you might wish to convert only certain changes to XML. You could build logic into your applier to skip events that do not interest you. However, a better way is to write a filter that drops uninteresting events. This approach results in two components that are simpler and can also be used independently.
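The separation of filter and applier can be sketched as two small functions. This Python example is conceptual: actual Tungsten plug-ins are written against the replicator's Java interfaces, and the row-change dictionary, element names, and function names here are hypothetical.

```python
import xml.etree.ElementTree as ET

# Conceptual sketch; the row-change shape and XML layout are hypothetical.


def keep_event(change: dict, wanted_schemas: set) -> bool:
    """Filter step: keep only changes from schemas we care about."""
    return change["schema"] in wanted_schemas


def to_xml(change: dict) -> str:
    """Applier step: render one row change as one XML document."""
    root = ET.Element(
        "change",
        action=change["action"],
        schema=change["schema"],
        table=change["table"],
    )
    for name, value in change["values"].items():
        column = ET.SubElement(root, "column", name=name)
        column.text = str(value)
    return ET.tostring(root, encoding="unicode")


SAMPLE = {
    "action": "INSERT",
    "schema": "db1",
    "table": "user",
    "values": {"id": 357, "name": "buster"},
}

# The two components compose, but neither depends on the other.
documents = [to_xml(c) for c in [SAMPLE] if keep_event(c, {"db1"})]
```

Keeping the filter separate means the same XML applier can be reused with a different selection rule, and the same filter can sit in front of any other applier.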

Chapter 5. Replication Services

Tungsten 2.0 introduces replication services, which are independently managed pipelines that run within the same replicator process. Pipelines are flows of transactions, consisting of stages, where each stage consists of one or more threads. Tungsten 1.3 permits a single pipeline per replicator process with a single thread per stage. Tungsten 2.0 by contrast allows more than one pipeline and allows multiple threads in each stage.

Replication services allow users to create flexible flows of updates between services. For example, a single replicator can use multiple replication services to accept data from several masters into a single slave. Replication services also help implement bi-directional as well as cross-site replication.

5.1. Principles of Operation

Replication services operate as independently manageable entities within the Tungsten Replicator process. There is an additional service responsible for starting and stopping replication services. A typical replicator with two replication services svc1 and svc2 is shown below.

Figure 5.1. Tungsten Replicator with Two Replication Services

The management service is implemented by ReplicationServiceManager and has a corresponding JMX MBean interface named ReplicationServiceManagerMBean. Global properties for the replicator and for this service are stored in the configuration file services.properties.

Tungsten Replicator services are implemented by OpenReplicatorManager, which has a corresponding JMX MBean interface named OpenReplicatorManagerMBean. Each replicator service has a separate properties file containing the pipeline definitions as well as other configuration data. It also has its own metadata database and logs. In the version shown here, event logs are stored in the tungsten_svc database, where svc is the service name. For commercial replicators, logs are typically stored on disk and named after the replicator.

5.2. Configuration

Services are configured using property files in the tungsten-replicator/conf directory. The following list describes property file contents.

• services.properties - This property file contains global replicator process configuration data. Properties in this file normally do not need to be changed between installations.

• static-svc.properties - Here, svc is the service name. This property file contains configuration for a single service. This file defines all properties related to replication including general replication parameters as well as pipelines that implement particular replicator roles.

• dynamic-svc.properties - Here, svc is the service name. This property file contains dynamically changed properties, such as the replicator role, that override values in the static-svc.properties file.
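The override relationship between the two per-service files can be sketched as a simple dictionary merge. The Python below is an illustration under stated assumptions: the function names are hypothetical, and the replicator's own loading logic is not shown in this guide.

```python
# Illustration of the naming convention and override order described
# above; function names are hypothetical.

def property_file_names(service: str) -> tuple:
    """Return the static and dynamic property file names for a service."""
    return (
        "static-%s.properties" % service,
        "dynamic-%s.properties" % service,
    )


def effective_properties(static_props: dict, dynamic_props: dict) -> dict:
    """Dynamic values take precedence over static values."""
    merged = dict(static_props)
    merged.update(dynamic_props)
    return merged
```

For example, if the static file for a service sets a role of slave and the dynamic file records a role of master after a failover, the merged result reports master while every other static value is left untouched.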

Tungsten 2.0 introduces the following specialized properties to support services and parallel processing within stages. For more information, look at the comments within the template file sample_static_properties_mysql.st or the definition file for any deployed replication service.

Table 5.1. Properties to Support Services and Parallel Processing within Stages

| Property Name | Description |
|---|---|
| replicator.detached | If true, replication services run as detached processes (not currently supported for production use). |
| service.name | Name of the replication service. This must be unique within a single replicator. It is used to name the metadata database as well as replicator disk logs. |
| local.service.name | Name of the local service that "owns" the data source. This name is important for suppressing loops in bi-directional replication. This property is used to support multi-master replication. |
| replicator.service.type | Must be local or remote. A local data service is a "normal" data service that uses unlogged updates for slave updates as well as Tungsten metadata. A remote data service logs all updates. This property is likewise used to support multi-master replication. |
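The role of local.service.name in loop suppression can be pictured with a short sketch. This is one plausible reading rather than the replicator's actual logic, which this guide does not spell out: it assumes each event carries the name of its originating service (the service field listed in Table 6.1) and that an event arriving back at its origin should not be applied again. The function name is hypothetical.

```python
# Assumption-laden sketch of loop suppression; not Tungsten's implementation.

def should_apply(event_metadata: dict, local_service_name: str) -> bool:
    """Return False for events that originated at the local service.

    In bi-directional replication, an update made locally travels to the
    other master and would otherwise come straight back.
    """
    return event_metadata.get("service") != local_service_name
```

With a local service named svc1, an event tagged service=svc2 is applied, while an event tagged service=svc1 has come full circle and is skipped.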

Note: In the current Tungsten 2.0 RC builds, you must use CML (Camel) commands to generate replication service property files.

5.3. Replication Service Metadata

Replication services store replication metadata in a database that has the name tungsten_svc, where svc is the name of the service, for example, tungsten_svc1. If you have multiple replication services, there will be multiple metadata databases present in the server.

The tables in the metadata database are the same as for older Tungsten releases. Here is a list of tables.

• consistency - Table for consistency check requests and results.

• heartbeat - Heartbeat requests used to check that Tungsten Replicator is alive.

• history - Stores transactions for replicators that use the DBMS log. Unused when the disk logs are enabled.

• trep_commit_seqno - Tracks the position of Tungsten Replicator tasks. There is one row for each apply task on slaves. On masters there is a single row only.

Tungsten Replicator service metadata tables have primary keys that ensure there are no conflicts when applying data in parallel. Removing the keys can result in deadlocks or lock wait timeouts and should therefore be avoided.

5.4. Replication Service Event Logs

Each replication service has an independent event log containing replicated transactions. When using the DBMS log, events are stored in the history table. When using the disk log, events are stored in a separate directory named after the service. Disk logs are stored by convention in the logs directory in the TUNGSTEN_HOME directory. Disk logs matching the configuration in Section 5.1, “Principles of Operation” would look like the following example:

/opt/tungsten/logs
/opt/tungsten/logs/svc1
/opt/tungsten/logs/svc2
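The naming conventions for per-service metadata databases and disk log directories can be captured in two small helpers. This Python sketch simply restates the conventions described above; the function names are hypothetical, and an installation may place its logs elsewhere.

```python
import posixpath

# Restates the documented conventions; function names are hypothetical.


def metadata_database(service: str) -> str:
    """Metadata database name for a service, e.g. tungsten_svc1."""
    return "tungsten_" + service


def disk_log_dir(tungsten_home: str, service: str) -> str:
    """Conventional disk log directory: TUNGSTEN_HOME/logs/<service>."""
    return posixpath.join(tungsten_home, "logs", service)
```

For a TUNGSTEN_HOME of /opt/tungsten and services svc1 and svc2, these helpers produce the database names tungsten_svc1 and tungsten_svc2 and the two service directories listed above.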

5.5. Management

The trepctl command has a number of commands to manage services directly. The usual operations like start, stop, online, offline and the like still apply. However, you must now use the -service option to identify the service on which you are operating. If there is only one active service, you can omit this option as trepctl will default to that service name.

There are two steps to create a new service:

1. Add a static-svc.properties file to the tungsten-replicator/conf directory. The configure-service command performs this function.

2. Issue the command trepctl -service svc start to start the service.

Here is a sample of starting a service named home_demo_test.

$ trepctl -service home_demo_test start
Service started successfully: name=home_demo_test

To list all active services, use the trepctl services command as shown below.

$ trepctl services
Processing services command...
NAME VALUE
---- -----
appliedLastSeqno: 388
appliedLatency : 0.695
role : master
serviceName : svc1
serviceType : local
started : true
state : ONLINE
Finished services command...
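If you need to consume this output from a script, the name and value pairs are easy to extract. The Python sketch below is a hypothetical helper that assumes the layout shown in the listing above, with one "name : value" pair per line. It handles a single service; with several services the repeated names would overwrite one another.

```python
# Hypothetical helper; assumes the output layout shown above.

def parse_trepctl_output(text: str) -> dict:
    """Collect 'name : value' lines into a dictionary of strings."""
    result = {}
    for line in text.splitlines():
        # Split at the first colon only, since values may contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # banner, header, and separator lines have no colon
        result[name.strip()] = value.strip()
    return result


SAMPLE = """Processing services command...
NAME VALUE
---- -----
appliedLastSeqno: 388
appliedLatency : 0.695
role : master
serviceName : svc1
serviceType : local
started : true
state : ONLINE
Finished services command...
"""
services = parse_trepctl_output(SAMPLE)
```

All values are kept as strings, so a caller that wants to compare latencies or sequence numbers needs to convert them explicitly.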

To stop an existing service, issue the following command:

$ trepctl -service home_demo_test stop
Do you really want to stop the replication service? [yes/NO] yes
Service stopped successfully: name=home_demo_test

All additional trepctl commands work as usual. If there are multiple services, you must use the -service option to indicate the service on which you are operating. For example, the following commands show how to bring a replicator online and then offline.

$ trepctl -service home_demo_test online
$ trepctl -service home_demo_test offline

The thl command is likewise extended with a -service option that selects the service log. The following example shows how to get information about the log for a service.

$ thl -service home_demo_test info
Connecting to storage
2010-11-09 19:07:38,464 INFO replicator.thl.DiskLog\
 Using directory '/opt/user2/tungsten/tungsten-replicator/../cluster-home/\
 logs/home_demo_test/' for replicator logs
2010-11-09 19:07:38,469 INFO replicator.thl.DiskLog Acquired write\
 lock; log is writable
2010-11-09 19:07:38,475 INFO replicator.thl.DiskLog Loaded event\
 serializer class: com.continuent.tungsten.enterprise.replicator.\
 thl.serializer.ProtobufSerializer
2010-11-09 19:07:38,478 INFO replicator.thl.LogIndex Building file\
 index on log directory: /opt/user2/tungsten/tungsten-replicator/\
 ../cluster-home/logs/home_demo_test
2010-11-09 19:07:38,481 INFO replicator.thl.LogIndex Constructed\
 index; total log files added=1
2010-11-09 19:07:38,482 INFO replicator.thl.DiskLog Validating last\
 log file: /opt/rhodges2/user/tungsten-replicator/../cluster-home/\
 logs/home_demo_test/thl.data.0000000001
2010-11-09 19:07:38,483 INFO replicator.thl.DiskLog Log preparation\
 is complete
2010-11-09 19:07:38,483 INFO replicator.thl.DiskTHLStorage Adapter\
 preparation is complete
min seq# = 0
max seq# = 3
events = 4
highest known replicated seq# = -1

Note: You must enter the -service option when using the thl command. There is currently no default even when there is a single service.

5.6. JMX APIs

The following JMX MBean interfaces control replicator services. For additional details, read the Javadoc provided in tungsten-replicator/doc/javadoc.

• OpenReplicatorManagerMBean - Management of a single replication service.

• ReplicationServiceManagerMBean - Management operations on all services.

For code samples, you can look at the source code of the trepctl utility, which is implemented in class com.continuent.tungsten.replicator.management.OpenReplicatorManagerCtrl.

5.7. Diagnostic Messages

Diagnostic messages appear in the usual replicator log, which by default appears in tungsten-replicator/log/trepsvc.log.

Note: The current implementation does not distinguish between messages from different replication services. This will be addressed in a future build.

Chapter 6. Event Metadata and Sharding

Tungsten 2.0 adds significant metadata to replicated transactions, or events as they are known in Tungsten parlance. Event metadata allows Tungsten to make intelligent decisions about how to apply transactions, for example to discard transactions that would create replication loops or to split transactions into separate, non-conflicting streams that can be applied to slaves in parallel.

The most important metadata assignment is the shard ID. Shards are groups of data that are unrelated to one another, for example data belonging to different customers. Shards commonly default to databases or schemas depending on the DBMS, though it is possible to shard in other ways as well. Shards solve an important data management problem, namely dividing data and transactions into groups that can be replicated and managed independently. Tungsten 2.0 uses shards to implement parallel apply, which is covered in Chapter 8, Parallel Apply.

6.1. Principles of Operation

Master replicators automatically scan MySQL transactions at extraction time to determine transaction metadata using a special filter named EventMetadataFilter (class com.continuent.tungsten.replicator.event.EventMetadataFilter - see replicator Javadoc for more information). The EventMetadataFilter class assigns the following metadata to each transaction:

Table 6.1. Transaction Metadata

| Property Name | Description |
|---|---|
| is_metadata | If present and true, transaction contains a Tungsten metadata operation, for example a heartbeat or consistency check. |
| shard | Name of the shard to which this transaction belongs. This is normally the same as the schema name. Shards that cannot be identified receive the special shard ID #UNKNOWN. |
| service | Name of the replicator service to which this transaction belongs. |

In the current EventMetadataFilter implementation, the shard ID defaults to the schema name, or to #UNKNOWN if the schema cannot be determined or multiple schemas are affected by the update. However, Tungsten does not require shards to be equivalent to databases.

In fact, shards can be defined any way that users wish provided there are no dependencies between shards. This includes the following.

• Referential integrity - You may not have foreign key constraints that would cause transactions to fail due to key references in another shard. Triggers that operate across shards pose the same problem.

• Locks - Transactions on one shard may not lock data that belong to another shard. Foreign key constraints can have this effect. Shards can share tables (and do in Tungsten metadata tables). However, each shard must include independent rows, which must be indexed to ensure that there are no lock conflicts.

• Causal dependencies - Transactions should not have dependencies across shards, as these can lead to deadlocks or corrupt data if applied in parallel.

These conditions are not as restrictive as they sound. ISPs (Internet Service Providers) and SaaS (Software-as-a-Service) providers commonly store customer data in completely independent schemas.
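The default shard assignment described earlier in this section can be sketched as follows. This Python function is an illustration of the stated rule (schema name, or #UNKNOWN when no single schema applies), not the code of EventMetadataFilter; the function name and the list-of-schemas input are assumptions.

```python
# Illustration of the default rule only; not EventMetadataFilter itself.
UNKNOWN_SHARD = "#UNKNOWN"


def default_shard_id(schemas) -> str:
    """Return the schema name if exactly one schema is affected.

    schemas: names of the schemas touched by a transaction; entries may
    be None or empty when a schema could not be determined.
    """
    distinct = {s for s in schemas if s}
    if len(distinct) == 1:
        return next(iter(distinct))
    return UNKNOWN_SHARD
```

A transaction that touches only db1 is assigned shard db1, while one that touches both db1 and db2, or whose schema is unknown, is assigned #UNKNOWN.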

Tungsten can handle cases that do not meet these criteria by ensuring that transactions that violate the conditions for sharding execute serially. See Section 8.2.6, “Critical Shards” for more information.

6.2. Configuration

Metadata filtering occurs automatically on all transactions as they are extracted. No configuration is necessary in the replicator service properties file.

Note: A future build will permit users to assign their own shard filters. Shard ID assignment is currently automatically performed by the Tungsten filter class EventMetadataFilter.

6.3. Management

It is possible to view event metadata using the thl utility. For example, the following thl command shows metadata fields for a user transaction. Note the service name and shard ID values.

$ thl -service home_demo_test list -seqno 6
...
SEQ# = 6 / FRAG# = 0 (last frag)
- TIME = 2010-11-09 21:04:20.0
- EVENTID = 001022:0000000000002247;286030
- SOURCEID = logos1
- STATUS = COMPLETED(2)
- SCHEMA = db1
- METADATA = [service=home_demo_test;shard=db1]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- OPTIONS = [##charset = ISO8859_1, autocommit = 1, sql_auto_is_null = 1,\
 foreign_key_checks = 1, unique_checks = 1, sql_mode = '', character_set_client\
 = 8, collation_connection = 8, collation_server = 8]
- SQL(0) = insert into user values(357, 'buster', '55fb8a8a8fbc4afce2ce63604158d5e4',\
 'Buster Posey', 'Famous catcher from 2010 World Series')
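When scripting around thl output, the METADATA field can be split into its name and value pairs. The Python sketch below is a hypothetical helper that assumes the bracketed, semicolon-separated layout shown in the listing above.

```python
# Hypothetical helper; assumes the [name=value;name=value] layout above.

def parse_metadata(field: str) -> dict:
    """Parse a METADATA value such as '[service=svc;shard=db1]'."""
    body = field.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1]
    pairs = (item.split("=", 1) for item in body.split(";") if "=" in item)
    return {name.strip(): value.strip() for name, value in pairs}
```

Applied to the METADATA line in the listing, this yields the service name home_demo_test and the shard ID db1.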

The trepctl status command has a useful option for checking the current status of shards processed by an online replication service. The shards option shows the current status of each shard. Invoke the command as follows:

$ trepctl status -name shards
Processing status command (shards)...
NAME                VALUE
----                -----
appliedLastEventId : 001571:0000000000000762;313379
appliedLastSeqno   : 1
appliedLatency     : 0.0
eventCount         : 0
shardId            : tungsten_sjc_west_w1
stage              : q-to-dbms
Finished status command (shards)...

The shard information is based on shards that have been processed since Tungsten Replicator went online. If a shard has not been seen yet, it will not be included in the listing.

Chapter 7. Multi-Master Replication

Tungsten 2.0 uses replication services to handle multi-master replication. Multi-master replication solves a number of important use cases that commonly appear in large applications:

• Replicating configuration data bi-directionally between masters on different sites for a SaaS application.

• Replicating data from multiple masters into a single slave to generate analytic reports.

• Generating reference data in a site master, then replicating it out to a customer master and its slaves without the slaves losing track of the position on the site master.

These use case scenarios can be surprisingly difficult to support in a satisfactory manner when you add in the requirement to support failover. The failover case is difficult because slaves must remember their position in all of these cases with respect to two or more masters. This is necessary to allow them to be promoted. Tungsten handles this by means of special remote services that replicate restart metadata as well as transactions.

7.1. Principles of Operation

Tungsten enables multi-master support by adding extra replication services to replicate data from remote masters. However, the relationship between these additional services and the local service that reads the master database log is based on a very simple principle:

All updates go through the database log.

There are no extra communication pathways between replication services to support multi-master replication. The following diagram provides a simple illustration of the principle.

Figure 7.1. Updates through the Database Log

In the simple master/slave example at the left, all updates to Master A go to the database log and then replicate to Slave A. When we add an additional master to create a multi-master configuration, updates from Master B replicate to Master A, where they are logged just like normal application updates, including metadata operations like updating table trep_commit_seqno with the current log position. They then replicate from Master A to Slave A, which means that the Master B metadata changes replicate as well.

Metadata replication is a unique feature of Tungsten Replicator. It means that Slave A not only receives the updates from Master B, but it also knows the associated sequence number of each update. This means that in the event of a failure of Master A, Slave A can receive updates directly from Master B, as shown in the following diagram:

Figure 7.2. Metadata Replication

Beyond the principle of moving updates only through the database log, there are three specific features that are necessary to enable multi-master topologies.

1. Multiple replication services per database. This feature is described in Chapter 5, Replication Services.

2. Remote replication services. Remote replication services log all updates on the slave, including updates to Tungsten metadata. This allows changes from a remote master to replicate to a local database and appear in the binlog, which allows them to replicate in turn to downstream slaves.

3. Anti-loop filtering. A special filter named BidiRemoteSlaveFilter discards events that come from the local data service.

To make it easier to understand how these features help implement multi-master replication, the following sections walk through some use cases with diagrams to explore exactly how each one works.

7.1.1. Local Master/Slave Operation

Local master/slave operation is the most basic form of replication, in which a single master replicates data to one or more slaves. The following diagram shows the database and replication service topology for a typical master/slave cluster.

Figure 7.3. Database and Replication Service Topology for a Master/Slave Cluster

In master/slave replication all replication services are local, which means that they do not log transactions to create the metadata database, nor do they log updates to slaves. Only application updates and Tungsten metadata updates for consistency checks and heartbeats are logged.

Note also that the service name is the same for each replicator. This convention is required when adding multi-master replication.

7.1.2. Bi-Directional Replication

Bi-directional replication uses remote services to replicate transactions between masters. The full replication topology is shown in the next diagram.

Figure 7.4. Bi-Directional Replication

Bi-directional topology is configured as follows in Tungsten:

• Each master has a local service that reads the database log and allows slaves to connect to it to receive transactions. These services are labeled e1 and w1 in the diagram.

• Each master also has a remote service that replicates transactions from the other master using a slave pipeline. The remote service acts as a normal client and fully logs all updates to the database.

• Each remote service also configures a BidiRemoteSlaveFilter, which drops all transactions generated on the local data service. Note that transactions are dropped at the slave end, rather than on the master. (This may be relaxed in future to drop events at the master end so they do not travel over the wire.)
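The anti-loop rule in the last bullet can be sketched in a few lines of Python. This is an illustration of the rule as described here, not the actual BidiRemoteSlaveFilter implementation; the function name keep_event and the dictionary representation of event metadata are assumptions made for the example.

```python
def keep_event(event_metadata, local_service_name):
    """Return False for events that originated on the local service.

    Illustrative only: mimics the rule described for BidiRemoteSlaveFilter.
    event_metadata is a dict such as {"service": "w1", "shard": "db1"}.
    """
    return event_metadata.get("service") != local_service_name

# Remote service e2w running on the west master, whose local service is w1.
incoming = [
    {"service": "e1", "shard": "db1"},  # originated on the east master: apply
    {"service": "w1", "shard": "db2"},  # originated locally and looped back: drop
]
applied = [event for event in incoming if keep_event(event, "w1")]
print(applied)
```

Because the decision depends only on the service name recorded in each event, the filter works correctly only when the local service name is configured consistently, as Section 7.2 explains.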

Failover is meaningless in this topology, as there are no slaves available for promotion. If one master fails, the other master simply continues to receive updates until the failed master is restored. There is no operation to "promote" the remaining replicator, as its local replication service is already operating in the master role.

7.1.3. Bi-Directional Replication with Slaves

Bi-directional replication extends easily to add slaves as shown below.

Figure 7.5. Bi-Directional Replication with Slaves

Slaves are implemented as local data services, here named w1 and e1. Note that remote master updates replicate fully, which means that the slaves also have metadata databases tungsten_e2w and tungsten_w2e respectively. If either master fails, the slave can be promoted to take over for its local master. For instance, if the database on host sjc1 fails, the following steps will handle promotion of slave sjc2 to become the local master:

1. Terminate replication services on host sjc1.

2. Promote the w1 replication service on host sjc2 to the master role.

3. Start a copy of the e2w remote service on host sjc2. This should be configured to read from host nyc1.

4. Reconfigure the w2e service on host nyc1 to pull data from host sjc2.

Note
It is possible for the w2e service on nyc1 to have a higher sequence number than the w1 slave. In this case it will be necessary to recover transactions manually from nyc1 in order to connect nyc1. Before failing over it is wise to compare sequence numbers on w2e and w1 to avoid a recovery problem.

Note
To avoid sequence number mishaps of the kind described in the note above, it may be enough to have remote services read transactions from a local slave service instead of the master. This ensures that the remote service will always have a lower sequence number than the slave that is promoted to serve as master following a failover.
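The pre-failover comparison recommended in the notes above amounts to a single inequality. The following Python sketch is a hypothetical helper, not a Tungsten tool; it assumes the operator has already read the appliedLastSeqno value of the w2e service on nyc1 and of the w1 slave on sjc2.

```python
def safe_to_promote(remote_applied_seqno, slave_applied_seqno):
    """True when the slave to be promoted has everything the remote service applied.

    Hypothetical check: both arguments are appliedLastSeqno values read by the
    operator for w2e on nyc1 and for w1 on sjc2.
    """
    return remote_applied_seqno <= slave_applied_seqno

print(safe_to_promote(510, 512))  # slave is ahead of the remote service
print(safe_to_promote(510, 505))  # remote service is ahead: manual recovery needed
```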

7.2. Configuration

Local master and slave replication services are configured normally. Remote services must be carefully configured to ensure that they correctly name the local data service. The following example shows a sample of the relevant service parameter settings in the replicator service properties file.

# Replication service type. Values are 'remote' or 'local'. Local services
# do not log updates to Tungsten catalogs. Remote services do log them.
replicator.service.type=remote

# Name of this replication service.
service.name=e2w

# Name of the local replication service. This parameter must be set when
# performing bi-directional replication using a remote slave. Events
# generated by this service are dropped, thereby preventing replication
# loops. It may not be the same as the value of service.name.
local.service.name=w1

The replicator.service.type must be remote. The remote service name must be different from the local.service.name, which must be correctly configured with the name of the local service, here w1. Finally, the local service name must be set consistently (that is, to the same value across all local master and slave replication services). This is necessary to ensure transactions are marked with the correct originating service.

Warning
Failing to configure the local.service.name according to the preceding rules may result in replication loops in which transactions are replicated indefinitely between masters, resulting in corrupt data, failures, bogged down servers or all of the above. The BidiRemoteSlaveFilter uses the local.service.name value to identify and drop events that originated on the local service.
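The configuration rules above lend themselves to a quick sanity check before a remote service is started. The following Python sketch is a hypothetical validator, not something shipped with Tungsten; it assumes the properties have already been loaded into a dictionary keyed by property name.

```python
def check_remote_service(props):
    """Return a list of rule violations for a remote service configuration.

    props maps property names to values. Hypothetical validator that encodes
    the rules stated above; it is not a Tungsten tool.
    """
    problems = []
    if props.get("replicator.service.type") != "remote":
        problems.append("replicator.service.type must be remote")
    local_name = props.get("local.service.name")
    if not local_name:
        problems.append("local.service.name must be set")
    elif local_name == props.get("service.name"):
        problems.append("service.name must differ from local.service.name")
    return problems

good = {
    "replicator.service.type": "remote",
    "service.name": "e2w",
    "local.service.name": "w1",
}
print(check_remote_service(good))
```

The check cannot confirm that local.service.name matches the name actually used by the local master and slave services; that still has to be verified against their own properties files.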

The slave pipeline apply stage must likewise have a BidiRemoteSlaveFilter defined, as shown in the following example. This is already present in the default template provided with Tungsten, hence should not require additional user configuration.

replicator.pipeline.slave=remote-to-thl,thl-to-q,q-to-dbms
replicator.pipeline.slave.stores=thl,parallel-queue
replicator.pipeline.slave.syncTHLWithExtractor=false

. . .

replicator.stage.q-to-dbms=com.continuent.tungsten.replicator.pipeline.\
 SingleThreadStageTask
replicator.stage.q-to-dbms.extractor=parallel-q-extractor
replicator.stage.q-to-dbms.applier=mysql
replicator.stage.q-to-dbms.filters=mysqlsessions,bidiSlave
replicator.stage.q-to-dbms.taskCount=${replicator.global.apply.channels}
replicator.stage.q-to-dbms.blockCommitRowCount=${replicator.global.buffer.size}
. . .
# Remote slave filter. This filter sanitizes events on remote slaves by
# dropping events produced on the same service ID. To use this you *must*
# set the localServiceName parameter, which must be the same as the
# service.name parameter of the local service that reads the binlog.
replicator.filter.bidiSlave=com.continuent.tungsten.enterprise.replicator.\
 filter.BidiRemoteSlaveFilter
replicator.filter.bidiSlave.localServiceName=${local.service.name}

Warning
Removing BidiRemoteSlaveFilter may also lead to replication loops between masters.

7.3. Management

Multi-master replication is based on the service management capabilities provided by trepctl described in Section 5.5, “Management”. The following example shows how to list services on a replicator with a remote and local service configured using the trepctl services command.

$ trepctl services
Processing services command...
NAME              VALUE
----              -----
appliedLastSeqno: 388
appliedLatency  : 0.695
role            : master
serviceName     : nyc_east_e1
serviceType     : local
started         : true
state           : ONLINE
NAME              VALUE
----              -----
appliedLastSeqno: 510
appliedLatency  : 0.695
role            : slave
serviceName     : sjc_west_w1
serviceType     : remote
started         : true
state           : ONLINE
Finished services command...
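When monitoring several services it can help to turn this listing into structured data. The following Python sketch is a hypothetical parser, not part of Tungsten; it assumes the "name : value" layout shown above, with each service introduced by a NAME/VALUE header row.

```python
def parse_blocks(output):
    """Split trepctl name/value output into one dict per 'NAME VALUE' block.

    Hypothetical parser: assumes the 'name : value' layout shown above.
    """
    blocks = []
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("NAME") and line.endswith("VALUE"):
            blocks.append({})                  # a header row starts a new block
        elif ":" in line and blocks:
            name, value = line.split(":", 1)   # values may contain ':' themselves
            blocks[-1][name.strip()] = value.strip()
    return blocks

sample = """Processing services command...
NAME              VALUE
----              -----
role            : master
serviceName     : nyc_east_e1
state           : ONLINE
NAME              VALUE
----              -----
role            : slave
serviceName     : sjc_west_w1
state           : ONLINE
Finished services command..."""

for service in parse_blocks(sample):
    print(service["serviceName"], service["role"], service["state"])
```

Splitting on the first colon only keeps values such as OFFLINE:ERROR or event IDs intact.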

Chapter 8. Parallel Apply

Tungsten 2.0 introduces a parallel replication feature known as parallel apply. When using parallel apply, transactions from different databases commit concurrently using separate apply threads, which are known as channels. Applying transactions in parallel optimizes slave resource usage and in the best case results in large performance increases by eliminating unnecessary waits due to slow transactions on a single apply thread.

8.1. Principles of Operation

Parallel apply builds on the Tungsten Replicator pipeline architecture, which divides replicator processing into separate stages with independent threads. Parallel apply is illustrated in the following diagram:

Figure 8.1. Parallel Apply

Parallel apply processing adds the following extensions to Tungsten pipelines:

• The last stage has separate apply threads for each channel allocated to parallel apply. The number of threads is controlled by the channel parameter of the CREATE DATASERVICE command when creating services using Tungsten Manager.

• A parallel queue is used to feed the threads. The parallel queue splits incoming transactions into separate queues for each apply thread. The splitting process is known as partitioning and is handled by a component called a partitioner that is part of the parallel queue implementation.

• The partitioner uses rules to split transactions based on shard ID, which in the current build equates to the database ID. Partitioning rules are described in file shard.list which is in the Tungsten Replicator configuration directory.

8.2. Configuration

This section describes the configuration of parallel apply.

8.2.1. Replication Service Configuration

The replicator properties template configures parallel apply by default. The only setting that must be set by users is the following:

# For parallel replication we have a global apply thread count.
replicator.global.apply.channels=5

This parameter is used elsewhere in the properties to configure the number of threads used to apply transactions and the number of parallel queues.

Warning
The number of queues in the parallel queue and the number of threads in the partitioner must match or Tungsten Replicator will fail to go online when operating as a slave. The properties file uses the replicator.global.apply.channels variable to ensure consistency between values.

8.2.2. Choosing the Correct Number of Channels for Parallel Apply

Parallel apply speeds up replication by performing slave updates concurrently across different databases. This reduces the time that replication is blocked due to I/O operations on individual updates. For best results, use a small number of channels.

Large numbers of channels can actually reduce overall performance for the following reasons:

• Queues not full. Individual apply threads perform best when they can use block commit to apply 10 or more transactions together rather than separately in sequence. Block commit requires that parallel queues supplying input to the apply threads remain as full as possible. As the number of channels increases, individual queues tend to have fewer transactions in them, thereby reducing the performance improvement of block commit.

• Excessive memory utilization. Large numbers of parallel queues (> 10) can cause Java to run out of memory if there are large transactions in each queue. Applications that replicate large BLOBs, for example, should keep the number of channels as small as possible.

Note
Recommended values may change in later builds.

8.2.3. Configuring Partitioning Rules

The shard.list file contains rules to map shards to partitions. In Tungsten 2.0.0, shards are equivalent to database names. Partitions are numbered starting from 0 and increasing to N-1 where N is the number of channels defined in the service replicator.properties file.

The following example shows the default shard.list file that is distributed with Tungsten. This configuration uses hashing to assign shards to partitions and handles needs of applications that spread transactions more or less evenly across databases and do not perform updates that span databases.

# SHARD MAP FILE.
# This file contains shard handling rules used in the ShardListPartitioner
# class for parallel replication. If unchanged shards will be hashed across
# available partitions.

# You can assign shards explicitly using a shard name match, where the form
# is <db>=<partition>.
#common1=0
#common2=0
#db1=1
#db2=2
#db3=3

# Default partition for shards that do not match explicit name.
# Permissible values are either a partition number or -1, in which
# case values are hashed across available partitions. (-1 is the
# default.
#(*)=-1

# Comma-separated list of shards that require critical section to run.
# A "critical section" means that these events are single-threaded to
# ensure that all dependencies are met.
#(critical)=common1,common2

The shard.list file is read when the replicator service goes online. To make rule changes go into effect, you must take the service offline and then back online, which forces rereading of the file.
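To make the file format concrete, the following Python sketch reads rules in the shard.list syntax and rejects partition numbers outside the range 0 to N-1. It is a hypothetical reader written for this guide, and the real ShardListPartitioner may differ in details; the function name parse_shard_list and its return shape are assumptions.

```python
def parse_shard_list(text, channels):
    """Parse shard.list rules into (explicit, default, critical).

    Hypothetical reader for the file format shown above; the real
    ShardListPartitioner may differ in details.
    """
    explicit, default, critical = {}, -1, set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                          # skip blank lines and comments
        name, _, value = line.partition("=")
        name, value = name.strip(), value.strip()
        if name == "(critical)":
            critical = {s.strip() for s in value.split(",") if s.strip()}
        elif name == "(*)":
            default = int(value)
        else:
            explicit[name] = int(value)
    for shard, partition in explicit.items():
        if not 0 <= partition < channels:
            raise ValueError(f"{shard}: partition {partition} out of range")
    if default != -1 and not 0 <= default < channels:
        raise ValueError(f"default partition {default} out of range")
    return explicit, default, critical

rules = """
common1=0
db1=1
(*)=-1
(critical)=common1
"""
print(parse_shard_list(rules, channels=5))
```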

8.2.4. Explicit Partitioning Assignment

You can assign shards directly to specific partitions simply by typing the shard (database) name, an equals sign, and the partition number. Make sure the partition number is between 0 and one less than the number of channels in the apply task.

8.2.5. Default Partition Assignment

Shards that are not directly assigned end up in the default partition using the (*) operator. You can use either a single explicit partition or allow Tungsten to assign the partition automatically. The following example assigns shards by default to partition 4:

(*)=4

Automatic partition assignment works by hashing the shard ID and then dividing the result by the number of partitions. It is the default means of assignment and tends to distribute databases evenly across partitions.
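The order in which the assignment rules apply can be sketched as follows. This Python example is illustrative only: zlib.crc32 stands in for whatever hash function Tungsten actually uses, and taking the remainder of the division is an assumption about how the hash is reduced to a partition number.

```python
import zlib

def assign_partition(shard_id, channels, explicit=None, default=-1):
    """Pick a partition: explicit rule first, then the default, then a hash.

    Illustrative only. zlib.crc32 is a stand-in hash, and using the remainder
    is an assumption about how the hash maps to a partition number.
    """
    explicit = explicit or {}
    if shard_id in explicit:
        return explicit[shard_id]
    if default >= 0:
        return default
    return zlib.crc32(shard_id.encode("utf-8")) % channels

print(assign_partition("db1", 5, explicit={"db1": 1}))  # explicit rule wins
print(assign_partition("db9", 5, default=4))            # falls back to (*)=4
print(assign_partition("db9", 5))                       # hashed into 0..4
```

Because the hash depends only on the shard ID, a given shard always lands in the same partition for a fixed number of channels, which keeps its transactions in order on one apply thread.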

8.2.6. Critical Shards

Critical shards address cases where there are dependencies across shards that require transactions to be applied in a fully serialized manner to avoid errors. For example, applications may have shared data like lists of user privileges or reference information like exchange rate tables that are used by all customers in a multi-tenant application. Parallelizing updates on shared data can result in errors or data inconsistencies.

Tungsten fully serializes all transactions from critical shards to prevent out-of-order updates. When an update to a critical shard appears, Tungsten first allows all current parallel queues to drain completely. It then applies the critical shard update(s) serially. After they have been processed, parallel apply is restarted.
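The drain-then-serialize behavior can be modeled as a plan of alternating parallel and serial steps. The following Python sketch is a simplified model of the behavior described above, not the replicator's scheduler; the function plan_apply and the (seqno, shard) representation are assumptions made for the example.

```python
def plan_apply(transactions, critical):
    """Group transactions into parallel batches separated by serial critical steps.

    transactions is an ordered list of (seqno, shard) pairs. Simplified model
    of the behavior described above, not the replicator's scheduler.
    """
    plan, batch = [], []
    for seqno, shard in transactions:
        if shard in critical:
            if batch:
                plan.append(("parallel", batch))  # let the queues drain first
                batch = []
            plan.append(("serial", [seqno]))      # apply the critical update alone
        else:
            batch.append(seqno)
    if batch:
        plan.append(("parallel", batch))
    return plan

stream = [(1, "db1"), (2, "db2"), (3, "common1"), (4, "db1")]
print(plan_apply(stream, critical={"common1"}))
```

The model shows why frequent updates to critical shards reduce the benefit of parallel apply: every critical transaction forces a synchronization point across all channels.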

Critical shards are declared using the (critical) operator and a comma-separated list. In the following example, all transactions from the common1 and common2 shards are considered critical:

(critical)=common1,common2

Note
Configuring data services to run with a single channel is the same as defining all shards as critical, since all processing is serialized. If in doubt, you can always use a single channel.

8.3. Management

The trepctl status command has a useful option for checking the current status of threads in each task. The tasks display shows the last sequence number processed by each task along with other information. Invoke the command as follows:

$ trepctl status -name tasks
Processing status command (tasks)...
. . .
NAME                VALUE
----                -----
appliedLastEventId : 001571:0000000000000762;313379
appliedLastSeqno   : 1
appliedLatency     : 2.148
cancelled          : false
eventCount         : 2
stage              : q-to-dbms
taskId             : 2
Finished status command (tasks)...

Parallel apply tasks are listed at the end of the display. The event count for each task is the number of events processed since the replicator came online.

Chapter 9. Tuning and Troubleshooting Tungsten Replicator

This chapter deals with Tungsten Replicator troubleshooting as well as tuning.

9.1. Recognizing and Handling Errors

Tungsten Replicator uses an internal state machine that includes special handling for errors. In the case of an unrecoverable error, the replicator process automatically switches into the OFFLINE:ERROR state. The replicator preserves information about the error that caused the problem until a user fixes the problem and enters a command to bring the replicator back online again.

The following command illustrates error handling on a slave if you attempt to bring it online when the underlying database is not up.

$ trepctl online
State: OFFLINE:ERROR
Error: Replicator service start-up failed
Exception Message: com.continuent.tungsten.replicator.ReplicatorException: com.mysql.jdbc.CommunicationsException: Communications link failure

Last packet sent to the server was 1 ms ago.

Note
You can obtain additional information using trepctl status, which contains complete data about the error including the log sequence number on which the error occurred.

If you restart the database server, you can then repeat the command as shown in the example below.

$ trepctl online
State: GOING-ONLINE:SYNCHRONIZING
$ trepctl
State: ONLINE

Note that error information is automatically cleared once a command brings the replicator out of the OFFLINE:ERROR sub-state.

9.2. Tuning Replicator Memory

Java virtual machines may run out of memory when processing large quantities of data. This results in a log message like the following:

2009-01-12 08:27:26,512 ERROR tungsten.replicator.ReplicatorManager Received error notification, shutting down services: Applier thread failure at event sequence number 25
java.lang.OutOfMemoryError: Java heap space

If this occurs you should raise memory by adding JVM options to set a higher heap size allocation. Java provides the following options to set the heap and stack frame sizes:

• -XmxNNNm where NNN is the heap size in megabytes. 256m is the default value.

• -XssNNNNk where NNNN is the stack size of each thread in kilobytes. This parameter usually has a reasonable default. You should only raise the value if you see a stack overflow error, which is signalled by java.lang.StackOverflowError. 1024 is a good starting value if you see stack overflow problems.

The following example shows how to set the heap on Linux to 512M prior to starting the replicator from the command line.

export JVM_OPTIONS=-Xmx512m
trep_start.sh

If you run Tungsten Replicator as a service, memory settings are in file conf/wrapper.conf. You can update these as shown in the following example.

# Maximum Java Heap Size (in MB)
wrapper.java.maxmemory=512

Java memory tuning is a complex subject. For further information consult Java Development Kit documentation.

9.3. Tuning Replicator Performance

There are a number of performance trade-offs associated with replication. Tungsten Replicator includes several property settings that can help improve performance in specific situations.

9.3.1. Apply-Side Event Caching

Deserializing replicated events in order to apply them is one of the most CPU-intensive operations that occurs in the replication process. Tungsten Replicator supports a cache on the slave side that can hold events as they are received from the master. This cache eliminates a deserialization operation as events are read back out of the THL prior to application. You can control the cache size using the replicator.thl.cache_size property as shown in the following example.

replicator.thl.cache_size=500

The cache consumes heap memory in the Tungsten Replicator slave process. You can assume that the additional memory required is roughly 2x the size of the SQL events held in the cache and increase the slave JVM heap size accordingly.
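As a worked example of this rule of thumb, the following Python sketch estimates the extra heap for a given cache size. The 4 KiB average event size is an assumed figure chosen for illustration; measure your own workload before sizing the heap.

```python
def cache_heap_bytes(cache_size, avg_event_bytes):
    """Rough extra heap for the apply-side cache: 2x the size of the cached events."""
    return 2 * cache_size * avg_event_bytes

# 500 cached events averaging 4 KiB each (the event size is an assumed figure).
print(cache_heap_bytes(500, 4096))  # 4096000 bytes, a little under 4 MiB
```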

9.3.2. Block Commit

Replicator pipeline stages by default apply events one at a time, which can result in considerable inefficiencies as it requires a commit for each transaction and also ties reading and writing to file systems into the same loop. You can improve performance by enabling apply-side block commit. This feature applies events in each stage up to a preset limit or until the extractor has no more events, then commits. You can control the block commit value as shown in the following example.

# Global queue size for pipelines. This defines the number of events
# buffered between stages. Values greater than 1 improve performance
# dramatically but mean that you need to have enough heap memory to
# handle blobs and large transaction fragments.
replicator.global.buffer.size=25

Block commit increases queue sizes between stages, hence can result in additional memory consumption when handling large transactions. If the JVM runs out of memory with large block commit sizes, try either increasing the JVM heap size or reducing the block commit size.
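The effect of block commit on the number of commits can be seen in a small model. The following Python sketch is a simplified illustration, not replicator code; the apply and commit callbacks are placeholders for the real database operations.

```python
def apply_with_block_commit(events, block_size, apply, commit):
    """Apply events and commit once per block instead of once per event.

    Simplified model of apply-side block commit: a block ends when it reaches
    block_size events or when no more events are available.
    Returns the number of commits issued.
    """
    commits = pending = 0
    for event in events:
        apply(event)
        pending += 1
        if pending >= block_size:
            commit()
            commits += 1
            pending = 0
    if pending:                # no more events: commit the partial block
        commit()
        commits += 1
    return commits

noop = lambda *args: None
print(apply_with_block_commit(range(60), 25, noop, noop))  # 3 commits
print(apply_with_block_commit(range(60), 1, noop, noop))   # 60 commits
```

With a block size of 25, sixty events need three commits (25, 25, and a final partial block of 10) instead of sixty.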

9.3.3. Master Connection Reset Period

Master connections normally write each event to the slave as an independently serialized object that has no references to previous events. It is possible to improve transfer speed by changing the connection reset period to a value greater than one, which allows events to contain references to previously written objects in order to avoid writing them again with each new event.

The following example shows how to raise the reset period using the replicator.thl.reset_period property.

replicator.thl.reset_period=10

Raising the reset period will increase the amount of memory consumed by the master. You can assume that the additional memory required is roughly 2x the average SQL event size times the reset period and adjust master JVM heap accordingly.
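The estimate above is a simple product. The following Python sketch works it through for a reset period of 10, using an assumed average event size of 4 KiB purely for illustration.

```python
def reset_period_heap_bytes(reset_period, avg_event_bytes):
    """Rough extra master heap: 2x the average event size times the reset period."""
    return 2 * avg_event_bytes * reset_period

# Reset period of 10 with events averaging 4 KiB (an assumed figure).
print(reset_period_heap_bytes(10, 4096))  # 81920 bytes
```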

9.4. Running out of Disk Space in DBMS Logs

Replicator failures may occur if the catalog tables used by the Transaction History Log (THL) run out of space. This leads to a message like the following:

State: OFFLINE:ERROR
Error: THL thread failed
Exception Message: java.sql.SQLException: The table 'history' is full

The cure is to free up disk space on the affected file system. You can restart Tungsten Replicator once disk space is available again. To avoid problems in the first place, check disk space levels and purge the THL regularly using the thl utility.

Note
It may not help to purge the THL after a disk full condition. Database engines do not necessarily return space from deleted rows immediately to the file system. Your best bet is to delete files to free space or, if you can, extend the file system using a logical volume manager.
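A regular disk space check can be scripted with the Python standard library. The following sketch is a hypothetical monitoring helper, not a Tungsten feature; the path to check and the 10% threshold are assumptions that should be replaced with the file system that holds your database data and a limit that suits your site.

```python
import shutil

def free_fraction(path):
    """Return the fraction of the file system holding path that is still free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total

# "." is a placeholder; point this at the file system that holds the database.
if free_fraction(".") < 0.10:
    print("Warning: less than 10% disk space free")
```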

9.5. Data Inconsistencies Between Master and Slave Databases

Tungsten Replicator requires master and slave databases to be in identical states when replicating. If not, SQL updates that work on the master may produce different results or fail outright on the slave. This assumption is fundamental to the design of data replication.

Data inconsistencies typically reveal themselves in one of three ways.

• Slave update failures - When tables do not match it is possible for a SQL update to execute on the master and fail on the slave. When the SQL update fails on the slave, the slave will automatically switch to the OFFLINE:ERROR state.

• Log consistency check failures - Tungsten by default performs an automatic consistency check between master and slave event histories when a slave connects to the master THL to ensure the event histories are identical. When this check fails the slave will automatically switch to the OFFLINE:ERROR state.

• Replicator table consistency check failure - The trepctl check command performs a consistency check between master and slave tables. By default the slaves switch to the OFFLINE:ERROR state if a consistency check fails.

Warning
Data inconsistencies are symptoms of potentially serious problems that may lead to loss or corruption of data. It is essential to determine the root cause of such problems and fix them. Contact your technical support provider immediately to get help.

There are a number of best practices to avoid data inconsistencies. First, ensure slaves are correctly provisioned so that contents are identical to the master from the start. Second, make slaves read-only to prevent accidental updates. Finally, use the data consistency checking described in Section 3.7, “Consistency Checking” to identify data consistency problems that arise after provisioning.

9.5.1. Handling Log Consistency Check Failures

Tungsten log consistency checks work by comparing the log sequence number and epoch number of the last event stored on the slave with the same event in the master log. If the events do not match, you will see the following exception message.

Event extraction failed: Client handshake failure: Client response validation failed: Log epoch numbers do not match: client source ID=centos5b seqno=44439 server epoch number=44438 client epoch number=44435
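When collecting such failures from several slaves, it can help to pull the numbers out of the message before comparing logs with the thl utility. The following is a minimal Python sketch, not part of Tungsten Replicator; the function name and the regular expression are illustrative, and the field layout is assumed from the single sample message above.

```python
import re

# Pattern for the handshake failure text quoted above. The field layout is
# assumed from that one sample message; other releases may word it differently.
EPOCH_MISMATCH = re.compile(
    r"client source ID=(?P<source_id>\S+)\s+"
    r"seqno=(?P<seqno>\d+)\s+"
    r"server epoch number=(?P<server_epoch>\d+)\s+"
    r"client epoch number=(?P<client_epoch>\d+)"
)


def parse_epoch_mismatch(message):
    """Return the fields of an epoch mismatch message, or None if absent."""
    match = EPOCH_MISMATCH.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert numeric fields so they can be compared directly.
    for key in ("seqno", "server_epoch", "client_epoch"):
        fields[key] = int(fields[key])
    return fields


sample = (
    "Event extraction failed: Client handshake failure: "
    "Client response validation failed: Log epoch numbers do not match: "
    "client source ID=centos5b seqno=44439 "
    "server epoch number=44438 client epoch number=44435"
)
info = parse_epoch_mismatch(sample)
print(info["seqno"], info["server_epoch"], info["client_epoch"])
```

The extracted sequence number is a reasonable starting point for a thl list -seqno lookup on both the master and the slave.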

This error can occur in clusters if there is a failover between databases such that not all events on the old master were replicated properly to the slave. It can also happen in "split-brain" situations where a slave host becomes isolated and promotes itself to become master, then rejoins the cluster as a slave after processing local updates.

The best way to diagnose such errors is to use the thl utility to examine both master and slave logs to determine the differences between logs. If necessary you can force slaves to connect by setting the remote THL extractor checkSerialization property to false in file replicator.properties as shown in the following example.

# Remote THL extractor.
replicator.extractor.thl-remote=com.continuent.tungsten.replicator.thl.RemoteTHLExtractor
replicator.extractor.thl-remote.connectUri=${replicator.master.connect.uri}
# If true, check to ensure logs are consistent.
replicator.extractor.thl-remote.checkSerialization=false

After editing the file, you should restart Tungsten or issue a trepctl configure command to reread configuration and then bring the slave online.

Warning

Always investigate log consistency check failures to determine root cause and ensure you understand the differences. If you experience check failures after failover, you need to reprovision. Note that there are also methods to recover events "trapped" on an old master; consult with your technical support provider for more information.

9.5.2. Skipping a Failed SQL Update on the Slave

SQL may fail on the slave for reasons unrelated to replication. For example, updates may fail due to lack of file space, privileges, or incorrect database configuration. In these cases the normal procedure is to correct the problem and then bring the replicator online.

In cases where data are truly inconsistent, you may need to skip over one or more transactions to resume replication. There are three ways to do this.

The first method is to use the trepctl online -skip command to bring the slave online but to ignore one or more transactions. The following command shows how to skip the first three transactions after going online.

$ trepctl online -skip 3

The second method is to use the thl utility to find the failing SQL statement and skip over it. This method allows you to see the failed SQL, which helps you figure out what may have caused the failure. Here is the procedure.

1. Find the sequence number of the failed statement. You can use trepctl status to see the sequence number of the failed transaction.

2. Examine the failed event using thl list. This is not required but highly recommended to understand exactly which statement may have failed. In this example you can display the failed event using thl list -seqno 5.

3. Use the thl skip command to skip the event.

$ thl skip -seqno 5
WARNING: Skipping events may cause data inconsistencies with the master database.
Are you sure you wish to skip 1 events [y/N]?y
Skipping events where SEQ# = 5
Marked events as skipped: 1

4. Bring the slave back online using trepctl online. The slave will continue at the next event following the skipped event.

The third method of dealing with failures is to configure slave replicators to skip over failed statements automatically. To do this, set the applier failure policy in replicator.properties as shown below.

replicator.applier.failure_policy=warn

Warning

It is your responsibility to ensure that there are no problems with data consistency due to skipped events. You should always ensure you understand the root cause for any applier failure. If you set the applier policy to 'warn', be aware that it applies to any event failure. This removes a fundamental check on replication, so think carefully before using the 'warn' setting.

9.6. Database Failure

If the underlying database fails, the replicator process will go into the OFFLINE:ERROR state. The procedure for fixing this problem differs depending on whether the failed database is acting as a master or slave.

9.6.1. Repairing Failed Slaves

If one of the Tungsten slave nodes fails, the master node and other slaves will keep on working without interruption. In other words, the cluster is functional despite the slave problem.

To recover from a slave node failure, it is essential to first analyze the reason for the node failure and fix any problem(s) that can prevent future slave operation.

There are two ways to recover a failed slave. In the event of a severe failure that causes the database to lose data, the simplest and best procedure is to re-provision the slave from scratch using the procedure described in Section 3.6, “Provisioning New Slaves”.

In most other cases it is sufficient to restart the Tungsten Replicator as well as the slave database, then bring the replicator to the ONLINE state. It will pick up replication where it left off.

Warning

If you observe replication errors when restarting the slave after a crash, you should re-provision the slave from scratch. Forcing replication to continue in these circumstances can lead to invalid data on the slave.

9.6.2. Repairing a Failed Master

This section shows a simple failover scenario, where a master fails and must be replaced due to either a database server failure or a failure of the master replicator process.

To recover, proceed as follows:

1. Ensure the master is stopped. This may be unnecessary if it has crashed but in some cases may require killing the replicator process so that slave replicators lose their connection to the master.

2. Check the status on each slave by issuing the command below:

trepctl

Wait until all slaves reach the SYNCHRONIZING state. Slaves automatically go into this state when they cannot contact the master.

3. Check the seqno ranges from all slaves by issuing the command below:

trepctl

Select the slave with the greatest max seqno as the new master.

4. From this point on, follow the master failover procedure described in Section 3.5, “Master Failover” to promote the slave with the highest sequence number to be the new master and redirect slaves to that master.
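The selection rule in step 3 can be sketched in a few lines of Python. This is only an illustration of the comparison, assuming you have already noted the maximum sequence number reported by each slave; the host names and values are hypothetical and nothing here calls trepctl.

```python
def choose_new_master(max_seqno_by_host):
    """Return the host whose THL holds the highest maximum sequence number."""
    if not max_seqno_by_host:
        raise ValueError("no slave status available")
    # max() with a key function returns the first host with the highest value,
    # so ties are resolved by dictionary order.
    return max(max_seqno_by_host, key=max_seqno_by_host.get)


# Hypothetical values, as an operator might record them from each slave.
slaves = {"centos5b": 44435, "centos5c": 44439, "centos5d": 44438}
print(choose_new_master(slaves))
```

If two slaves report the same maximum sequence number, either is an equally good candidate by this rule; the sketch simply returns the first one.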

After you have failed over to another master, you should repair the failed master. The first step is to analyze and correct any problems that led to the failure. Once this is done you can recover the master.

The standard way to recover a master is to provision it as a slave using the procedure described in Section 3.6, “Provisioning New Slaves” and then perform another master failover.

Warning

Making a failed master a slave without first re-provisioning can lead to data inconsistencies if the master has unreplicated changes that were lost when failing over to a slave. Re-provisioning synchronizes database contents fully with the current master and avoids possible data problems.

9.7. Re-initializing Tungsten Replicator State

Sometimes it is necessary to reset the replicator from scratch in order to get past an error or set up quickly for testing. The following procedure shows how to initialize the system quickly.

1. Stop all replicator processes.

2. Clear all tables used by the replicator by logging into the Tungsten Replicator catalog database and dropping all tables found there. You can also accomplish the same thing by dropping and creating the database.

drop database tungsten;
create database tungsten;

3. Synchronize master and slave state. You can do this either by dumping the state from the master and reloading into the slave or using an application-specific procedure that drops or truncates all tables that are replicated. It does not matter how the data are synchronized so long as the schema and contents of replicated tables match exactly.

After completing this procedure you can restart replication normally. Replication will start with the next update that is applied to the database. Any previous changes in database logs are ignored.

Note

If you are using Tungsten Manager to control the replicator process you must also remove service definitions for the replicators and restart the manager processes.

9.8. PostgreSQL Troubleshooting

The following log files can provide a good starting point when something unexpected happens:

• Tungsten/tungsten-replicator/log/trepsvc.log

• Tungsten/tungsten-manager/log/tmsvc.log

• PostgreSQL/pgstartup.log

• PostgreSQL/data/standby.log

• PostgreSQL/data/pg_logs/

When using WAL Shipping, pay particular attention to the following items:

• rsync and scp must work without prompting for a password, both from the master to every slave and from every slave to the master. If you see an rsync/scp command failing in the logs, that is critical and must be fixed.

• If you do not use the standard SSH port 22 on any of your nodes, define the port number explicitly in the ~/.ssh/config file on all other nodes of your cluster. For example:

Host centosb
    Port 20022

• The standby.log file will grow indefinitely, so periodic archiving or purging is recommended (for example with logrotate.d).

• If WAL files successfully transfer from master to slave, but they are not applied, ensure that pg_standby is correctly configured during ./configure and that it is accessible.

• If a slave is stuck at a specific progress number and the reason cannot be identified (which should not happen), the last resort is to provision it by putting the Replicator offline and then online. If that also fails, reinstall the whole Tungsten deployment from scratch for the failing slave.

• Do not use RO_RELAXED consistency mode together with a Warm Standby cluster, as this could lead to read requests being redirected to a slave which, in the Warm Standby case, is not accessible for reads.

• Ensure that Tungsten services are started by the postgres user. If you see the management.script.ScriptPlugin sh: psql: command not found error message in the Replicator logs, it is possibly because Tungsten services were started by the root user, which does not have psql in its shell's path.

Chapter 10. Command Reference Guide

Tungsten Replicator can be run from the command line interface or run as an operating system process. The following commands are available for Tungsten Replicator.

Tip

To avoid file permission problems and possible failures, always use the correct account to start and stop Tungsten Replicator. By convention this account is continuent. You may use any account to run other commands, such as the trepctl command.

10.1. Running Tungsten Replicator from the Command Line Interface

The Tungsten Replicator bin directory contains scripts that can be used to start and stop the Replicator from the command line prompt.

Linux, Solaris, and Mac OS X Command Line Interfaces

Use these commands to start and stop Tungsten Replicator from the command line prompt in the Linux, Solaris, and Mac OS X operating systems.

• trepstart [-clear]

This command starts the Tungsten Replicator process if it is not already running. The replicator process will load static replicator properties automatically from conf/replicator.properties and then read dynamic properties stored in conf/dynamic.properties.

The -clear option clears dynamic properties so that the replicator process starts from static values only. You can also remove the dynamic.properties file to achieve the same effect.

• trepctl stop

This command is used to stop Tungsten Replicator in an orderly manner. The process will clean up and exit. This command must be run from the START or OFFLINE state.

• trepstop

This command stops the Tungsten Replicator process if it is running. It is used to halt a replicator process that does not respond to trepctl stop.

Windows Command Line Interface

Use these commands to start and stop Tungsten Replicator from the command line prompt in the Windows operating system.

• trepstart.bat [-clear]

This command starts the Tungsten Replicator process if it is not already running. It functions identically to the Unix trepstart.

• trepctl.bat stop

This command is used to stop Tungsten Replicator in an orderly manner. It functions identically to the Unix trepctl stop.

Note

There is no equivalent for trepstop on Windows. You should use operating system tools to terminate the process if required.

10.2. Running Tungsten Replicator as an Operating System Service

These instructions are only applicable for the Linux, Solaris, and Mac OS X operating systems.

Tungsten Replicator includes the replicator script, which is based on the Java Service Wrapper (http://wrapper.tanukisoftware.org). This allows you to run Tungsten Replicator as a service that is protected against signals and implements the standard interface used by Unix services. The service implementation also restarts Tungsten Replicator in the event of a crash.

The Tungsten Replicator service implementation supports services on 32-bit and 64-bit versions of Linux, and on Mac OS X platforms.

You can adjust the Tungsten Replicator service configuration by editing the conf/wrapper.properties configuration file. Read the comments in the file for information on legal settings. For most installations, the included file should work out of the box.

On Linux hosts you can add replicator as a system service that will start and stop automatically, using the chkconfig command, as shown in the following example:

ln -s /opt/tungsten/tungsten-replicator/bin/replicator \
    /etc/init.d/replicator
chkconfig --add replicator

If you are using the tungsten account as recommended, you should edit the replicator script and change the RUN_AS_USER to the correct account.

The replicator script is a replacement for the trepstart and trepstop commands. The replicator commands are summarized below:

• replicator start - This command starts the Tungsten Replicator service if it is not already running. Logs are written to the log file replicator.log.

• replicator status - This command prints out the status of the Tungsten Replicator service, namely whether it is running and, if it is, under which process number.

• replicator stop - This command stops the Tungsten Replicator service if it is currently running.

• replicator restart - This command restarts the Tungsten Replicator service, stopping it first if it is currently running.

• replicator console - This command runs the Tungsten Replicator service in a Java console program that allows you to view log output in a GUI shell.

• replicator dump - This command sends a 'kill -quit' signal to the Java VM to force it to write a thread dump to the log. This command is useful for debugging a stuck process.

Note

For maximum ease of use, set the replicator.auto_enable property to true so the replicator will go online automatically on start or restart. This allows you to start the replicator very quickly, for example using replicator restart.

10.3. Controlling a Running Tungsten Replicator Process

The commands in the sections below change Tungsten Replicator state.

Controlling a Running Tungsten Replicator Process in Linux, Solaris and Mac OS X

The trepctl script allows you to submit commands to Tungsten Replicator. These commands change the Tungsten Replicator state. The general syntax is as follows:

trepctl [global_options] command [command_options]

The following global options are supported.

• -host host - Specifies the replicator host. Defaults to "localhost".

• -port port - Specifies the replicator port. Defaults to 10000.

• -verbose - Prints verbose error messages on failures.

Commands and their options are described below.

• trepctl backup [-backup backupAgent] [-storage storageAgent] [-limit]

Starts a backup of the database and stores the resulting backup file in a designated storage location. The backup command returns the URI of the backup. This command may only be executed when Tungsten Replicator is in the OFFLINE state.

Backup command behavior can be altered by the following optional parameters:

• -backup - Provides the name of a backup agent defined in replicator.properties. If omitted Tungsten Replicator will use the default backup agent name. If no default is specified, the backup will fail.

• -storage - Provides the name of a storage agent defined in replicator.properties. If omitted Tungsten Replicator will use the default storage agent name. If no default is specified, the backup will fail.

• -limit - Specifies the amount of time in seconds to wait for the backup to complete before returning. If not specified the command will wait until the backup finishes.

• trepctl clear

This command clears dynamic properties. It must be entered from the OFFLINE state.

• trepctl configure [file]

This command refreshes the Tungsten Replicator properties. The configure command may only be used when the node is in the OFFLINE state. If the file is omitted the replicator process will reread its own static properties file followed by any currently set dynamic properties.

Tungsten Replicator reads the conf/replicator.properties file by default. An alternative property file can be specified by giving the configuration property file name as an argument.

• trepctl flush

Synchronizes the state of the log with the state of the database. Flush inserts a heartbeat into the master database and returns the sequence number of that heartbeat event in the transaction history log.

This command is used for planned failover as it allows users to ensure that the transaction history log contains all master updates and also provides the sequence number that a slave must reach before it can safely be promoted to a master. The flush command may only be used when the replicator process is in the ONLINE:MASTER state.

• trepctl heartbeat [-name name]

The heartbeat command inserts a SQL update into the heartbeat table, which is automatically maintained in all master and slave replicators. The master update contains a timestamp showing when the update arrived in the master database. When the heartbeat arrives on the slave, Tungsten Replicator automatically adds another timestamp and computes latency since the first update on the master.

This command is convenient for assessing true end-to-end latency as it forces an update to be processed. The heartbeat command may only be used when the replicator process is in the ONLINE:MASTER state.

The heartbeat command accepts the following flag.

• -name - Names the heartbeat. If omitted the heartbeat name is "DEFAULT".

• trepctl kill [-y]

Kills the replicator process immediately without any clean-up of current activities. This command can be used to terminate a wedged replicator or as part of replication tests. For all other situations the 'stop' command is preferred.

Normally users will be prompted to be sure they wish to kill the replicator. The -y option suppresses the prompt. This command may be issued in any replicator state.

• trepctl offline

This command puts the replicator into the OFFLINE state if it is not there already. This command must be used to stop the replicator cleanly. It is also required when reconfiguring a slave or master as part of a failover procedure. It may only be issued in the SLAVE, SYNCHRONIZING, or MASTER state.

• trepctl offline-deferred [-seqno seqno] [-event event_id] [-heartbeat [name]] [-time YYYY-MM-DD_hh:mm:ss]

This command inserts a request to put the replicator offline once it reaches a particular point. If the replicator has already reached the specified point, it will go offline immediately. Otherwise the effect of this command is identical to the normal offline command.

Command behavior can be altered by the following optional parameters:

• -seqno - Replicates up to the given sequence number and goes offline.

• -event - Replicates up to the given native event ID and goes offline.

• -heartbeat - Replicates up to the given heartbeat name and goes offline. If the name is left out, it defaults to "DEFAULT".

• -time - Replicates up to the given transaction time and goes offline. This option can be used for point-in-time recovery.

• trepctl online [-from-event event_id] [-skip number] [-seqno seqno] [-event event_id] [-heartbeat [name]] [-time YYYY-MM-DD_hh:mm:ss]

This command puts Tungsten Replicator into the online state. Depending on the replicator.role property the replicator will go online as a slave or a master. This command may only be issued from the OFFLINE state.

Online command behavior can be altered by the following optional parameters:

• -from-event - Provides an event ID for starting replication on masters. For example, you can start replication at a specific location as shown by the following MySQL example.

trepctl online -from-event mysql-bin.000004:34552

• -skip - Allows replication to skip applying one or more transactions. This is typically used to get around errors.

• -seqno - Replicates up to the given sequence number and goes offline.

• -event - Replicates up to the given native event ID and goes offline.

• -heartbeat - Replicates up to the given heartbeat name and goes offline. If the name is left out, it defaults to "DEFAULT".

• -time - Replicates up to the given transaction time and goes offline. This option can be used for point-in-time recovery.
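The -time option of both offline-deferred and online takes a timestamp in the form YYYY-MM-DD_hh:mm:ss. A small Python sketch for producing and checking such a value from a script follows; the helper names are illustrative, and it assumes that hh is a 24-hour value, which the syntax above does not state explicitly.

```python
from datetime import datetime

# strftime/strptime equivalent of YYYY-MM-DD_hh:mm:ss.
# Assumption: hh is a 24-hour clock value.
TREPCTL_TIME_FORMAT = "%Y-%m-%d_%H:%M:%S"


def format_trepctl_time(moment):
    """Render a datetime in the form expected by the -time option."""
    return moment.strftime(TREPCTL_TIME_FORMAT)


def parse_trepctl_time(text):
    """Parse a -time value; raises ValueError if it is malformed."""
    return datetime.strptime(text, TREPCTL_TIME_FORMAT)


stamp = format_trepctl_time(datetime(2011, 12, 1, 14, 30, 0))
print(stamp)
```

Parsing the value back before passing it to trepctl is a cheap way to catch a missing underscore or a swapped field in a recovery script.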

• trepctl restore [-uri backupUri] [-limit]

Restores a backup of the database, where the backup is a file stored in configured backup storage. The restore command prints a message indicating whether the restore task completed successfully. The restore command may only be executed when Tungsten Replicator is in the OFFLINE state.

Restore command behavior can be altered by the following optional parameters:

• -uri - Provides the URI of a backup to load from storage. If not specified the restore command will load the last available backup in the default storage facility. If no such backup exists, the restore will fail.

• -limit - Specifies the amount of time in seconds to wait for the restore to complete before returning. If not specified the command will wait until the restore finishes.

• trepctl setrole -role name -uri uri

This command sets a dynamically settable property value. The -role argument provides the name of the role and must correspond to a replicator pipeline name such as master or slave. The -uri argument provides the master URI and is required for replicators that act in a slave capacity. This command may only be issued when in the OFFLINE state.

• trepctl status

Prints full state and monitoring data for the replicator. This command may be issued from any state.

• trepctl stop

This command causes the replicator process to clean up and exit. It may only be issued from the START or OFFLINE state.

• trepctl wait [-state state] [-limit seconds]

Waits for the replicator to reach a particular state, returning immediately if the replicator is in this state. This command also returns with an error if the replicator goes into the error state. This command may be issued in any replicator state.

The state name must be a fully qualified name like OFFLINE:NORMAL. You can specify a parent state such as OFFLINE, in which case the command will return when the replicator reaches OFFLINE or any substate. The optional -limit parameter provides a number of seconds to wait. It must have a value between 0 and 1800 seconds. The limit defaults to 1800 if you do not enter a value.
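The parent and substate matching and the limit rule described for trepctl wait -state can be modelled as follows. This Python sketch only illustrates the rules as stated in the text; it is not how trepctl is implemented, the function names are hypothetical, and it assumes that the 0 to 1800 range is inclusive.

```python
DEFAULT_WAIT_LIMIT = 1800  # seconds, the documented default
MAX_WAIT_LIMIT = 1800      # seconds, the documented upper bound


def state_matches(current, wanted):
    """True if current equals wanted or is a substate of wanted.

    A parent state such as OFFLINE matches OFFLINE:NORMAL and OFFLINE:ERROR.
    """
    return current == wanted or current.startswith(wanted + ":")


def resolve_limit(limit=None):
    """Apply the default and range check for the -limit value."""
    if limit is None:
        return DEFAULT_WAIT_LIMIT
    # Assumption: both ends of the range are allowed.
    if not 0 <= limit <= MAX_WAIT_LIMIT:
        raise ValueError("limit must be between 0 and 1800 seconds")
    return limit


print(state_matches("OFFLINE:ERROR", "OFFLINE"))
print(state_matches("ONLINE:SLAVE", "OFFLINE"))
print(resolve_limit())
```

Note that waiting on the parent state OFFLINE also matches OFFLINE:ERROR, so a script that needs a clean shutdown should wait on OFFLINE:NORMAL instead.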

• trepctl wait [-applied seqno] [-limit seconds]

Waits for the replicator to apply a particular sequence number on a slave. This command is only valid on slave replicator processes and may only be entered when Tungsten Replicator is in the ONLINE:SLAVE state.

The command returns when the sequence number is applied or the timeout expires. Timeout behavior is the same as for the trepctl wait -state command.

• trepctl

Run the trepctl command without arguments to check the Tungsten Replicator state.

Controlling a Running Tungsten Replicator Process in Windows

The trepctl.bat script allows you to submit commands to the Replicator. These commands change the Tungsten Replicator state. The commands are identical to those provided for the Unix trepctl commands.

10.4. Replicator THL Utility

The Replicator THL Utility (thl) allows users to view and manipulate events in the Transaction History Log. Events are serialized into a platform-independent format that is not human-readable. The thl utility not only prints events in an easy-to-read format but also supports operations to purge or skip THL events.

The thl utility resides in the bin directory. The command syntax is as follows:

thl [global-options] command [command-options]

The following sections provide details of commands and options.

Warning

The thl utility must be used with caution. Purging or skipping events carelessly can lead to data inconsistencies between replicas or cause replicators to fail. Observe all caveats given below.

Important

The thl utility is under development and not all commands documented in this manual are fully functional yet. Any such lacunae will be filled shortly.

10.4.1. THL Utility Global Options

The thl utility supports the following global option(s):

• -conf path

Path to the replicator.properties file. The value defaults to conf/replicator.properties in the Tungsten Replicator release directory.

10.4.2. THL Utility Commands

Each thl invocation contains a command that specifies the management operation to perform.

• help

Print help information about thl commands.

• list

Display summary information about the THL including the minimum and maximum sequence numbers.

• list [-low #] [-high #] [-by #] [-sql]

Dump one or more events between the low and high sequence numbers, paging by the number of events given in the -by option. The low and high values default to the beginning and end of the THL respectively.

• list [-seqno #] [-sql]

Dump a specific event with the indicated sequence number.

• purge [-low #] [-high #] [-age time] [-y]

Deletes events within the given range. This operation is useful for clearing the log to prevent it from building up over time. The purge command prints a prompt before continuing. The prompt can be suppressed with the -y option.

The -age option takes a time argument which consists of a number and a time unit (s, m, h, d). For example, purge -age 3d deletes all events more than three days old.

Warning

Use caution when purging events. You should not purge events until they have reached all slaves. Also, you must leave at least one event in the THL at all times or replication may fail to restart if you turn Tungsten Replicator off.

• purge [-seqno #] [-y]

Deletes a specific THL event. Caveats are identical to the more general form given previously.

• skip [-low #] [-high #] [-y]

Marks a range of events to be skipped. This function allows users to skip over events that may be causing errors.

Warning

Use caution when skipping events. Skipping an event can lead to data inconsistencies between replicas. This command is normally used to get around errors that cause replicator failures.

• skip [-seqno #] [-y]

Skips a specific THL event. Caveats are identical to the more general form given previously.
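The time argument accepted by purge -age, a number followed by one of the units s, m, h, or d, can be converted to seconds with a short helper, for instance when a maintenance script needs to check a retention period before calling thl. The Python sketch below is an illustration only; it assumes a whole number with no space before the unit, and the function name is hypothetical.

```python
import re

# Seconds per unit for the documented units s, m, h, d.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def age_to_seconds(age):
    """Convert an age such as '3d' into a number of seconds."""
    # Assumption: a whole number immediately followed by a single unit letter.
    match = re.fullmatch(r"(\d+)([smhd])", age)
    if match is None:
        raise ValueError("expected a number followed by s, m, h, or d: " + age)
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS[unit]


print(age_to_seconds("3d"))  # 259200
```

A script might refuse to purge when the computed age is shorter than the longest slave outage it needs to tolerate, in line with the warning about not purging events before they reach all slaves.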

Chapter 11. Extending the Tungsten Replicator System

The Tungsten Replicator implements appliers, extractors, filters, and stores as plug-ins. Users can write their own plug-ins to add specialized capabilities to the Tungsten Replicator, such as supporting new databases, replicating from databases into applications or message processing systems, or performing custom transformations on replicated data.

Such plug-ins apply only to native Tungsten replication, which is itself an implementation of the OpenReplicatorPlugin interface. Open replicator plugins implement the entire replication mechanism and follow different conventions from Tungsten native plug-ins. For more information on writing open replicator plugins, see Appendix B, Tungsten Open Replicator Specification.

This chapter describes how native Tungsten plug-ins work and provides guidelines for writing new plug-ins.

11.1. The ReplicatorPlugin Interface

All replicator plug-ins derive from the ReplicatorPlugin interface. This interface defines the general contract for all plug-ins. Each specific type of replicator plug-in extends the ReplicatorPlugin interface. There are four main plug-in types, as shown in the following table.

Table 11.1. Replicator Plug-In Types

Interface Name | Description
Applier | Applies SQL events to a replication target
Extractor | Extracts SQL events from a source like a database recovery log
Filter | Transforms or drops a SQL event
Store | Implements event storage between stages

ReplicatorPlugin methods like configure(), prepare(), and release() accept a PluginContext as an argument. The context provides a means to issue call-backs into the replicator in a portable and efficient manner.

11.2. Replicator Plug-In Life Cycle

ReplicatorPlugin implementations have the following life cycle.

1. Configuration. Tungsten Replicator instantiates plug-ins and assigns their properties when processing a configure administrative operation. All plug-ins are configured at this time. Configuration includes the following steps.

a. Instantiation. Tungsten Replicator creates an instance of the plug-in class using its default constructor. If the plug-in cannot be instantiated, configuration fails.

b. Property Assignment. Tungsten Replicator sets the plug-in properties. The properties are mapped to setters on the plug-in instance. For example, replicator.applier.mysql.host maps to setHost(String host). If a setter fails or cannot be found, configuration fails.

c. Complete Configuration. Finally, Tungsten Replicator calls the plug-in's configure() method. This method is responsible for ensuring that configuration is complete and all required properties are correctly specified.

Warning: The configuration stage should not allocate resources.

2. Preparation. The Tungsten Replicator calls the plug-in prepare() method to allow the plug-in to allocate resources for operation. At this point the plug-in should log in to databases, open files, and perform other operations to get ready for actual work.

Appliers and post-storage filters are prepared when Tungsten Replicator goes into the SLAVE state. Extractors and pre-storage filters are prepared when Tungsten Replicator goes into the MASTER state.

3. Release. The Tungsten Replicator calls the plug-in release() method when the plug-in is about to be de-allocated. The plug-in is responsible for cleaning up all resources at this time.

Between preparation and release the plug-in is active and handles calls specific to that plug-in type. Plug-in-specific methods are never called at any other time.
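The life cycle described in this section can be sketched in Java as follows. This is an illustration only: the PluginContext and ReplicatorPlugin types below are simplified stand-ins declared locally so that the example compiles on its own, and the real interfaces in the Tungsten source may declare different signatures and exceptions. The LifeCycleDemoPlugin and LifeCycleDemo classes are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for the real Tungsten types (assumption: the real
// interfaces may declare other methods and checked exceptions).
interface PluginContext {}

interface ReplicatorPlugin {
    void configure(PluginContext context);
    void prepare(PluginContext context);
    void release(PluginContext context);
}

// Hypothetical plug-in that records the order of its life-cycle calls.
class LifeCycleDemoPlugin implements ReplicatorPlugin {
    private String host;        // assigned through a setter
    private boolean connected;  // stands in for a real resource
    final List<String> calls = new ArrayList<>();

    public void setHost(String host) {
        this.host = host;
        calls.add("setHost");
    }

    @Override
    public void configure(PluginContext context) {
        // Validate properties only; do not allocate resources here.
        if (host == null || host.isEmpty()) {
            throw new IllegalStateException("Property 'host' is required");
        }
        calls.add("configure");
    }

    @Override
    public void prepare(PluginContext context) {
        connected = true; // a real plug-in would log in to a database or open files
        calls.add("prepare");
    }

    @Override
    public void release(PluginContext context) {
        connected = false; // clean up everything allocated in prepare()
        calls.add("release");
    }

    boolean isConnected() {
        return connected;
    }
}

class LifeCycleDemo {
    // Drives the plug-in through the stages in the order given above.
    static List<String> run(String host) {
        LifeCycleDemoPlugin plugin = new LifeCycleDemoPlugin(); // 1a. instantiation
        PluginContext context = new PluginContext() {};
        plugin.setHost(host);                                   // 1b. property assignment
        plugin.configure(context);                              // 1c. complete configuration
        plugin.prepare(context);                                // 2. preparation
        try {
            // Plug-in-specific calls (apply, extract, filter) happen here.
        } finally {
            plugin.release(context);                            // 3. release
        }
        return plugin.calls;
    }

    public static void main(String[] args) {
        System.out.println(run("localhost"));
    }
}
```

Note that a failure in configure() ends the sequence before prepare() is reached, so nothing needs to be released in that case.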

11.3. Plug-In Setter Conventions

Properties in replicator.properties are mapped to setters on the plug-in instance according to the following rules.

1. Any prefix on the property name is removed so that a property like replicator.extractor.mydbms.foo_bar becomes simply foo_bar.

2. The first letter of the property is capitalized, so foo_bar becomes Foo_bar.

3. If an underscore ("_") occurs in the property name, it is omitted and the following character, if any, is capitalized. Foo_bar becomes FooBar.

4. The prefix "set" is added to the result. FooBar becomes setFooBar.

Warning: It is tempting to process properties directly by calling PluginContext.getReplicatorProperties() rather than using setter methods. Resist this temptation. Setters are type-safe and allow Tungsten Replicator to perform automatic validity checks of property assignments. Processing properties directly can also result in complex or brittle configuration that is likely to fail if the property file format changes.
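The mapping rules in this section can be expressed as a short Java method. This is a sketch of the naming rules only, not the replicator's own code; the class name SetterNames is hypothetical, and the method assumes that the prefix is everything up to and including the last dot in the property name.

```java
final class SetterNames {
    // Applies the four mapping rules to a full property name.
    static String toSetterName(String property) {
        // Rule 1: remove the prefix (assumed to end at the last dot).
        String name = property.substring(property.lastIndexOf('.') + 1);
        StringBuilder result = new StringBuilder("set"); // Rule 4: add "set"
        boolean capitalizeNext = true;                   // Rule 2: first letter
        for (char c : name.toCharArray()) {
            if (c == '_') {
                capitalizeNext = true; // Rule 3: drop "_" and capitalize what follows
            } else if (capitalizeNext) {
                result.append(Character.toUpperCase(c));
                capitalizeNext = false;
            } else {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSetterName("replicator.extractor.mydbms.foo_bar"));
        System.out.println(toSetterName("replicator.applier.mysql.host"));
    }
}
```

With the two property names used as examples in this chapter, the method returns setFooBar and setHost.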

11.4. Logging from Plug-Ins

The Tungsten Replicator uses Log4j to generate log messages. Plug-in authors can assume that Log4j is always available in the class path. The convention in the Tungsten Replicator is to use the class name rather than artificial log names when writing log messages.

11.5. Advice on Writing Plug-Ins

To write your own plug-in, create a Java class that implements one of the plug-in interfaces and follows the life cycle as well as the conventions for setting properties. Build your plug-in into a JAR file and place it in the Tungsten Replicator lib-ext directory along with any other JAR files on which your plug-in depends.

Serious plug-in development is easiest if you have access to the Tungsten Replicator source. Download the source code and build the replicator yourself. This makes debugging easier, as you can see the full context of calls to your plug-in. Also, you can use existing plug-ins as examples for your own plug-in development.

Plug-ins can be debugged using any Java IDE that supports remote debugging. Most Tungsten development is done using Eclipse, which has excellent remote debugging support. You can enable remote debugging by uncommenting lines in the Tungsten Replicator start scripts.

Appendix A. Tungsten Replicator Catalogs

Tungsten Replicator uses the database schema described in the table descriptions below.

A.1. consistency

The consistency table is described in the table below:

Table A.1. consistency Table

Field | Type | Notes | Description
db | char(64) | Primary Key | -
tbl | char(64) | Primary Key | -
id | int(11) | Primary Key | -
this_crc | char(40) | - | -
this_cnt | int(11) | - | -
master_crc | char(40) | - | -
master_cnt | int(11) | - | -
ts | timestamp | Defaults to current timestamp | -
command | text | - | -

A.2. heartbeat

The heartbeat table is described in the table below:

Table A.2. heartbeat Table

Field | Type | Notes | Description
id | bigint(20) | Primary Key | Unique ID of this heartbeat
seqno | bigint(20) | - | Sequence number of heartbeat
eventid | varchar(32) | - | Event ID
source_tstamp | timestamp | - | Time of heartbeat
target_tstamp | timestamp | - | Heartbeat arrival time on slave
lag_millis | bigint(20) | - | Lag in milliseconds
salt | bigint(20) | - | Salt value to distinguish otherwise identical heartbeats
name | varchar(132) | - | Heartbeat name

A.3. history

The history table is described in the table below:

Table A.3. history Table

Field | Type | Notes | Description
seqno | bigint(20) | Primary Key | Unique transaction ID
fragno | smallint | - | Fragment number for transactions that are broken into multiple events
last_frag | char(1) | - | Marker for last fragment in the transaction
source_id | varchar(32) | - | Source ID of master that produced this event
type | tinyint | - | Event type (0=DBMS Update, 1=Start Master, 2=Stop Master, 3=Heartbeat)
epoch_number | bigint(20) | - | Unique number that increments following each Start Master event
source_tstamp | timestamp | - | When event was extracted from master log
local_enqueue_tstamp | timestamp | - | When event was stored in the slave THL
processed_tstamp | timestamp | - | When event was processed on slave
Status | timestamp | - | Event status (0=Pending, 1=In Process, 2=Completed, 3=Failed, 4=Skip, 5=Skipped)
comments | varchar(128) | - | Error message associated with processing, if any
eventId | varchar(16) | KEY | -
event | longblob | - | Serialized replication event (binary data)

A.4. trep_commit_seqno

The trep_commit_seqno table is described in the table below:

Table A.4. trep_commit_seqno Table

Field | Type | Notes | Description
task_id | bigint(20) | Primary Key | Apply task ID
seqno | bigint(20) | - | Last event sequence number processed by task
fragno | smallint(6) | - | Last event fragment number processed by task
last | char(1) | - | If 1, fragment was last in transaction
source_id | varchar(128) | - | Source ID of replicator that extracted this event
epoch_number | bigint(20) | - | Epoch number of last event
eventid | varchar(128) | - | Native event ID of last event
applied_latency | int(11) | - | Event applied latency in seconds

Appendix B. Tungsten Open Replicator Specification

This appendix describes details of the Tungsten Open Replicator architecture necessary to create plugins.

Note: The architecture described in this appendix is provisional and subject to change. Plug-in implementers should consult with Tungsten Replicator core developers before embarking on implementations as interfaces may change unexpectedly.

B.1. Tungsten Open Replicator Plugin Interface

Replicator plugins implement Java class com.continuent.tungsten.replicator.management.OpenReplicatorPlugin. Specification for this interface is provided in the Javadoc accompanying the class. Replicator plug-in implementations follow bean conventions used elsewhere in Tungsten software, including instantiation using a default constructor and assigning properties using setter methods.

Plugin loading is controlled by property replicator.plugin. The value of this property points to the name of a plugin class, which may be assigned property values as shown in the following example.

# Select a replicator plugin.
replicator.plugin=tungsten

# Native Tungsten replication plugin.
replicator.plugin.tungsten=com.continuent.tungsten.replicator.management.\
tungsten.TungstenPlugin

# Script plugin for external replication.
replicator.plugin.script=com.continuent.tungsten.replicator.management.\
script.ScriptPlugin
replicator.script.root_dir=/opt/postgres/plugin
replicator.script.conf_file=conf/postgresql-wal.properties
replicator.script.processor=bin/pg-wal-plugin

B.2. Tungsten Replicator Plugin Implementation

B.2.1. Overview

The Tungsten Replicator native implementation is located in Java class com.continuent.tungsten.replicator.management.tungsten.TungstenPlugin. This class implements native Tungsten replication and is the default plugin implementation.

B.2.2. Configuration Properties

The native Tungsten plugin currently does not expose any configuration properties.

B.3. Script Replication Plugin Implementation

B.3.1. Overview

The script plugin implementation is located in class com.continuent.tungsten.replicator.management.script.ScriptPlugin. The script plugin allows Tungsten Replicator to manage non-Tungsten replication mechanisms. The script plugin calls a program (usually a script, hence the name) at specific points in replicator state machine transitions. The script must implement a specific command syntax defined later in this section.

B.3.2. Configuration Properties

The script plugin has three configuration properties used to locate the script and configuration data.

Table B.1. Script Plugin Properties

Property Name | Description | Syntax | Required | Default Value
root_dir | Root directory for script files | File | Yes | none
conf_file | Relative location from the root_dir to a configuration file that will be passed to the script | Relative path | Yes | none
processor | Relative location from the root_dir to an executable script that implements the script plugin protocol | Relative path | Yes | none

B.3.3. Script Syntax Reference

The script plugin calls a replication script to perform replication operations. This section defines the parameters that the script must accept and the behavior the replication script must provide.

In general, the script call appears as follows.

script --config file --operation command [--in-params ...] [--out-params file]

The --in-params and --out-params values are optional arguments.

The Tungsten Script Plugin specifies a --config parameter with each operation it calls. The replication script thus does not need to store the configuration reference, but can reference the current configuration while processing each operation.
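As an illustration of the call syntax above, the following Java sketch assembles the argument list for one script invocation. It is not the ScriptPlugin implementation; the class name ScriptCommand is hypothetical, the script and configuration paths are taken from the sample properties in Section B.1, and the output file name is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class ScriptCommand {
    // Builds: script --config file --operation command [--in-params ...] [--out-params file]
    // Pass null for inParams or outParamsFile to leave the optional argument out.
    static List<String> build(String script, String configFile, String operation,
                              String inParams, String outParamsFile) {
        List<String> command = new ArrayList<>();
        command.add(script);
        command.add("--config");
        command.add(configFile);      // passed with every call
        command.add("--operation");
        command.add(operation);
        if (inParams != null) {
            command.add("--in-params");
            command.add(inParams);    // one argument, so no shell quoting is needed
        }
        if (outParamsFile != null) {
            command.add("--out-params");
            command.add(outParamsFile);
        }
        return command;
    }

    public static void main(String[] args) {
        // Paths from the sample configuration; /tmp/status.out is a hypothetical name.
        List<String> command = build(
            "/opt/postgres/plugin/bin/pg-wal-plugin",
            "/opt/postgres/plugin/conf/postgresql-wal.properties",
            "status", null, "/tmp/status.out");
        System.out.println(command);
    }
}
```

A list built this way can be handed to java.lang.ProcessBuilder to start the script.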

The following sections describe the syntax of plugin command line arguments.

• --config

Specifies the configuration file for the replication script. This file is supplied by the replication script provider. Tungsten does not read the file and only passes its location to the script. The reference is passed with every script call as a convenience for the replication script provider.

• --operation

This option specifies the operation for the script. The operation is one of the following.

• prepare - Prepare the script for use within the script plugin. This operation is called one time during plugin initialization.

• release - Release static resources. Called one time during script plugin shutdown.

• online - Turn on replication.

• offline - Turn off replication.

• provision - Start node provisioning. The direction of provisioning can be either way, and this is documented in replicator capabilities. The provision operation takes an optional uri argument that specifies the replicator to or from which provisioning occurs.

• status - Write status information to the output file. Status must be organized as parameter-value pairs. Some status parameters are obligatory and others are optional. Optional status parameters can be anything that the replicator provider considers useful for the DBA when monitoring cluster operation. Obligatory status variables are defined in Section B.3.4, Status Variables.

• flush - Flush the master replication queue and give back the event ID of the last replicated event. The flush operation has --out-params file defined and the script must fill in status parameter last-sent in this output file.

• waitevent - Wait until the slave/standby node has applied the given event. The waitevent operation has two parameters: event is the event to wait for, and timeout is the timeout for the wait.

• setrole - Set the role of the replicator. The setrole command has two input parameters. role specifies the new role for the replicator, which can be one of "master", "slave", or "standby". uri is the URI of a replicator.

• capabilities - Print out the capabilities of the replicator. The --out-params argument specifies the file to which the capability list is written. Capability variables are defined in Section B.3.5, Capabilities.

• --in-params

Parameters are defined as a ';'-separated list of parameter=value pairs. The full parameter list is enclosed in quotes: --in-params "param1=value1 ; param2=value2 ; ...".

Operations can use separate parameters, as these are defined for each operation individually.

• --out-params

Name of the output file. The script should write all output to this file. Some commands require the output to be organized as a parameter file containing a list of parameter-value pairs.
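A replication script written in Java could decode the --in-params value along the following lines. This is a sketch under stated assumptions: the class name InParams is hypothetical, whitespace around the ';' and '=' separators is treated as insignificant, and values are assumed not to contain ';'. The event and timeout names come from the waitevent operation; the values are invented for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class InParams {
    // Splits "param1=value1 ; param2=value2" into an ordered map.
    static Map<String, String> parse(String inParams) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : inParams.split(";")) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate a trailing separator
            }
            int eq = trimmed.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException(
                    "Expected parameter=value but found: " + trimmed);
            }
            String name = trimmed.substring(0, eq).trim();
            String value = trimmed.substring(eq + 1).trim();
            result.put(name, value);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical values for the waitevent parameters.
        System.out.println(parse("event=42 ; timeout=30"));
    }
}
```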

B.3.4. Status Variables

The following list describes status variables logged in the output file specified by the --out-params flag. All variables are required for the status operation. Other operations may need to log particular variables depending on their specification.

• state - Tells the current state of the replicator. Value is one of: offline, offline-error, offline-configuring, offline-normal, goingonline, goingonline-restoring, goingonline-syncronizing, online, goingoffline.

• role - Current role: master, slave, standby.

• errmsg - Free form error message of last pending error

• last-sent - ID of last sent replication event

• last-applied - ID of last applied replication event

• last-received - ID of last received replication event

• apply-latency - Latency in seconds between generating a master event and applying it on the slave
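The following Java sketch shows one way a script could check and render these variables before writing them to the --out-params file. It assumes one parameter=value pair per line, which this appendix states for the capability file but not explicitly for status output. The class name StatusOutput and the sample values are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class StatusOutput {
    // Variables this section lists as required for the status operation.
    static final List<String> REQUIRED = Arrays.asList(
        "state", "role", "errmsg", "last-sent",
        "last-applied", "last-received", "apply-latency");

    // State values, spelled as they appear in this section.
    static final List<String> STATES = Arrays.asList(
        "offline", "offline-error", "offline-configuring", "offline-normal",
        "goingonline", "goingonline-restoring", "goingonline-syncronizing",
        "online", "goingoffline");

    // Returns the file contents, or throws if a variable is missing or invalid.
    static String render(Map<String, String> status) {
        StringBuilder out = new StringBuilder();
        for (String name : REQUIRED) {
            String value = status.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Missing status variable: " + name);
            }
            out.append(name).append('=').append(value).append('\n');
        }
        if (!STATES.contains(status.get("state"))) {
            throw new IllegalArgumentException("Unknown state: " + status.get("state"));
        }
        return out.toString();
    }

    // Hypothetical values for a healthy master.
    static Map<String, String> sample() {
        Map<String, String> status = new LinkedHashMap<>();
        status.put("state", "online");
        status.put("role", "master");
        status.put("errmsg", "");
        status.put("last-sent", "1001");
        status.put("last-applied", "1001");
        status.put("last-received", "1001");
        status.put("apply-latency", "0");
        return status;
    }

    public static void main(String[] args) {
        System.out.print(render(sample()));
    }
}
```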

B.3.5. Capabilities

As output of the capabilities operation, the script must write the list of supported capability parameters to the specified output file. The capability file must contain lines with parameter=value pairs.

The capability parameters are shown below.

• roles - A list of accepted roles for the replicator. Acceptable roles are: master, slave, and standby.

• model - replicator model is one of:

• push - master sends replications events to slaves

• pull - slaves poll master for events

• peer - multi-master replication

• consistency - is consistency checking supported: true or false

• heartbeat - is heartbeat operation supported: true or false

• flush - is flush operation supported: true or false

• provision - provision direction, one of:

• donor - donor will send DB state to the joining node

• joiner - joining node will ask donor for the DB state
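On the reading side, a capability file in this parameter=value format can be loaded with java.util.Properties. The sketch below is illustrative only: the class name Capabilities is hypothetical, the sample file contents are invented, and the comma used to separate roles is an assumption because this section does not define the list separator.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class Capabilities {
    private final Properties values = new Properties();

    private Capabilities() {}

    // Loads parameter=value lines as written by the capabilities operation.
    static Capabilities parse(String fileContents) {
        Capabilities caps = new Capabilities();
        try {
            caps.values.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return caps;
    }

    // True only if the flag (consistency, heartbeat, flush) is present and "true".
    boolean supports(String flag) {
        return Boolean.parseBoolean(values.getProperty(flag, "false"));
    }

    // Returns push, pull, or peer, or null if the file does not say.
    String model() {
        return values.getProperty("model");
    }

    // Returns donor or joiner, or null if the file does not say.
    String provisionDirection() {
        return values.getProperty("provision");
    }

    public static void main(String[] args) {
        // Hypothetical capability file; the comma in "roles" is an assumption.
        Capabilities caps = parse(
            "roles=master,slave\n"
            + "model=push\n"
            + "consistency=false\n"
            + "heartbeat=true\n"
            + "flush=true\n"
            + "provision=donor\n");
        System.out.println("model=" + caps.model() + " flush=" + caps.supports("flush"));
    }
}
```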

Appendix C. Tungsten Replicator Plug-ins

This appendix describes plugins native to the Tungsten Replicator implementation. These are available only in the Tungsten Replicator. Other replication types managed through the open replicator do not use these.

Note: Plug-in properties use short names. When actually using a property you must prefix it as described in Section 3.2.1, “The replicator.properties File”.

C.1. Transaction History Log (THL) Storage

THL storage is implemented using a single generic plug-in.

C.1.1. THL JDBC Storage Plug-In

This plug-in implements a generic JDBC storage for plug-ins. It requires a JDBC URL and has identical properties regardless of the database.

Table C.1. JDBC THL Storage Implementation Class

Java Implementation Class
com.continuent.tungsten.replicator.thl.JdbcTHLStorage

Table C.2. JDBC THL Properties

Property Name | Description | Syntax | Required | Default Value
password | Password of the login used for THL | String | Yes | none
storage | THL storage implementation name. Must be set to the THL storage class name given above. | Class name | Yes | none
url | JDBC URL to connect to THL database | String | Yes | none
user | Login to THL database | String | Yes | none

C.2. Extractors

The extractor plug-ins are described in the sections below.

C.2.1. MySQL Extractor

The MySQL extractor extracts SQL events from the MySQL binlog. It supports both statement and row events. It is designed to support MySQL 4.1 and above. The MySQL extractor requires read-only access to the MySQL binlog directory. It also requires an administrative login to the MySQL server capable of running commands like FLUSH LOGS and SHOW MASTER STATUS.

Table C.3. MySQL Extractor Implementation Class

Java Implementation Class
com.continuent.tungsten.replicator.extractor.mysql.MySQLExtractor

Table C.4. MySQL Extractor Properties

Property Name | Description | Syntax | Required | Default Value
binlog_dir | Location of mysql binlogs (MySQL data directory) | Directory | No | /var/log/mysql
binlog_file_pattern | Pattern of mysql binlog files (must match MySQL log_bin parameter) | String | No | mysql-bin
host | MySQL server host name | String | No | localhost
password | Password to administrative login | String | No | Empty String ("")
port | MySQL server TCP/IP port | Integer | No | 3306
strictVersionChecking | Whether replicator should print warning for unsupported MySQL version | Boolean | No | true
user | Administrative login to MySQL server | String | No | root

C.3. Filters

The filter plug-ins are described in the sections below.

C.3.1. Case Mapping Filter

Transforms database, table and column names into upper or lower case. In the case of replicated statements, ittransforms all values outside quoted strings.

Table C.5. Case Mapping Filter Implementation Class

Java Implementation Class
com.continuent.tungsten.replicator.filter.CaseMappingFilter

Table C.6. Case Mapping Filter Properties

Property Name | Description | Syntax | Required | Default Value
toUpperCase | If true, transform to upper case; otherwise to lower case | Boolean | No | false (to lower case)

C.3.2. Database Transform Filter

The Database Transform filter maps the default database name stored on a SQL event to a new name using regular expressions. Any name that matches the fromRegex expression is mapped using the toRegex expression. The rules for matching are based on Java regular expressions as implemented in the Java Pattern and Matcher classes. Transforms work according to the Java Matcher.replaceAll() method.

Table C.7. Database Transform Filter Implementation Class

Java Implementation Class
com.continuent.tungsten.replicator.filter.DatabaseTransformFilter

Table C.8. Database Transform Filter Properties

Property Name | Description | Syntax | Required | Default Value
fromRegex | Java regular expression that matches against database names | String | Yes | None
toRegex | Transformation regular expression used to change matching Java database names | String | Yes | None
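The effect of fromRegex and toRegex can be tried outside the replicator with the standard Pattern and Matcher classes. The sketch below only illustrates Matcher.replaceAll() semantics and is not the filter's source code; the class name DatabaseRename and the sample expressions are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DatabaseRename {
    // Rewrites a database name; names that do not match are returned unchanged.
    static String transform(String database, String fromRegex, String toRegex) {
        Matcher matcher = Pattern.compile(fromRegex).matcher(database);
        // In the replacement, $1, $2, ... refer to capture groups of fromRegex.
        return matcher.replaceAll(toRegex);
    }

    public static void main(String[] args) {
        // Hypothetical rule: map every prod_* database to test_*.
        System.out.println(transform("prod_sales", "^prod_(.*)$", "test_$1"));
        System.out.println(transform("inventory", "^prod_(.*)$", "test_$1"));
    }
}
```

Here prod_sales becomes test_sales, while inventory is left as it is because the expression does not match. Anchoring fromRegex with ^ and $ avoids rewriting only part of a name.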

C.3.3. Logger Filter

The logger filter logs SQL events by calling their toString() methods. The result is written to the log4j log.

Table C.9. Logger Filter Implementation Class

Java Implementation Class
com.continuent.tungsten.replicator.filter.LoggingFilter

Table C.10. Logger Filter Properties

Property Name | Description | Syntax | Required | Default Value
- | - | - | - | -

C.3.4. MySQL Session Support Filter

This filter automatically adds session IDs to statements extracted from the MySQL binlog so that temporary tables can be correctly replicated in MySQL statement replication. Without this filter slaves may encounter DDL errors or make inconsistent updates if applications use temporary tables.

Table C.11. MySQL Session Support Filter Implementation Class

Java Implementation Class
com.continuent.tungsten.replicator.filter.MySQLSessionSupportFilter

Table C.12. MySQL Session Support Filter Properties

Property Name | Description | Syntax | Required | Default Value
- | - | - | - | -

C.3.5. Time-Delay Filter

The Time-Delay filter implements simple time-delayed replication. The filter delays event application by a user-configurable period of time measured in seconds. Event application is blocked until the current time is equal to or greater than the time at which the event was extracted plus the delay. For this to work, master and slave hosts should use Network Time Protocol (NTP) to synchronize clocks.

Warning: The Time-Delay filter should only be used on slaves. It delays storage of events in the Transaction History Log (THL). On masters delaying events can lead to data loss.

Table C.13. Time Delay Filter Implementation Class

Java Implementation Class
com.continuent.tungsten.replicator.filter.TimeDelayFilter

Table C.14. Time Delay Filter Properties

Property Name | Description | Syntax | Required | Default Value
delay | Time delay in seconds | Integer | No | 0 (No delay)
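The blocking rule used by the Time-Delay filter amounts to a small calculation, sketched below in Java. This is an illustration of the rule as stated in this section, not the TimeDelayFilter source; the class and method names are hypothetical, and timestamps are assumed to be milliseconds since the epoch.

```java
final class TimeDelay {
    // Milliseconds still to wait before an event may be applied; 0 if it is due.
    static long remainingMillis(long extractedMillis, int delaySeconds, long nowMillis) {
        long releaseAt = extractedMillis + delaySeconds * 1000L;
        return Math.max(0L, releaseAt - nowMillis);
    }

    // Blocks until the current time reaches extraction time plus the delay.
    static void waitUntilDue(long extractedMillis, int delaySeconds)
            throws InterruptedException {
        long remaining = remainingMillis(
            extractedMillis, delaySeconds, System.currentTimeMillis());
        while (remaining > 0) {
            Thread.sleep(remaining);
            remaining = remainingMillis(
                extractedMillis, delaySeconds, System.currentTimeMillis());
        }
    }

    public static void main(String[] args) throws InterruptedException {
        long extracted = System.currentTimeMillis();
        waitUntilDue(extracted, 1); // returns about one second later
        System.out.println("event released");
    }
}
```

Because the extraction time comes from the master and the current time from the slave, any clock difference between the hosts adds to or subtracts from the effective delay, which is why clock synchronization matters.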

C.4. Appliers

The applier plug-ins are described in the sections below.

C.4.1. MySQL Applier

The MySQL Applier is a custom applier that applies SQL events to MySQL databases. The MySQL applier supports application of statement and row events on all MySQL versions. The MySQL applier requires an administrative login to the MySQL server capable of running any SQL DDL statement or update that is replicated.

Table C.15. MySQL Applier Implementation Class

Java Implementation Class
com.continuent.tungsten.replicator.applier.MySQLApplier

Table C.16. MySQL Applier Properties

Property Name | Description | Syntax | Required | Default Value
host | MySQL server host name | String | No | localhost
password | Password to administrative login | String | No | Empty String ("")
port | MySQL server TCP/IP port | Integer | No | 3306
user | Administrative login to MySQL server | String | No | root