A Beginners Guide to MySQL Replication Part 4: Using GTID-based Replication

This article is part of Aisha Bukar's 6 part series: A Beginners Guide to MySQL Replication. The entries include:

Welcome back to another replication series! As a quick reminder, we explored various methods of using MySQL’s replication capabilities in our previous discussions. Initially, we employed the traditional binary-log-based replication approach to set up our replication servers. This involved tracking the binary log file and its positions to facilitate replication.

In this article, we will dive into a more recent and acceptable approach to creating replication – using the Global Transaction Identifiers (GTID) based replication.

WHY Global Transaction Identifiers (GTIDs)?

GTIDs (Global Transaction Identifiers) provide a way to uniquely identify and track transactions across a distributed database environment. Each server assigns a unique GTID to every transaction, which greatly makes managing and coordinating data replication between multiple MySQL servers super easy. Here are some reasons why we use GTIDs in MySQL:

1. Unique Identification

GTIDs offer a distinct and globally unique identification for each transaction within a distributed database environment, as long as the transaction was recorded in the binary log. If a transaction is not written to the binary log, such as a read-only transaction, the system won’t assign it a GTID. This uniqueness guarantees that every transaction has a one-of-a-kind identifier, simplifying the process of monitoring and handling data modifications.

2. Data Consistency

GTIDs play a vital role in upholding data consistency across distributed databases. It ensures that a transaction committed on the source server cannot be duplicated on the replica, guaranteeing that transactions are applied in the precise order they were executed. As a result, data synchronization remains accurate and consistent across all servers involved.

3. Replication-made-easy

 GTIDs make the replication process between MySQL servers much simpler. With GTID-based replication, you don’t need to manage complex binary log positions; servers can easily identify which transactions have been applied and which are pending replication.

4. Failure and Recovery

 In case of server failures or crashes, GTIDs help ensure that data is consistently replicated. After a failure is resolved, the replication process can automatically resume from the last successfully applied GTID. This helps prevent data inconsistencies.

5. Easy to track

GTID-based replication simplifies tracking the source server of any transaction. By assigning a unique GTID to each transaction, which includes the source server’s unique identifier (server UUID) and the transaction sequence number, we can easily extract and decode the GTID from the binary log of a GTID-based transaction, thereby determining the original server. This capability is invaluable for auditing, troubleshooting, and maintaining data integrity in distributed database environments.

Configuring Replication with GTIDs

Configuring replication with GTIDs in MySQL involves several steps. Below is a step-by-step guide to help you set up GTID-based replication:

1. Configure the source server

To configure each MySQL server that would be participating in the replication setup, you must assign a unique server ID, set up a binary log, enable GTID mode, and create a replication user on the source server. Here’s what you need to do:

i. Assign a unique server ID: If the server ID is not assigned or you want to change it, you can do so by adding the following line in my.cnf or my.ini file under the [mysqld] section:

Replace YOUR_UNIQUE_SERVER_UUID with a universally unique identifier (UUID).

ii. Enable GTID Mode: To start using GTIDs, you must enable GTID mode in the MySQL configuration file (my.cnf or my.ini). Add the following line under the [mysqld] section:

The option --skip-slave-start=ON is employed to delay the start of replication until the replica is fully configured. Save the configuration file and restart the MySQL server.

iii. Binary Logging: Ensure that binary logging is enabled on the source server. This is necessary for GTID-based replication to function properly.

iv. Create Replication User: As part of the configuration, you need to create a dedicated user on the source server that the replica servers will use to connect and replicate data. Grant the necessary privileges to this user using the following SQL command:

Replace 'replication_user', 'replica_host', and 'your_password' with appropriate values. The FLUSH PRIVILEGES command is used to ensure that the server reloads the grant tables after granting the necessary privileges to the new user. This is necessary because the changes to the user privileges are stored in the MySQL grant tables, and these tables are cached in memory for performance reasons.

2. Configure the replica server

To configure the replica server using a GTID-based transaction, a unique server ID is required, else, the replication wouldn’t function properly. Here’s what we need to do:

i. Set Server UUID: If the server UUID of the replica server is different from the source server, you need to set it. Add the following line to my.cnf or my.ini file under the [mysqld] section:

Replace YOUR_UNIQUE_SERVER_UUID with a universally unique identifier (UUID).

ii. Enable GTID Mode: Open the MySQL configuration file (my.cnf or my.ini) on the replica server, and under the [mysqld] section, add the following line to enable GTID mode:

iii. Binary Log: Replica servers are allowed to use GTID without configuring binary logs. If you wish to continue without enabling this, you can use the --skip-log-bin and --log-replica-updates=OFF options. However, to ensure that the binary log is enabled on the replica server. Add the following line to my.cnf or my.ini file:

Replace /path/to/binary/log/file with the desired location for storing binary logs.

iv. Restart MySQL Server: Save the changes made to the configuration file and restart the MySQL server on the replica to apply the changes.

3. Start replication on the replica server

Once the source and replica servers are configured, we can start the replication process on the replica server. Connect to the replica server’s MySQL console and issue the following command:

Replace 'source_server_ip', source_server_port, 'replication_user', and 'your_password' with the appropriate values.

The replica has a special place where it keeps important information needed to read and apply changes from the original server. This information is used by a specific part of the replica called the “applier metadata repository” to understand what changes it has already processed and what still needs to be done. This information includes things like file names and positions, which help the applier metadata repository keep track of where it is in the process of replicating data and is stored within the slave_relay_log_info table in the MySQL system schema. It’s like a road map to follow.

Before MySQL 8.0, if you wanted to use tables as replication metadata repositories, you had to explicitly set the master_info_repository and relay_log_info_repository options to TABLE during server startup.

Starting from MySQL 8.0, creating the replication metadata repositories as tables became the default behavior. As a result, you no longer need to set these system variables to use tables for the replication metadata repositories. However, it’s important to note that the use of master_info_repository and relay_log_info_repository system variables is deprecated in MySQL 8.0 and may be removed in future versions. So, it’s recommended to rely on the default table-based replication metadata repositories instead.

4. Backup Data

Before starting replication, it’s essential to ensure that the replica server has the same data as the source server. You can use tools like mysqldump or Percona XtraBackup to create a backup on the source server and restore it on the replica server.

5. Start replica server

After setting up replication parameters, you can start the replica server by executing the following command:

You can monitor the  replica server’s status using the following command:

Look for the Slave_IO_Running and Slave_SQL_Running status variables to ensure that both are set to Yes, indicating that the replication is running correctly.

Once you have completed these steps, your MySQL replication with GTIDs should be up and running! 

Configuring Multi-Source Replication using GTIDs

To configure a replication channel for each source using a GTID-based replication, on each MySQL server, we need to identify the source servers from which they will replicate data. This is typically done by specifying the source server’s connection details, including the server’s unique ID and the login credentials in the configuration file. Then, we can enable multi-source replication by specifying the following commands:

 

To replicate the database of our choice, this can be achieved using the CHANGE REPLICATION FILTER statement. 

And this is how you configure multi-source replication in a GTID-based replication.

 

Drawbacks of GTID-based replication

Replication with GTID (Global Transaction Identifier) in MySQL offers several advantages, such as simplifying the configuration and improving reliability. However, there are certain restrictions and considerations to be aware of:

  1. GTID Mode: Replication using GTID requires all servers in the replication topology to be running in GTID mode. Mixing GTID and non-GTID servers is not supported.
  2. Server Version: GTID was introduced in MySQL 5.6.5. To use GTID-based replication, all servers involved must be running MySQL 5.6.5 or later versions.

  3. GTID Consistency: For proper replication, the GTID consistency across all servers must be maintained. If GTIDs are manipulated manually, it can lead to data inconsistency and replication errors.
  4. Temporary Tables: GTID-based replication does not support temporary tables on the slave. If you have temporary tables in your replication, you might need to restructure your schema.
  5. Unsupported CREATE TABLE AND SELECT STATEMENTS: The CREATE TABLE AND SELECT statements are not supported when using the GTID-based replication topology due to inconsistencies with the statement and row replication formats.

 

Conclusion

With GTID, replication setup becomes more straightforward, as there is no need to deal with file and position values. It also enhances data integrity and reliability by supporting ROW-based replication, eliminating the issues associated with non-deterministic statements. However, implementing GTID-based replication requires careful planning and consideration of certain restrictions as mentioned earlier. It is essential to be aware of the limitations, especially concerning temporary tables,  to avoid unexpected challenges during deployment or changes in the replication topology.

When used correctly and in suitable scenarios, GTID-based replication proves to be a valuable tool for managing replication environments, offering greater confidence in data synchronization and replication reliability. For additional information on GTID-based replication, I highly suggest you visit the MySQL official documentation.