MySQL in a Nutshell, 2nd Edition

Replication Process

When replication is running, SQL statements that change data are recorded in a binary log (bin.log) on the master server as it executes them. Only SQL statements that change the data or the schema are logged. This includes data-changing statements such as INSERT, UPDATE, and DELETE, and schema-manipulation statements such as CREATE TABLE, ALTER TABLE, and DROP TABLE. This also includes actions that affect data and schema, but that are executed from the command line by utilities such as mysqladmin. This does not include SELECT statements or any statements that only query the server for information (e.g., SHOW VARIABLES).

Along with the SQL statements, the master records a log position identification number. This is used to determine which log entries the master should relay to the slave. This is necessary because the slave may not always be able to consistently receive information from the master. We’ve already discussed one situation where an administrator deliberately introduces a delay: the planned downtime for making a backup of the slave. In addition, there may be times when the slave has difficulty staying connected to the master due to networking problems, or it may simply fall behind because the master has a heavy load of updates in a short period of time. However, if the slave reconnects hours or even days later, with the position identification number of the last log entry received, it can tell the master where it left off in the binary log and the master can send the slave all of the subsequent entries it missed while it was disconnected. It can do this even if the entries are contained in multiple log files due to the master’s logs having been flushed in the interim.

To help you better understand the replication process, I’ve included—in this section especially, and throughout this chapter—sample excerpts from each replication log and index file. Knowing how to sift through logs can be useful in resolving server problems, not only with replication but also with corrupt or erroneously written data.

Here is a sample excerpt from a master binary log file:

/usr/local/mysql/bin/mysqlbinlog /var/log/mysql/bin.000007 > 
   /tmp/binary_log.txt
tail --lines=14 /tmp/binary_log.txt

# at 1999
#081120 9:53:27 server id 1 end_log_pos 2158 Query thread_id=1391 
   exec_time=0 error_code=0
USE personal;
SET TIMESTAMP=1132502007;
CREATE TABLE contacts2 (contact_id INT AUTO_INCREMENT KEY, name VARCHAR(50), 
   telephone CHAR(15));

# at 2158
#081120  9:54:53 server id 1  end_log_pos 2186  Intvar
SET INSERT_ID=1;

# at 2186
#081120  9:54:53 server id 1 end_log_pos 2333 Query thread_id=1391 
   exec_time=0 error_code=0
SET TIMESTAMP=1132502093;
INSERT INTO contacts2 (name, telephone) VALUES ('Rusty Osborne', 
   '001-504-838-1234');

As the first line shows, I used the command-line utility mysqlbinlog to read the contents of a particular binary log file. (MySQL provides mysqlbinlog to make it possible for administrators to read binary log files.) Because the log is extensive, I have redirected the results to a text file in the /tmp directory using the shell’s redirect operator (>). On the second line, I used the tail command to display the last 14 lines of the text file generated, which translates to the last 3 entries in this case. You could instead pipe (|) the contents to more or less on a Linux or Unix system if you intend only to scan the results briefly.

After you redirect the results of a binary log to a text file, it may be used to restore data on the master server to a specific point in time. Point-in-time recovery methods are an excellent recourse when you have inadvertently deleted a large amount of data that has been added since your last backup.

The slave server, through an input/output (I/O) thread, listens for communications from the master that inform the slave of new entries in the master’s binary log and of any changes to its data. The master does not transmit data unless requested by the slave, nor does the slave continuously harass the master with inquiries as to whether there are new binary log entries. Instead, after the master has made an entry to its binary log, it looks to see whether any slaves are connected and waiting for updates. The master then pokes the slave to let it know that an entry has been made to the binary log in case it’s interested. It’s then up to the slave to request the entries. The slave will ask the master to send entries starting from the position identification number of the last log file entry the slave processed.

Looking at each entry in the sample binary log, you will notice that each starts with the position identification number (e.g., 1999). The second line of each entry provides the date (e.g., 081120 for November 20, 2008), the time, and the replication server’s identification number. This is followed by the position number expected for the next entry. This number is calculated from the number of bytes of text that the current entry required. The rest of the entry provides stats on the thread that executed the SQL statement. In some of the entries, a SET statement is provided with the TIMESTAMP variable so that when the binary log entry is used, the date and time will be adjusted on the slave server to match the date and time of the entry on the master. The final line of each entry lists the SQL statement that was executed.

The excerpt begins with a USE statement, which is included to be sure that the slave makes the subsequent changes to the correct database. Similarly, notice that the second entry sets the value of INSERT_ID in preparation for the INSERT statement of the following entry. This ensures that the value to be used for the column contact_id on the slave is the same. Nothing is left to chance or assumed, if possible.

The master server keeps track of the names of the binary log files in a simple text file (bin.index). Here is an excerpt from the binary index file:

/var/log/mysql/bin.000001
/var/log/mysql/bin.000002
/var/log/mysql/bin.000003
/var/log/mysql/bin.000004
/var/log/mysql/bin.000005
/var/log/mysql/bin.000006
/var/log/mysql/bin.000007

This list of binary log files can also be obtained by entering the SHOW MASTER LOGS statement. Notice that the list includes the full pathname of each binary log file in order, reflecting the order in which the files were created. The master appends each name to the end of the index file as the log file is opened. If a slave has been offline for a couple of days, the master will work backward through the files to find the file containing the position identification number given to it by the slave. It will then read that file from the entry following the specified position identification number to the end, followed by the subsequent files in order, sending SQL statements from each to the slave until the slave is current or disconnected. If the slave is disconnected before it can become current, the slave will make another request when it later reconnects with the last master log position identification number it received.

After the slave is current again, the slave will go back to waiting for another announcement from the master regarding changes to its binary log. The slave will make inquiries only when it receives another nudge from the master or if it is disconnected temporarily. When a slave reconnects to the master after a disconnection, it makes inquiries to ensure it didn’t miss anything while it was disconnected. If it sits idle for a long period, the slave’s connection will time out, also causing it to reconnect and make inquires.

When the slave receives new changes from the master, the slave doesn’t update its databases directly. Direct application of changes was tried in versions of replication prior to MySQL 4.0 and found to be too inflexible to deal with heavy loads, particularly if the slave’s databases are also used to support user read requests (i.e., the slave helps with load balancing). For example, tables in its replicated databases may be busy when the slave is attempting to update the data. A SELECT statement could be executed with the HIGH_PRIORITY flag, giving it priority over UPDATE and other SQL statements that change data and are not also specifically entered with the HIGH_PRIORITY flag. In this case, the replication process would be delayed by user activities. On a busy server, the replication process could be delayed for several minutes. If the master server crashes during such a lengthy delay, this could mean the loss of many data changes of which the slave is not informed because it’s waiting to access a table on its own system.

By separating the recording of entries received and their reexecution, the slave is assured of getting all or almost all transactions up until the time that the master server crashes. This is a much more dependable method than the direct application method used in earlier versions of MySQL.

Currently, the slave appends the changes to a file on its filesystem named relay.log. Here is an excerpt from a relay log:

/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;

# at 4
#081118  3:18:40 server id 2  end_log_pos 98
   Start: binlog v 4, server v 5.0.12-beta-standard-log created 051118 
      3:18:40

# at 98
#700101  1:00:00 server id 1  end_log_pos 0 Rotate to bin.000025 pos: 4

# at 135
#080819 11:40:57 server id 1  end_log_pos 98
   Start: binlog v 4, server v 5.0.10-beta-standard-log created 050819 
      11:40:57 at startup
ROLLBACK;

# at 949
#080819 11:54:49 server id 1  end_log_pos 952
    Query thread_id=10 exec_time=0 error_code=0
SET TIMESTAMP=1124445289;
CREATE TABLE prepare_test (id INTEGER NOT NULL, name CHAR(64) NOT NULL);

# at 952
#080819 11:54:49 server id 1  end_log_pos 1072
   Query thread_id=10 exec_time=0 error_code=0
SET TIMESTAMP=1124445289;
INSERT INTO prepare_test VALUES ('0','zhzwDeLxLy8XYjqVM');

This log is like the master’s binary log. Notice that the first entry mentions the server’s ID number, 2, which is the slave’s identification number. There are also some entries for server 1, the master. The first entries have to do with log rotations on both servers. The last two entries are SQL statements relayed to the slave from the master.

A new relay log file is created when replication starts on the slave and when the logs are flushed (i.e., the FLUSH LOGS statement is issued). A new relay log file is also created when the current file reaches the maximum size as set with the max_relay_log_size variable. The maximum size can also be limited by the max_binlog_size variable. If these variables are set to 0, there is no size limit placed on the relay log files.

Once the slave has made note of the SQL statements relayed to it by the master, it records the new position identification number in its master information file (master.info) on its filesystem. Here is an example of the content of a master information file on a slave server:

14
bin.000038
6393
master_host
replicant
my_pwd
3306
60
0

This file is present primarily so the slave can remember its position in the master’s binary log file even if the slave is rebooted, as well as the information necessary to reconnect to the master. Each line has a purpose as follows:

The first line contains the number of lines of data in the file (14). Although fewer than 14 lines are shown here, the actual file contains blank lines that make up the rest.
The second line shows the name of the last binary log file on the master from which the slave received entries. This helps the master respond more quickly to requests.
The third line shows the position identification number (6393) in the master’s binary log.
The next few lines contain the master’s host address, the replication username, the password, and the port number (3306). Notice that the password is not encrypted and is stored in clear text. Therefore, be sure to place this file in a secure directory. You can determine the path for this file in the configuration file, as discussed later in this chapter.
The next to last line (60) lists the number of attempts the slave should make when reconnecting to the master before stopping.
The last line here is 0 because the server from which this master information file came does not have the SSL feature enabled. If SSL was enabled on the slave and allowed on the master, there would be a value of 1 on this line. It would also be followed by 5 more lines containing values related to SSL authentication, completing the 14 lines anticipated on the first line.

Take note of how the values in the master information file match the following excerpt from a SHOW SLAVE STATUS statement executed on the slave:

SHOW SLAVE STATUS \G

*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: master_host
Master_User: replicant
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: bin.000038
Read_Master_Log_Pos: 6393
Relay_Log_File: relay.000002
Relay_Log_Pos: 555
Relay_Master_Log_File: bin.000011
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB: test
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '1000' for key 1' on query.'
Skip_Counter: 0
Exec_Master_Log_Pos: 497
Relay_Log_Space: 22277198
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL

Notice the labels for the additional SSL variables at the end of this excerpt. The master information file contains lines for them, whether they are empty or populated. Also note that, for tighter security, the command does not return the password.

After noting the new position number and other information that may have changed, the slave uses the same I/O thread to resume waiting for more entries from the master.

When the slave server detects any change to its relay log, through a different thread, the slave uses an SQL thread to execute the new SQL statement recorded in the relay log to the slave’s databases. After the new entry is recorded in the slave’s relay log, the new relay log position identification number is recorded in its relay log information file (relay-log.info) through the slave’s SQL thread. Here is an excerpt from a relay log information file:

/var/log/mysql/relay.000002
555
bin.000011
497

The first line lists the file path and name of the current relay log file (Relay_Log_File in the SHOW SLAVE STATUS command ). The second value is the SQL thread’s position in the relay log file (Relay_Log_Pos). The third contains the name of the current binary log file on the master (Relay_Master_Log_File). The last value is the position in the master log file (Exec_Master_Log_Pos). These values can also be found in the results of the SHOW SLAVE STATUS statement shown earlier in this section.

When the slave is restarted or its logs are flushed, it appends the name of the current relay log file to the end of the relay log index file (relay-log.index). Here is an example of a relay log index file:

/var/log/mysql/relay.000002
/var/log/mysql/relay.000003
/var/log/mysql/relay.000004

This process of separating threads keeps the I/O thread free and dedicated to receiving changes from the master. It ensures that any delays in writing to the slave’s databases on the SQL thread will not prevent or slow the receiving of data from the master. With this separate thread method, the slave server naturally has exclusive access to its relay log file at the filesystem level.

As an additional safeguard to ensure accuracy of data, the slave compares the entries in the relay log to the data in its databases. If the comparison reveals any inconsistency, the replication process is stopped and an error message is recorded in the slave’s error log (error.log). The slave will not restart until it is told to do so. After you have resolved the discrepancy that the slave detected in the data, you can then instruct the slave to resume replication, as explained later in this chapter.

Here is an example of what is recorded on a slave server in its error log when the results don’t match:

020714 01:32:03  mysqld started
020714  1:32:05  InnoDB: Started
/usr/sbin/mysqld-max: ready for connections
020714  8:00:28  Slave SQL thread initialized, starting replication in log
'server2-bin.035' at position 579285542, relay log './db1-relay-bin.001'
position: 4
020714  8:00:29  Slave I/O thread: connected to master
'...@66.216.68.90:3306',  replication started in log 'server2-bin.035' at
position 579285542 ERROR: 1146  Table 'test.response' doesn't exist
020714  8:00:30  Slave: error 'Table 'test.response' doesn't exist' on query
'INSERT INTO response SET connect_time=0.073868989944458,
page_time=1.53695404529572, site_id='Apt'', error_code=1146
020714  8:00:30  Error running query, slave SQL thread aborted. Fix the
problem, and restart the slave SQL thread with "SLAVE START". We stopped at
log 'server2-bin.035' position 579285542
020714  8:00:30  Slave SQL thread exiting, replication stopped in log
'server2-bin.035' at position 579285542
020714  8:00:54  Error reading packet from server:  (server_errno=1159)
020714  8:00:54  Slave I/O thread killed while reading event
020714  8:00:54  Slave I/O thread exiting, read up to log 'server2-bin.035',
position 579993154
020714  8:01:58  /usr/sbin/mysqld-max: Normal shutdown

020714  8:01:58  InnoDB: Starting shutdown...
020714  8:02:05  InnoDB: Shutdown completed
020714  8:02:06  /usr/sbin/mysqld-max: Shutdown Complete

020714 08:02:06  mysqld ended

In the first message, I have boldfaced an error message showing that the slave has realized the relay log contains entries involving a table that does not exist on the slave. The second boldfaced comment gives a message informing the administrator of the decision and some instructions on how to proceed.

The replication process may seem very involved and complicated at first, but it all occurs quickly; it’s typically not a significant drain on the master server. Also, it’s surprisingly easy to set up: it requires only a few lines of options in the configuration files on the master and slave servers. You will need to copy the databases on the master server to the slave to get the slave close to being current. Then it’s merely a matter of starting the slave for it to begin replicating. It will quickly update its data to record any changes made since the initial backup copied from the master was installed on the slave. From then on, replication will keep it current—theoretically. As an administrator, you will have to monitor the replication process and resolve problems that arise occasionally.

Before concluding this section, let me adjust my previous statement about the ease of replication: replication is deceptively simple. When it works, it’s simple. Before it starts working, or if it stops working, the minimal requirements of replication make it difficult to determine why it doesn’t work. Now let’s look at the steps for setting up replication.

Previous Chapter

Merits of Replication

Next Chapter

The Replication User Account

Table of Contents for MySQL in a Nutshell, 2nd Edition

Replication Process

Table of Contents for
MySQL in a Nutshell, 2nd Edition