When replication is running, SQL statements that change data are
recorded in a binary log (bin.log) on the master
server as it executes them. Only SQL statements that change the data or
the schema are logged. This includes data-changing statements such as
INSERT, UPDATE, and
DELETE, and schema-manipulation statements such as
CREATE TABLE, ALTER TABLE, and
DROP TABLE. This also includes actions that affect data
and schema, but that are executed from the command line by utilities such
as mysqladmin. This does not include
SELECT statements or any statements that only query the
server for information (e.g., SHOW VARIABLES).
Along with the SQL statements, the master records a log position identification number. This is used to determine which log entries the master should relay to the slave. This is necessary because the slave may not always be able to consistently receive information from the master. We’ve already discussed one situation where an administrator deliberately introduces a delay: the planned downtime for making a backup of the slave. In addition, there may be times when the slave has difficulty staying connected to the master due to networking problems, or it may simply fall behind because the master has a heavy load of updates in a short period of time. However, if the slave reconnects hours or even days later, with the position identification number of the last log entry received, it can tell the master where it left off in the binary log and the master can send the slave all of the subsequent entries it missed while it was disconnected. It can do this even if the entries are contained in multiple log files due to the master’s logs having been flushed in the interim.
To help you better understand the replication process, I’ve included—in this section especially, and throughout this chapter—sample excerpts from each replication log and index file. Knowing how to sift through logs can be useful in resolving server problems, not only with replication but also with corrupt or erroneously written data.
Here is a sample excerpt from a master binary log file:
/usr/local/mysql/bin/mysqlbinlog /var/log/mysql/bin.000007 > /tmp/binary_log.txttail --lines=14 /tmp/binary_log.txt# at 1999 #081120 9:53:27 server id 1 end_log_pos 2158 Query thread_id=1391 exec_time=0 error_code=0 USE personal; SET TIMESTAMP=1132502007; CREATE TABLE contacts2 (contact_id INT AUTO_INCREMENT KEY, name VARCHAR(50), telephone CHAR(15)); # at 2158 #081120 9:54:53 server id 1 end_log_pos 2186 Intvar SET INSERT_ID=1; # at 2186 #081120 9:54:53 server id 1 end_log_pos 2333 Query thread_id=1391 exec_time=0 error_code=0 SET TIMESTAMP=1132502093; INSERT INTO contacts2 (name, telephone) VALUES ('Rusty Osborne', '001-504-838-1234');
As the first line shows, I used the command-line utility
mysqlbinlog to read the contents of a particular binary log file. (MySQL provides
mysqlbinlog to make it possible for administrators to
read binary log files.) Because the log is extensive, I have redirected
the results to a text file in the /tmp directory
using the shell’s redirect operator (>). On the
second line, I used the tail command to display the
last 14 lines of the text file generated, which translates to the last 3
entries in this case. You could instead pipe (|) the contents to
more or less on a Linux or Unix
system if you intend only to scan the results briefly.
After you redirect the results of a binary log to a text file, it may be used to restore data on the master server to a specific point in time. Point-in-time recovery methods are an excellent recourse when you have inadvertently deleted a large amount of data that has been added since your last backup.
The slave server, through an input/output (I/O) thread, listens for communications from the master that inform the slave of new entries in the master’s binary log and of any changes to its data. The master does not transmit data unless requested by the slave, nor does the slave continuously harass the master with inquiries as to whether there are new binary log entries. Instead, after the master has made an entry to its binary log, it looks to see whether any slaves are connected and waiting for updates. The master then pokes the slave to let it know that an entry has been made to the binary log in case it’s interested. It’s then up to the slave to request the entries. The slave will ask the master to send entries starting from the position identification number of the last log file entry the slave processed.
Looking at each entry in the sample binary log, you will notice that
each starts with the position identification number (e.g., 1999). The
second line of each entry provides the date (e.g., 081120 for November 20,
2008), the time, and the replication server’s identification number. This
is followed by the position number expected for the next entry. This
number is calculated from the number of bytes of text that the current
entry required. The rest of the entry provides stats on the thread that
executed the SQL statement. In some of the entries, a
SET statement is provided with the
TIMESTAMP variable so that when the binary log entry is
used, the date and time will be adjusted on the slave server to match the
date and time of the entry on the master. The final line of each entry
lists the SQL statement that was executed.
The excerpt begins with a USE statement, which is
included to be sure that the slave makes the subsequent changes to the
correct database. Similarly, notice that the second entry sets the value
of INSERT_ID in preparation for the
INSERT statement of the following entry. This ensures
that the value to be used for the column contact_id on the slave is the same.
Nothing is left to chance or assumed, if possible.
The master server keeps track of the names of the binary log files in a simple text file (bin.index). Here is an excerpt from the binary index file:
/var/log/mysql/bin.000001 /var/log/mysql/bin.000002 /var/log/mysql/bin.000003 /var/log/mysql/bin.000004 /var/log/mysql/bin.000005 /var/log/mysql/bin.000006 /var/log/mysql/bin.000007
This list of binary log files can also be obtained by entering the
SHOW MASTER LOGS statement. Notice that the list includes the full pathname of each
binary log file in order, reflecting the order in which the files were
created. The master appends each name to the end of the index file as the
log file is opened. If a slave has been offline for a couple of days, the
master will work backward through the files to find the file containing
the position identification number given to it by the slave. It will then
read that file from the entry following the specified position
identification number to the end, followed by the subsequent files in
order, sending SQL statements from each to the slave until the slave is
current or disconnected. If the slave is disconnected before it can become
current, the slave will make another request when it later reconnects with the last master log position
identification number it received.
After the slave is current again, the slave will go back to waiting for another announcement from the master regarding changes to its binary log. The slave will make inquiries only when it receives another nudge from the master or if it is disconnected temporarily. When a slave reconnects to the master after a disconnection, it makes inquiries to ensure it didn’t miss anything while it was disconnected. If it sits idle for a long period, the slave’s connection will time out, also causing it to reconnect and make inquires.
When the slave receives new changes from the master, the slave
doesn’t update its databases directly. Direct application of changes was
tried in versions of replication prior to MySQL 4.0 and found to be too
inflexible to deal with heavy loads, particularly if the slave’s databases
are also used to support user read requests (i.e., the slave helps with
load balancing). For example, tables in its replicated databases may be
busy when the slave is attempting to update the data. A
SELECT statement could be executed with the
HIGH_PRIORITY flag, giving it priority over
UPDATE and other SQL statements that change data and
are not also specifically entered with the HIGH_PRIORITY flag. In this case, the
replication process would be delayed by user activities. On a busy server,
the replication process could be delayed for several minutes. If the
master server crashes during such a lengthy delay, this could mean the
loss of many data changes of which the slave is not informed because it’s
waiting to access a table on its own system.
By separating the recording of entries received and their reexecution, the slave is assured of getting all or almost all transactions up until the time that the master server crashes. This is a much more dependable method than the direct application method used in earlier versions of MySQL.
Currently, the slave appends the changes to a file on its filesystem named relay.log. Here is an excerpt from a relay log:
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
/*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/;
# at 4
#081118 3:18:40 server id 2 end_log_pos 98
Start: binlog v 4, server v 5.0.12-beta-standard-log created 051118
3:18:40
# at 98
#700101 1:00:00 server id 1 end_log_pos 0 Rotate to bin.000025 pos: 4
# at 135
#080819 11:40:57 server id 1 end_log_pos 98
Start: binlog v 4, server v 5.0.10-beta-standard-log created 050819
11:40:57 at startup
ROLLBACK;
# at 949
#080819 11:54:49 server id 1 end_log_pos 952
Query thread_id=10 exec_time=0 error_code=0
SET TIMESTAMP=1124445289;
CREATE TABLE prepare_test (id INTEGER NOT NULL, name CHAR(64) NOT NULL);
# at 952
#080819 11:54:49 server id 1 end_log_pos 1072
Query thread_id=10 exec_time=0 error_code=0
SET TIMESTAMP=1124445289;
INSERT INTO prepare_test VALUES ('0','zhzwDeLxLy8XYjqVM');This log is like the master’s binary log. Notice that the first entry mentions the server’s ID number, 2, which is the slave’s identification number. There are also some entries for server 1, the master. The first entries have to do with log rotations on both servers. The last two entries are SQL statements relayed to the slave from the master.
A new relay log file is created when replication starts on the
slave and when the logs are flushed (i.e., the FLUSH
LOGS statement is issued). A new relay log file is also created
when the current file reaches the maximum size as set with the
max_relay_log_size variable. The maximum size can also
be limited by the max_binlog_size
variable. If these variables are set to 0, there is no size limit placed
on the relay log files.
Once the slave has made note of the SQL statements relayed to it by the master, it records the new position identification number in its master information file (master.info) on its filesystem. Here is an example of the content of a master information file on a slave server:
14 bin.000038 6393 master_host replicant my_pwd 3306 60 0
This file is present primarily so the slave can remember its position in the master’s binary log file even if the slave is rebooted, as well as the information necessary to reconnect to the master. Each line has a purpose as follows:
The first line contains the number of lines of data in the file (14). Although fewer than 14 lines are shown here, the actual file contains blank lines that make up the rest.
The second line shows the name of the last binary log file on the master from which the slave received entries. This helps the master respond more quickly to requests.
The third line shows the position identification number (6393) in the master’s binary log.
The next few lines contain the master’s host address, the replication username, the password, and the port number (3306). Notice that the password is not encrypted and is stored in clear text. Therefore, be sure to place this file in a secure directory. You can determine the path for this file in the configuration file, as discussed later in this chapter.
The next to last line (60) lists the number of attempts the slave should make when reconnecting to the master before stopping.
The last line here is 0 because the server from which this master information file came does not have the SSL feature enabled. If SSL was enabled on the slave and allowed on the master, there would be a value of 1 on this line. It would also be followed by 5 more lines containing values related to SSL authentication, completing the 14 lines anticipated on the first line.
Take note of how the values in the master information file match the following excerpt from a SHOW SLAVE STATUS statement executed on the slave:
SHOW SLAVE STATUS \G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: master_host Master_User: replicant Master_Port: 3306 Connect_Retry: 60 Master_Log_File: bin.000038 Read_Master_Log_Pos: 6393 Relay_Log_File: relay.000002 Relay_Log_Pos: 555 Relay_Master_Log_File: bin.000011 Slave_IO_Running: Yes Slave_SQL_Running: No Replicate_Do_DB: test Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 1062 Last_Error: Error 'Duplicate entry '1000' for key 1' on query.' Skip_Counter: 0 Exec_Master_Log_Pos: 497 Relay_Log_Space: 22277198 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL
Notice the labels for the additional SSL variables at the end of this excerpt. The master information file contains lines for them, whether they are empty or populated. Also note that, for tighter security, the command does not return the password.
After noting the new position number and other information that may have changed, the slave uses the same I/O thread to resume waiting for more entries from the master.
When the slave server detects any change to its relay log, through a different thread, the slave uses an SQL thread to execute the new SQL statement recorded in the relay log to the slave’s databases. After the new entry is recorded in the slave’s relay log, the new relay log position identification number is recorded in its relay log information file (relay-log.info) through the slave’s SQL thread. Here is an excerpt from a relay log information file:
/var/log/mysql/relay.000002 555 bin.000011 497
The first line lists the file path and name of the current relay log
file (Relay_Log_File in the SHOW SLAVE STATUS command ). The second value is the SQL thread’s
position in the relay log file (Relay_Log_Pos). The
third contains the name of the current binary log file on the master
(Relay_Master_Log_File). The last value is the position
in the master log file (Exec_Master_Log_Pos). These
values can also be found in the results of the SHOW SLAVE STATUS statement shown earlier in this section.
When the slave is restarted or its logs are flushed, it appends the name of the current relay log file to the end of the relay log index file (relay-log.index). Here is an example of a relay log index file:
/var/log/mysql/relay.000002 /var/log/mysql/relay.000003 /var/log/mysql/relay.000004
This process of separating threads keeps the I/O thread free and dedicated to receiving changes from the master. It ensures that any delays in writing to the slave’s databases on the SQL thread will not prevent or slow the receiving of data from the master. With this separate thread method, the slave server naturally has exclusive access to its relay log file at the filesystem level.
As an additional safeguard to ensure accuracy of data, the slave compares the entries in the relay log to the data in its databases. If the comparison reveals any inconsistency, the replication process is stopped and an error message is recorded in the slave’s error log (error.log). The slave will not restart until it is told to do so. After you have resolved the discrepancy that the slave detected in the data, you can then instruct the slave to resume replication, as explained later in this chapter.
Here is an example of what is recorded on a slave server in its error log when the results don’t match:
020714 01:32:03 mysqld started 020714 1:32:05 InnoDB: Started /usr/sbin/mysqld-max: ready for connections 020714 8:00:28 Slave SQL thread initialized, starting replication in log 'server2-bin.035' at position 579285542, relay log './db1-relay-bin.001' position: 4 020714 8:00:29 Slave I/O thread: connected to master '...@66.216.68.90:3306', replication started in log 'server2-bin.035' at position 579285542 ERROR: 1146 Table 'test.response' doesn't exist 020714 8:00:30 Slave: error 'Table 'test.response' doesn't exist' on query 'INSERT INTO response SET connect_time=0.073868989944458, page_time=1.53695404529572, site_id='Apt'', error_code=1146 020714 8:00:30 Error running query, slave SQL thread aborted. Fix the problem, and restart the slave SQL thread with "SLAVE START". We stopped at log 'server2-bin.035' position 579285542 020714 8:00:30 Slave SQL thread exiting, replication stopped in log 'server2-bin.035' at position 579285542 020714 8:00:54 Error reading packet from server: (server_errno=1159) 020714 8:00:54 Slave I/O thread killed while reading event 020714 8:00:54 Slave I/O thread exiting, read up to log 'server2-bin.035', position 579993154 020714 8:01:58 /usr/sbin/mysqld-max: Normal shutdown 020714 8:01:58 InnoDB: Starting shutdown... 020714 8:02:05 InnoDB: Shutdown completed 020714 8:02:06 /usr/sbin/mysqld-max: Shutdown Complete 020714 08:02:06 mysqld ended
In the first message, I have boldfaced an error message showing that the slave has realized the relay log contains entries involving a table that does not exist on the slave. The second boldfaced comment gives a message informing the administrator of the decision and some instructions on how to proceed.
The replication process may seem very involved and complicated at first, but it all occurs quickly; it’s typically not a significant drain on the master server. Also, it’s surprisingly easy to set up: it requires only a few lines of options in the configuration files on the master and slave servers. You will need to copy the databases on the master server to the slave to get the slave close to being current. Then it’s merely a matter of starting the slave for it to begin replicating. It will quickly update its data to record any changes made since the initial backup copied from the master was installed on the slave. From then on, replication will keep it current—theoretically. As an administrator, you will have to monitor the replication process and resolve problems that arise occasionally.
Before concluding this section, let me adjust my previous statement about the ease of replication: replication is deceptively simple. When it works, it’s simple. Before it starts working, or if it stops working, the minimal requirements of replication make it difficult to determine why it doesn’t work. Now let’s look at the steps for setting up replication.