MySQL 5.6 is GA! Now we have new things to play with, and in my personal opinion the most interesting one is the new Global Transaction ID (GTID) support in replication. This post does not explain what GTID is or how it works internally, because there is already plenty of documentation about that:
http://dev.mysql.com/doc/refman/5.6/en/replication-gtids-concepts.html
One thing worth mentioning is that in MySQL 5.6, GTID support requires log_slave_updates to be enabled on the slave server, so the performance impact should be taken into account.
Anyway, this post is meant to be practical: we will see how to create new slaves, and restore broken ones, from a master using GTID.
How to set up a new slave
The first thing we need to know is that the binary log file name and position are no longer needed when GTID is enabled. Instead, we need to know which GTID the master has reached and set it on the slave. MySQL keeps two global variables that hold GTID sets:
gtid_executed: it contains a representation of the set of all transactions logged in the binary log
gtid_purged: it contains a representation of the set of all transactions deleted from the binary log
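To make the format of these two variables concrete, here is a minimal Python sketch that parses a GTID set string into server UUIDs and transaction intervals. The function names parse_gtid_set and count_transactions are made up for this illustration and are not part of MySQL or any library. Because gtid_purged is always contained in gtid_executed, the difference between the two counts is the number of transactions still present in the binary logs.

```python
def parse_gtid_set(text):
    """Parse a GTID set string into {server_uuid: [(first, last), ...]}."""
    parsed = {}
    # Multi-server sets are comma-separated; drop whitespace and newlines first.
    for part in "".join(text.split()).split(","):
        if not part:
            continue
        # A UUID contains hyphens but no colons, so ":" separates it
        # from its transaction intervals.
        server_uuid, *intervals = part.split(":")
        ranges = parsed.setdefault(server_uuid, [])
        for interval in intervals:
            first, _, last = interval.partition("-")
            # A single transaction number such as "5" is the interval (5, 5).
            ranges.append((int(first), int(last or first)))
    return parsed


def count_transactions(gtid_set):
    """Count the transactions covered by a parsed GTID set."""
    return sum(last - first + 1
               for ranges in gtid_set.values()
               for first, last in ranges)


# Values taken from the master used in this post.
executed = parse_gtid_set("9a511b7b-7059-11e2-9a24-08002762b8af:1-13")
purged = parse_gtid_set("9a511b7b-7059-11e2-9a24-08002762b8af:1-2")

print(executed)
# Transactions still available in the master's binary logs.
print(count_transactions(executed) - count_transactions(purged))
```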
So now, the process is the following:
- take a backup from the master and store the value of gtid_executed
- restore the backup on the slave and set gtid_purged with the value of gtid_executed from the master
The new mysqldump can do those tasks for us. Let’s see an example of how to take a backup from the master and restore it on the slave to set up a new replication server.
master > show global variables like 'gtid_executed';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_executed | 9a511b7b-7059-11e2-9a24-08002762b8af:1-13 |
+---------------+-------------------------------------------+
master > show global variables like 'gtid_purged';
+---------------+------------------------------------------+
| Variable_name | Value                                    |
+---------------+------------------------------------------+
| gtid_purged   | 9a511b7b-7059-11e2-9a24-08002762b8af:1-2 |
+---------------+------------------------------------------+
Now we take a backup with mysqldump from the master:
# mysqldump --all-databases --single-transaction --triggers --routines --host=127.0.0.1 --port=18675 --user=msandbox --password=msandbox > dump.sql
It will contain the following line:
# grep PURGED dump.sql
SET @@GLOBAL.GTID_PURGED='9a511b7b-7059-11e2-9a24-08002762b8af:1-13';
Therefore, restoring the dump on the slave sets GTID_PURGED to the GTID_EXECUTED value from the master.
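If you want to double-check a dump before restoring it, the value can also be pulled out programmatically. The following is only a sketch: gtid_purged_from_dump is a hypothetical helper, and it assumes the dump contains a statement in exactly the form shown by the grep above. It tolerates a value that spans several lines, which I would expect when the set covers more than one server UUID.

```python
import re

# Matches the statement mysqldump writes, e.g.
#   SET @@GLOBAL.GTID_PURGED='9a511b7b-...:1-13';
# [^']* also matches newlines, so a multi-line value is captured whole.
GTID_PURGED_RE = re.compile(
    r"SET\s+@@GLOBAL\.GTID_PURGED\s*=\s*'([^']*)'", re.IGNORECASE
)


def gtid_purged_from_dump(dump_text):
    """Return the GTID set a dump will assign to GTID_PURGED, or None."""
    match = GTID_PURGED_RE.search(dump_text)
    if match is None:
        return None
    # Remove any whitespace or line breaks inside the quoted value.
    return "".join(match.group(1).split())


# A tiny stand-in for the contents of dump.sql.
sample_dump = """
-- MySQL dump
SET @@GLOBAL.GTID_PURGED='9a511b7b-7059-11e2-9a24-08002762b8af:1-13';
CREATE TABLE t (id INT);
"""

print(gtid_purged_from_dump(sample_dump))
```

For a real file you would read it first, for example with open("dump.sql").read(), keeping in mind that large dumps may not fit comfortably in memory.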
So now, we just need to restore the dump and start replication:
slave1 > show global variables like 'gtid_executed';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_executed |       |
+---------------+-------+
slave1 > show global variables like 'gtid_purged';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_purged   |       |
+---------------+-------+
slave1 > source dump.sql;
[...]
slave1 > show global variables like 'gtid_executed';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_executed | 9a511b7b-7059-11e2-9a24-08002762b8af:1-13 |
+---------------+-------------------------------------------+
slave1 > show global variables like 'gtid_purged';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_purged   | 9a511b7b-7059-11e2-9a24-08002762b8af:1-13 |
+---------------+-------------------------------------------+
The last step is to configure the slave using the auto-configuration method of GTID:
slave1 > CHANGE MASTER TO MASTER_HOST="127.0.0.1", MASTER_USER="msandbox", MASTER_PASSWORD="msandbox", MASTER_PORT=18675, MASTER_AUTO_POSITION = 1;
How to restore a slave in a bad and fast way
Let’s imagine that our slave has been down for several days and the binary logs from the master have been purged. This is the error we are going to get:
Slave_IO_Running: No
Slave_SQL_Running: Yes
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
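Roughly speaking, the master raises this error when its gtid_purged contains transactions that the slave has not executed yet. The sketch below reproduces that check in plain Python so the logic is easy to follow. All function names are hypothetical, and the GTID values in the usage lines are invented for illustration. On a real server the built-in GTID_SUBTRACT() SQL function does this work for you.

```python
def parse_gtid_set(text):
    """Parse a GTID set string into {server_uuid: [(first, last), ...]}."""
    parsed = {}
    for part in "".join(text.split()).split(","):
        if not part:
            continue
        server_uuid, *intervals = part.split(":")
        ranges = parsed.setdefault(server_uuid, [])
        for interval in intervals:
            first, _, last = interval.partition("-")
            ranges.append((int(first), int(last or first)))
    return parsed


def subtract_intervals(a, b):
    """Return the parts of the intervals in `a` not covered by `b`."""
    result = []
    for first, last in sorted(a):
        cur = first  # first transaction number not yet accounted for
        for b_first, b_last in sorted(b):
            if b_last < cur or b_first > last:
                continue  # no overlap with the remaining part
            if b_first > cur:
                result.append((cur, b_first - 1))
            cur = max(cur, b_last + 1)
            if cur > last:
                break
        if cur <= last:
            result.append((cur, last))
    return result


def gtid_subtract(a, b):
    """Return the GTIDs in parsed set `a` that are not in parsed set `b`."""
    result = {}
    for server_uuid, ranges in a.items():
        left = subtract_intervals(ranges, b.get(server_uuid, []))
        if left:
            result[server_uuid] = left
    return result


def missing_on_master(master_purged, slave_executed):
    """GTIDs the slave still needs but the master has already purged."""
    return gtid_subtract(parse_gtid_set(master_purged),
                         parse_gtid_set(slave_executed))


UUID = "9a511b7b-7059-11e2-9a24-08002762b8af"

# Hypothetical broken slave: it stopped at 5, the master purged up to 12.
print(missing_on_master(UUID + ":1-12", UUID + ":1-5"))

# Healthy case: everything the master purged was already applied.
print(missing_on_master(UUID + ":1-2", UUID + ":1-13"))
```

An empty result means the slave can still catch up through auto-positioning. A non-empty result lists exactly the transactions that can no longer be fetched from the master's binary logs.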
So, let’s try to solve it. The bad and fast way is to point the slave at a GTID that the master still has in its binary logs. First, we get the GTID_EXECUTED value from the master:
master > show global variables like 'GTID_EXECUTED';
+---------------+-------------------------------------------+
| Variable_name | Value                                     |
+---------------+-------------------------------------------+
| gtid_executed | 9a511b7b-7059-11e2-9a24-08002762b8af:1-14 |
+---------------+-------------------------------------------+
And we set it on the slave:
slave1 > set global GTID_EXECUTED="9a511b7b-7059-11e2-9a24-08002762b8af:1-14";
ERROR 1238 (HY000): Variable 'gtid_executed' is a read only variable
Error! Remember, we get the GTID_EXECUTED value from the master and set it as GTID_PURGED on the slave.
slave1 > set global GTID_PURGED="9a511b7b-7059-11e2-9a24-08002762b8af:1-14";
ERROR 1840 (HY000): GTID_PURGED can only be set when GTID_EXECUTED is empty.
Error again. GTID_EXECUTED must be empty before GTID_PURGED can be changed manually, but we can’t change it with SET because it is a read-only variable. The only way to empty it is with reset master (yes, on a slave server):
slave1 > reset master;
slave1 > show global variables like 'GTID_EXECUTED';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_executed |       |
+---------------+-------+
slave1 > set global GTID_PURGED="9a511b7b-7059-11e2-9a24-08002762b8af:1-14";
slave1 > start slave io_thread;
slave1 > show slave status\G
[...]
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[...]
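Now, if you don’t get any errors such as primary/unique key duplication, replication keeps running, but the slave may still differ from the master because the purged transactions were never applied. Run pt-table-checksum and pt-table-sync to find and repair those differences.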
How to restore a slave in a good and slow way
The good way is mysqldump again. We take a dump from the master like we saw before and try to restore it on the slave:
slave1 [localhost] {msandbox} ((none)) > source dump.sql;
[...]
ERROR 1840 (HY000): GTID_PURGED can only be set when GTID_EXECUTED is empty.
[...]
Whoops! It is important to mention that this kind of error message can scroll out of the shell buffer, because the restore of the dump continues after the error. Be cautious.
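One way to avoid missing such a message is to capture the session output and scan it afterwards. This is a small sketch under the assumption that the output has been saved as text somehow (for instance with the mysql client's tee command) and that errors are printed in the "ERROR <code> (<sqlstate>)" form shown above. The helper name find_mysql_errors is made up for this example.

```python
import re

# Error lines printed by the mysql client start like "ERROR 1840 (HY000)".
ERROR_RE = re.compile(r"ERROR \d+ \(\w+\)")


def find_mysql_errors(session_output):
    """Return every line of captured client output that reports an error."""
    return [line.strip()
            for line in session_output.splitlines()
            if ERROR_RE.match(line.strip())]


# A shortened, made-up capture of a restore session.
captured = """Query OK, 0 rows affected (0.00 sec)
ERROR 1840 (HY000): GTID_PURGED can only be set when GTID_EXECUTED is empty.
Query OK, 0 rows affected (0.01 sec)
Query OK, 3 rows affected (0.00 sec)
"""

for line in find_mysql_errors(captured):
    print(line)
```

If the list comes back non-empty after a restore, do not start replication until you have understood each error.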
Same problem again, so the same solution applies:
slave1 > reset master;
slave1 > source dump.sql;
slave1 > start slave;
slave1 > show slave status\G
[...]
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
[...]
Conclusion
With the new GTID we need to change the way we think. The binary log file and position are no longer something we need to keep track of; gtid_executed and gtid_purged are our new friends. Newer versions of Xtrabackup fully support GTID. For details, see the Percona blog post listed in the references at the end of this post.
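Notes:
1. To enable GTID, the slave server needs the log_slave_updates parameter enabled.
2. With GTID-based master/slave replication we no longer track the binary log and position; we track gtid_executed and gtid_purged instead.
3. To update gtid_purged, the value of gtid_executed must be empty.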
The MySQL manual says of gtid_purged: It is possible to update the value of this variable, but only when gtid_executed is the empty string, and therefore gtid_purged is the empty string. This can occur either when replication has not been started previously, or when replication was not previously using GTIDs. Prior to MySQL 5.7.6, this variable was settable only when gtid_mode=ON. In MySQL 5.7.6 and later, this variable is settable regardless of the value of gtid_mode.
Issuing RESET MASTER causes the value of this variable to be reset to an empty string.
Running reset master clears the global values of gtid_purged and gtid_executed, but not the session value.
RESET MASTER also clears the values of the gtid_purged system variable as well as the global value of the gtid_executed system variable (but not its session value); that is, executing this statement sets each of these values to an empty string (''). In MySQL 5.7.5 and later, this statement also clears the mysql.gtid_executed table (see mysql.gtid_executed Table).
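4. When replication has been interrupted and the master has already deleted the logs that the slave needs, this error is reported:
Got fatal error 1236 from master when reading data from binary log: 'The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires.'
In that case, you can use one of the following:
Method 1: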
slave1 > reset master;
slave1 > show global variables like 'GTID_EXECUTED';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| gtid_executed |       |
+---------------+-------+
slave1 > set global GTID_PURGED="9a511b7b-7059-11e2-9a24-08002762b8af:1-14";
slave1 > start slave io_thread;
slave1 > show slave status\G
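However, after doing this you need to check whether the data is consistent, because the slave may have diverged from the master.
Method 2:
Take a fresh dump from the master and import it.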
slave1 > reset master;
slave1 > source dump.sql;
slave1 > start slave;
slave1 > show slave status\G
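At this point the gtid_purged and gtid_executed values on the slave are no longer usable and cannot be modified directly, so reset master is needed to clear them.
References: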
https://www.percona.com/blog/2013/05/09/how-to-create-a-new-or-repair-a-broken-gtid-based-slave-with-percona-xtrabackup/
http://dev.mysql.com/doc/refman/5.7/en/reset-master.html
http://dev.mysql.com/doc/refman/5.7/en/replication-options-gtids.html#sysvar_gtid_purged