2019/8/8
1.过程解析
1. 检查校验环境
2. 创建binlog streamer连接到主库或者从库,添加binlog的监听
3. 创建log表 _xxx_ghc 和ghost表 _xxx_gho并修改ghost表结构到最新
4. row-copy & apply-binlog
5. cut-over 阶段
6. 收尾
2. 日志
3. 其他文档
1. 检查校验环境
2. 创建binlog streamer连接到主库或者从库,添加binlog的监听
3. 创建log表 _xxx_ghc 和ghost表 _xxx_gho并修改ghost表结构到最新
4. row-copy & apply-binlog
5. cut-over 阶段
6. 收尾
2. 日志
3. 其他文档
1.过程解析
1. 检查校验环境
- 测试db是否可连通,并且验证database是否存在
- 确认连接实例是否正确
- 权限验证 show grants for current_user()
- binlog验证,包括row格式验证和修改binlog格式后的重启replicate
- 原表存储引擎,外键,触发器检查,行数预估等
2. 创建binlog streamer连接到主库或者从库,添加binlog的监听
3. 创建log表 _xxx_ghc 和ghost表 _xxx_gho并修改ghost表结构到最新
- create table _xxx_ghc # 日志表
- create table _xxx_gho # 临时表
- alter table _xxx_gho .. # 表变更
4. row-copy & apply-binlog
- 计算最大最小值
select `id` from darren`.`t4` order by id` asc limit 1;
select `id` from darren`.`t4` order by id` desc limit 1;
- 计算trunk
# 第一个trunk 到倒数第二个
select `id` from `darren`.`t4`
where `id` >= _binary'1' and `id` <= _binary'58594' order by `id` asc limit 1 offset 999
#最后一个chunk如果不足1000,那么上面sql查询为空,这时运行:
select `id` from (
select `id` from `darren`.`t4`
where `id` > _binary'58000' and `id` <= _binary'58594' order by `id` asc limit 1000
) select_osc_chunk order by `id` desc limit 1;
- 循环插入数据:
insert ignore into `darren`.`_t4_gho` (`id`, `name`, `c1`)
(select `id`, `name`, `c1` from `darren`.`t4` force index (`PRIMARY`)
where `id` >= _binary'1' and `id` <= _binary'1000' lock in share mode )
- row-copy & apply-binlog 映射操作
操作 | 原表 | 新表 |
row-copy | select | insert ignore into |
appy-binlog | insert | replace into |
update | update | |
delete | delete |
- row-copy & apply binlog 操作顺序数据冲突分析
数据迁移过程,涉及三个操作:- A:对原表进行rowcopy;
- B:应用程序的DML;
- C:应用binlog到新表,因为DML操作才会记录binglog,C一定会在B后
- binlog同步数据何时结束?
copy完数据向_xxx_ghc写入status:
AllEventsUpToLockProcessed:1533533052229905040,当binlogsyncer过滤到该值表示所有event都已应用
5. cut-over 阶段
S1: C10:
create table _t1_del like t1; # 创建magic表_xxx_del,目的为了防止过快的进行rename操作和意外情况rename
lock table t1 write,_t1_del write ; # 对源表和magic表_xxx_del加write锁
S2:C11-C19
mysql> select * from t1; # C11...C19: 新的请求进来,关于原表的请求被blocked
S3:C20
rename table `t1` to `_t1_del` ,`_t1_gho` to `t1`; #被阻塞,RENAME timeout as 3 seconds
S4: show processlist; # 检查是否有blocked的rename请求
S5: C10:
drop table _t1_del; #删除magic表
unlock tables;
S6:C20
mysql> rename table `t1` to `_t1_del` ,`_t1_gho` to `t1`;
Query OK, 0 rows affected (4 min 4.36 sec)
mysql> select * from t1;
Empty set (5 min 35.78 sec)
不同阶段失败后如何处理:
- S1 失败,退出程序,比如建表成功,加锁失败,退出程序,未加锁
- S3前,rename请求来的时候,会话C10死掉,lock会自动释放,同时因为_xxx_del的存在rename也会失败,所有请求恢复正常
- S3后, rename被blocked的时候,会话C10死掉,lock会自动释放,同样因为_xxx_del的存在,rename会失败,所有请求恢复正常
- C20死掉,gh-ost会捕获不到rename,会话C10继续运行,释放lock,所有请求恢复正常
6. 收尾
- 关闭binlogsyncer连接
- 删除源表和_t4_ghc表
2. 日志
[root@iZbp1aihmppor0oe9epotuZ ~]# gh-ost
> --ok-to-drop-table
> --initially-drop-ghost-table
> --initially-drop-socket-file
> --host="rm-host1.mysql.rds.aliyuncs.com"
> --port=3306
> --user=""
> --password=''
> --database="dba_test"
> --table="t1"
> --verbose
> --alter="add column cc1 varchar(256) default '';"
> --allow-on-master
> --assume-rbr='true'
> --assume-master-host='rm-host1.mysql.rds.aliyuncs.com:3306'
> --aliyun-rds --execute
2019-06-27 09:32:07 INFO starting gh-ost 1.0.48
2019-06-27 09:32:07 INFO Migrating `dba_test`.`t1`
2019-06-27 09:32:07 INFO connection validated on rm-host1.mysql.rds.aliyuncs.com:3306
2019-06-27 09:32:07 INFO User has REPLICATION CLIENT, REPLICATION SLAVE privileges, and has ALL privileges on `dba_test`.*
2019-06-27 09:32:07 INFO binary logs validated on rm-host1.mysql.rds.aliyuncs.com:3306
2019-06-27 09:32:07 INFO Inspector initiated on rm-host1.mysql.rds.aliyuncs.com:3306, version 5.7.25-log
2019-06-27 09:32:07 INFO Table found. Engine=InnoDB
2019-06-27 09:32:07 INFO Estimated number of rows via EXPLAIN: 1
2019-06-27 09:32:07 INFO Master forced to be rm-host1.mysql.rds.aliyuncs.com:3306
2019-06-27 09:32:07 INFO log_slave_updates validated on rm-host1.mysql.rds.aliyuncs.com:3306
2019-06-27 09:32:07 INFO connection validated on rm-host1.mysql.rds.aliyuncs.com:3306
2019-06-27 09:32:07 INFO Connecting binlog streamer at mysql-bin.001413:27777608
[2019/06/27 09:32:07] [info] binlogsyncer.go:133 create BinlogSyncer with config {99999 mysql rm-host1.mysql.rds.aliyuncs.com 3306 oper_super false false <nil> false UTC true 0 0s 0s 0 false}
[2019/06/27 09:32:07] [info] binlogsyncer.go:354 begin to sync binlog from position (mysql-bin.001413, 27777608)
[2019/06/27 09:32:07] [info] binlogsyncer.go:203 register slave for master server rm-host1.mysql.rds.aliyuncs.com:3306
[2019/06/27 09:32:07] [info] binlogsyncer.go:723 rotate to (mysql-bin.001413, 27777608)
2019-06-27 09:32:07 INFO rotate to next log from mysql-bin.001413:0 to mysql-bin.001413
2019-06-27 09:32:07 INFO connection validated on rm-host1.mysql.rds.aliyuncs.com:3306
2019-06-27 09:32:07 INFO connection validated on rm-host1.mysql.rds.aliyuncs.com:3306
2019-06-27 09:32:07 INFO will use time_zone='SYSTEM' on applier
2019-06-27 09:32:07 INFO Examining table structure on applier
2019-06-27 09:32:07 INFO Applier initiated on rm-host1.mysql.rds.aliyuncs.com:3306, version 5.7.25-log
2019-06-27 09:32:07 INFO Dropping table `dba_test`.`_t1_gho`
2019-06-27 09:32:07 INFO Table dropped
2019-06-27 09:32:07 INFO Dropping table `dba_test`.`_t1_ghc`
2019-06-27 09:32:07 INFO Table dropped
2019-06-27 09:32:07 INFO Creating changelog table `dba_test`.`_t1_ghc`
2019-06-27 09:32:07 INFO Changelog table created
2019-06-27 09:32:07 INFO Creating ghost table `dba_test`.`_t1_gho`
2019-06-27 09:32:07 INFO Ghost table created
2019-06-27 09:32:07 INFO Altering ghost table `dba_test`.`_t1_gho`
2019-06-27 09:32:07 INFO Ghost table altered
2019-06-27 09:32:07 INFO Intercepted changelog state GhostTableMigrated
2019-06-27 09:32:07 INFO Waiting for ghost table to be migrated. Current lag is 0s
2019-06-27 09:32:07 INFO Handled changelog state GhostTableMigrated
2019-06-27 09:32:07 INFO Chosen shared unique key is PRIMARY
2019-06-27 09:32:07 INFO Shared columns are id,uid,name,version
2019-06-27 09:32:07 INFO Listening on unix socket file: /tmp/gh-ost.dba_test.t1.sock
2019-06-27 09:32:07 INFO Migration min values: [1]
2019-06-27 09:32:07 INFO Migration max values: [1]
2019-06-27 09:32:07 INFO Waiting for first throttle metrics to be collected
2019-06-27 09:32:07 INFO First throttle metrics collected
# Migrating `dba_test`.`t1`; Ghost table is `dba_test`.`_t1_gho`
# Migrating rm-host1.mysql.rds.aliyuncs.com:3306; inspecting rm-host1.mysql.rds.aliyuncs.com:3306; executing on iZbp1aihmppor0oe9epotuZ
# Migration started at Thu Jun 27 09:32:07 +0800 2019
# chunk-size: 1000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.dba_test.t1.sock
Copy: 0/1 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.001413:27779913; State: migrating; ETA: N/A
2019-06-27 09:32:07 INFO Row copy complete
# Migrating `dba_test`.`t1`; Ghost table is `dba_test`.`_t1_gho`
# Migrating rm-host1.mysql.rds.aliyuncs.com:3306; inspecting rm-host1.mysql.rds.aliyuncs.com:3306; executing on iZbp1aihmppor0oe9epotuZ
# Migration started at Thu Jun 27 09:32:07 +0800 2019
# chunk-size: 1000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.dba_test.t1.sock
Copy: 1/1 100.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.001413:27780654; State: migrating; ETA: due
2019-06-27 09:32:07 INFO Grabbing voluntary lock: gh-ost.4639852.lock
2019-06-27 09:32:07 INFO Setting LOCK timeout as 6 seconds
2019-06-27 09:32:07 INFO Looking for magic cut-over table
2019-06-27 09:32:07 INFO Creating magic cut-over table `dba_test`.`_t1_del`
2019-06-27 09:32:07 INFO Magic cut-over table created
2019-06-27 09:32:07 INFO Locking `dba_test`.`t1`, `dba_test`.`_t1_del`
2019-06-27 09:32:07 INFO Tables locked
2019-06-27 09:32:07 INFO Session locking original & magic tables is 4639852
2019-06-27 09:32:07 INFO Writing changelog state: AllEventsUpToLockProcessed:1561599127457178334
2019-06-27 09:32:07 INFO Intercepted changelog state AllEventsUpToLockProcessed
2019-06-27 09:32:07 INFO Handled changelog state AllEventsUpToLockProcessed
2019-06-27 09:32:07 INFO Waiting for events up to lock
Copy: 1/1 100.0%; Applied: 0; Backlog: 1/1000; Time: 1s(total), 0s(copy); streamer: mysql-bin.001413:27786099; State: migrating; ETA: due
2019-06-27 09:32:08 INFO Waiting for events up to lock: got AllEventsUpToLockProcessed:1561599127457178334
2019-06-27 09:32:08 INFO Done waiting for events up to lock; duration=987.966735ms
# Migrating `dba_test`.`t1`; Ghost table is `dba_test`.`_t1_gho`
# Migrating rm-host1.mysql.rds.aliyuncs.com:3306; inspecting rm-host1.mysql.rds.aliyuncs.com:3306; executing on iZbp1aihmppor0oe9epotuZ
# Migration started at Thu Jun 27 09:32:07 +0800 2019
# chunk-size: 1000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: ; critical-load: ; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.dba_test.t1.sock
Copy: 1/1 100.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 0s(copy); streamer: mysql-bin.001413:27786557; State: migrating; ETA: due
2019-06-27 09:32:08 INFO Setting RENAME timeout as 3 seconds
2019-06-27 09:32:08 INFO Session renaming tables is 4639848
2019-06-27 09:32:08 INFO Issuing and expecting this to block: rename /* gh-ost */ table `dba_test`.`t1` to `dba_test`.`_t1_del`, `dba_test`.`_t1_gho` to `dba_test`.`t1`
2019-06-27 09:32:08 INFO Found atomic RENAME to be blocking, as expected. Double checking the lock is still in place (though I don't strictly have to)
2019-06-27 09:32:08 INFO Checking session lock: gh-ost.4639852.lock
2019-06-27 09:32:08 INFO Connection holding lock on original table still exists
2019-06-27 09:32:08 INFO Will now proceed to drop magic table and unlock tables
2019-06-27 09:32:08 INFO Dropping magic cut-over table
2019-06-27 09:32:08 INFO Releasing lock from `dba_test`.`t1`, `dba_test`.`_t1_del`
2019-06-27 09:32:08 INFO Tables unlocked
2019-06-27 09:32:08 INFO Tables renamed
2019-06-27 09:32:08 INFO Lock & rename duration: 1.010443741s. During this time, queries on `t1` were blocked
2019-06-27 09:32:08 INFO Looking for magic cut-over table
[2019/06/27 09:32:08] [info] binlogsyncer.go:164 syncer is closing...
[2019/06/27 09:32:08] [error] binlogstreamer.go:77 close sync with err: sync is been closing...
2019-06-27 09:32:08 INFO Closed streamer connection. err=<nil>
2019-06-27 09:32:08 INFO Dropping table `dba_test`.`_t1_ghc`
[2019/06/27 09:32:08] [info] binlogsyncer.go:179 syncer is closed
2019-06-27 09:32:08 INFO Table dropped
2019-06-27 09:32:08 INFO Dropping table `dba_test`.`_t1_del`
2019-06-27 09:32:08 INFO Table dropped
2019-06-27 09:32:08 INFO Done migrating `dba_test`.`t1`
2019-06-27 09:32:08 INFO Removing socket file: /tmp/gh-ost.dba_test.t1.sock
2019-06-27 09:32:08 INFO Tearing down inspector
2019-06-27 09:32:08 INFO Tearing down applier
2019-06-27 09:32:08 INFO Tearing down streamer
2019-06-27 09:32:08 INFO Tearing down throttler
# Done