一、实验背景
canal有一个参数canal.instance.standby.address可以指定源端数据库的从库为备选数据库,当源端master宕机后,canal仍能指向备库进行同步。
但是发现canal 1.1.4不兼容mariadb的gtid:
当canal instance里指定了gtid(示例:canal.instance.master.gtid=0-1-146),重启canal instance会报错,不识别这种gtid格式:
ERROR com.alibaba.otter.canal.common.alarm.LogAlarmHandler - destination:ff-test[java.lang.RuntimeException: parseUUIDSet failed due to wrong format: 0-1-146
- 当canal instance里指定了binlog名称及position,canal instance不报错,但是识别不到gtid:
2021-04-22 17:17:42.168 [destination = ff-test , address = /192.168.144.246:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=slave-bin.000009,position=4585,serverId=1,gtid=,timestamp=1619016171000] cost : 238ms , the next step is binlog dump
- 当主库宕机,发生主从切换后,canal由于识别不到gtid,因此按时间戳找位置:
2021-04-22 17:42:37.002 [destination = ff-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position
2021-04-22 17:42:37.094 [destination = ff-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - prepare to find start position by switch ::1619083861000
2021-04-22 17:42:41.910 [destination = ff-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=master-bin.000013,position=7080,serverId=1,gtid=,timestamp=1619021230000] cost : 4400ms , the next step is binlog dump
按时间戳找的位置是不准确的,经过多次验证,发现找的位置靠前,因此会重复执行一些SQL,造成数据混乱。
Canal 1.1.5版本修复了mariadb gtid问题,现记录下实验过程。
二、实验步骤
2.1 配置canal server
## detecing config
canal.instance.detecting.enable = true
#canal.instance.detecting.sql = insert into retl.xdual values(1,now()) on duplicate key update x=now()
canal.instance.detecting.sql = select 1
canal.instance.detecting.interval.time = 30
canal.instance.detecting.retry.threshold = 4
canal.instance.detecting.heartbeatHaEnable = true
canal.instance.tsdb.enable = false
2.2 配置canal instance
canal.instance.mysql.slaveId=5 #不要和集群里现有的server_id重复
canal.instance.gtidon=true
canal.instance.master.address=192.168.144.245:3306 #源端ip:端口
canal.instance.tsdb.enable=false
canal.instance.standby.address =192.168.144.246:3306 #备库ip:端口
canal.instance.master.gtid=0-1-165 #可在主库上执行SELECT @@gtid_binlog_pos;查询
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal
canal.instance.filter.regex=hh.* #同步的库表
启动canal instance后,可看到识别到gtid了:
2021-04-22 17:57:46.687 [destination = hh-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position
2021-04-22 17:57:46.724 [destination = hh-test , address = /192.168.144.245:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=,position=<null>,serverId=<null>,gtid=0-1-165,timestamp=<null>] cost : 1ms , the next step is binlog dump
2.3 验证mariadb主从切换
关掉主库,canal instance日志显示主从切换过程:
2021-04-22 18:03:41.114 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - try to ha switch, old:192.168.144.245/192.168.144.245:3306, new:/192.168.144.246:3306
2021-04-22 18:03:41.114 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] ERROR com.alibaba.otter.canal.common.alarm.LogAlarmHandler - destination:hh-test[try to ha switch, old:192.168.144.245/192.168.144.245:3306, new:/192.168.144.246:3306]
2021-04-22 18:03:41.126 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table filter : ^hh.*$
2021-04-22 18:03:41.126 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] WARN c.a.o.canal.parse.inbound.mysql.dbsync.LogEventConvert - --> init table black filter :
2021-04-22 18:03:41.127 [destination = hh-test , address = /192.168.144.245:3306 , HeartBeatTimeTask] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - connect failed by
java.io.IOException: connect /192.168.144.245:3306 failure
……
2021-04-22 18:03:41.183 [destination = hh-test , address = /192.168.144.246:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> begin to find start position, it will be long time for reset or first position
2021-04-22 18:03:41.184 [destination = hh-test , address = /192.168.144.246:3306 , EventParser] WARN c.a.o.c.p.inbound.mysql.rds.RdsBinlogEventParserProxy - ---> find start position successfully, EntryPosition[included=false,journalName=master-bin.000015,position=4288,serverId=1,gtid=0-1-171,timestamp=1619085735000] cost : 0ms , the next step is binlog dump
经测试,找的位置是准确的。
但是canal 1.1.5目前发现有一点不足的地方,canal server重启后,canal client不会自动连canal server,canal client停止往目标库同步,一直在报错:
AdapterProcessor - com.alibaba.otter.canal.protocol.exception.CanalClientException: java.io.IOException: Broken pipe Error sync but ACK!
需要重启canal client才行。
不知是我哪里没配对,还是canal 1.1.5目前还不完善。
————————————————
版权声明:本文为CSDN博主「雅冰石」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
原文链接:https://blog.csdn.net/yabingshi_tech/article/details/116002271