• Canal简介


    以下内容主要摘自Canal 官方wiki和网友博客:https://www.jianshu.com/p/6299048fad66

    一、背景

    早期,阿里巴巴B2B公司因为存在杭州和美国双机房部署,存在跨机房同步的业务需求。不过早期的数据库同步业务,主要是基于trigger的方式获取增量变更,不过从2010年开始,阿里系公司开始逐步的尝试基于数据库的日志解析,获取增量变更进行同步,由此衍生出了增量订阅&消费的业务,从此开启了一段新纪元。
    ps. 目前内部版本已经支持mysql和oracle部分版本的日志解析,当前的canal开源版本支持5.7及以下的版本(阿里内部mysql 5.7.13, 5.6.10, mysql 5.5.18和5.1.40/48)

    基于日志增量订阅&消费支持的业务:

    数据库镜像
    数据库实时备份
    多级索引 (卖家和买家各自分库索引)
    search build
    业务cache刷新
    价格变化等重要业务消息

    二、项目介绍

    名称:canal [kə'næl]
    译意: 水道/管道/沟渠
    语言: 纯java开发
    定位: 基于数据库增量日志解析,提供增量数据订阅&消费,目前主要支持了mysql

    三、工作原理

    1、mysql主备复制实现

     
    从上层来看,复制分成三步:
    master将改变记录到二进制日志(binary log)中(这些记录叫做二进制日志事件,binary log events,可以通过show binlog events进行查看);
    slave将master的binary log events拷贝到它的中继日志(relay log);
    slave重做中继日志中的事件,将改变反映它自己的数据。

    2、canal的工作原理:

     

    原理相对比较简单:
    canal模拟mysql slave的交互协议,伪装自己为mysql slave,向mysql master发送dump协议
    mysql master收到dump请求,开始推送binary log给slave(也就是canal)
    canal解析binary log对象(原始为byte流)

    四、canal安装

    1、下载并解压缩

    [root@master1 canal]#wget https://github.com/alibaba/canal/releases/download/v1.0.23/canal.deployer-1.0.23.tar.gz
    [root@master1 canal]#mkdir /root/canal
    [root@master1 canal]#tar zxvf canal.deployer-1.0.23.tar.gz  -C /root/canal
    

    2、创建mysql的canal用户

    mysql> CREATE USER 'canal'@'localhost' IDENTIFIED BY 'canal';
    Query OK, 0 rows affected (0.00 sec)
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'canal'@'localhost' WITH GRANT OPTION;
    Query OK, 0 rows affected (0.01 sec)
    mysql> CREATE USER 'canal'@'%' IDENTIFIED BY 'canal';
    Query OK, 0 rows affected (0.00 sec)
    mysql> GRANT ALL PRIVILEGES ON *.* TO 'canal'@'%' WITH GRANT OPTION;
    Query OK, 0 rows affected (0.00 sec)
    mysql> flush privileges;
    Query OK, 0 rows affected (0.00 sec)
    

    3、修改canal配置文件(如果是访问本机,并且用户密码都为canal则不需要修改配置文件)

    [root@master1 canal]# vi /root/canal/conf/example/instance.properties
    #################################################
    ## mysql serverId
    canal.instance.mysql.slaveId = 1234
    # position info
    canal.instance.master.address = 127.0.0.1:3306
    canal.instance.master.journal.name =
    canal.instance.master.position =
    canal.instance.master.timestamp =
    #canal.instance.standby.address =
    #canal.instance.standby.journal.name =
    #canal.instance.standby.position =
    #canal.instance.standby.timestamp =
    # username/password
    #修改这两项
    canal.instance.dbUsername = canal
    canal.instance.dbPassword = canal
    canal.instance.defaultDatabaseName =
    canal.instance.connectionCharset = UTF-8
    # table regex
    canal.instance.filter.regex = .*\..*
    # table black regex
    canal.instance.filter.black.regex =
    #################################################
    

    4、启动canal

    [root@master1 canal]# sh /root/canal/bin/startup.sh
    cd to /root/canal/bin for workaround relative path
    LOG CONFIGURATION : /root/canal/bin/../conf/logback.xml
    canal conf : /root/canal/bin/../conf/canal.properties
    CLASSPATH :/root/canal/bin/../conf:/root/canal/bin/../lib/zookeeper-3.4.5.jar:/root/canal/bin/../lib/zkclient-0.1.jar:/root/canal/bin/../lib/spring-2.5.6.jar:/root/canal/bin/../lib/slf4j-api-1.7.12.jar:/root/canal/bin/../lib/protobuf-java-2.6.1.jar:/root/canal/bin/../lib/oro-2.0.8.jar:/root/canal/bin/../lib/netty-3.2.5.Final.jar:/root/canal/bin/../lib/logback-core-1.1.3.jar:/root/canal/bin/../lib/logback-classic-1.1.3.jar:/root/canal/bin/../lib/log4j-1.2.14.jar:/root/canal/bin/../lib/jcl-over-slf4j-1.7.12.jar:/root/canal/bin/../lib/guava-18.0.jar:/root/canal/bin/../lib/fastjson-1.1.35.jar:/root/canal/bin/../lib/commons-logging-1.1.1.jar:/root/canal/bin/../lib/commons-lang-2.6.jar:/root/canal/bin/../lib/commons-io-2.4.jar:/root/canal/bin/../lib/commons-beanutils-1.8.2.jar:/root/canal/bin/../lib/canal.store-1.0.23.jar:/root/canal/bin/../lib/canal.sink-1.0.23.jar:/root/canal/bin/../lib/canal.server-1.0.23.jar:/root/canal/bin/../lib/canal.protocol-1.0.23.jar:/root/canal/bin/../lib/canal.parse.driver-1.0.23.jar:/root/canal/bin/../lib/canal.parse.dbsync-1.0.23.jar:/root/canal/bin/../lib/canal.parse-1.0.23.jar:/root/canal/bin/../lib/canal.meta-1.0.23.jar:/root/canal/bin/../lib/canal.instance.spring-1.0.23.jar:/root/canal/bin/../lib/canal.instance.manager-1.0.23.jar:/root/canal/bin/../lib/canal.instance.core-1.0.23.jar:/root/canal/bin/../lib/canal.filter-1.0.23.jar:/root/canal/bin/../lib/canal.deployer-1.0.23.jar:/root/canal/bin/../lib/canal.common-1.0.23.jar:/root/canal/bin/../lib/aviator-2.2.1.jar:.:/usr/java/jdk1.8.0_121/lib
    cd to /root/canal for continue
    

    5、关闭canal

    [root@master1 canal]# sh /root/canal/bin/stop.sh
    master1: stopping canal 16062 ... 
    Oook! cost:1
    

    6、相关日志位置

    [root@master1 canal]# cat /root/canal/logs/canal/canal.log
    [root@master1 canal]# cat /root/canal/logs/example/example.log
    

    五、编写代码读取binlog

    1、创建maven项目

     
    Paste_Image.png

    2、pom.xml增加依赖

            <dependency>
                <groupId>com.alibaba.otter</groupId>
                <artifactId>canal.client</artifactId>
                <version>1.0.12</version>
            </dependency>
    

    3、编写ClientSample类订阅binlog

    >package com.chainfin.canal;
    >
    >import java.net.InetSocketAddress;
    >import java.util.List;
    >
    >import com.alibaba.otter.canal.client.CanalConnector;
    import com.alibaba.otter.canal.client.CanalConnectors;
    import com.alibaba.otter.canal.protocol.CanalEntry.Column;
    import com.alibaba.otter.canal.protocol.CanalEntry.Entry;
    import com.alibaba.otter.canal.protocol.CanalEntry.EntryType;
    import com.alibaba.otter.canal.protocol.CanalEntry.EventType;
    import com.alibaba.otter.canal.protocol.CanalEntry.RowChange;
    import com.alibaba.otter.canal.protocol.CanalEntry.RowData;
    import com.alibaba.otter.canal.protocol.Message;
    >
    /**
     * 
     * @author jinxiaoxin
     *
     */
    public class ClientSample {
        public static void main(String[] args) {
            // 创建链接
            CanalConnector connector = CanalConnectors.newSingleConnector(
                    new InetSocketAddress("10.105.10.121", 11111), "example", "",
                    "");// AddressUtils.getHostIp(),
            int batchSize = 1000;
            int emptyCount = 0;
            try {
                connector.connect();
                connector.subscribe("test\..*");// .*代表database,..*代表table
                connector.rollback();//
                int totalEmptyCount = 120;
                while (emptyCount < totalEmptyCount) {
                    Message message = connector.getWithoutAck(batchSize); // 获取指定数量的数据
                    long batchId = message.getId();
                    int size = message.getEntries().size();
                    if (batchId == -1 || size == 0) {
                        emptyCount++;
                        // System.out.println("empty count : " + emptyCount);
                        try {
                            Thread.sleep(1000);
                        } catch (InterruptedException e) {
                        }
                    } else {
                        emptyCount = 0;
                        // System.out.printf("message[batchId=%s,size=%s] 
    ",
                        // batchId, size);
                        printEntry(message.getEntries());
                    }
                    connector.ack(batchId); // 提交确认
                    // connector.rollback(batchId); // 处理失败, 回滚数据
                }
                System.out.println("empty too many times, exit");
            } finally {
                connector.disconnect();
            }
        }
    >
        private static void printEntry(List<Entry> entrys) {
            for (Entry entry : entrys) {
                if (entry.getEntryType() == EntryType.TRANSACTIONBEGIN
                        || entry.getEntryType() == EntryType.TRANSACTIONEND) {
                    continue;
                }
                RowChange rowChage = null;
                try {
                    rowChage = RowChange.parseFrom(entry.getStoreValue());
                } catch (Exception e) {
                    throw new RuntimeException(
                            "ERROR ## parser of eromanga-event has an error,data:"
                                    + entry.toString(), e);
                }
                EventType eventType = rowChage.getEventType();
                System.out
                        .println(String
                                .format("================> binlog[%s:%s] ,name[%s,%s] , eventType : %s",
                                        entry.getHeader().getLogfileName(), entry
                                                .getHeader().getLogfileOffset(),
                                        entry.getHeader().getSchemaName(), entry
                                                .getHeader().getTableName(),
                                        eventType));
                for (RowData rowData : rowChage.getRowDatasList()) {
                    if (eventType == EventType.DELETE) {
                        printColumn(rowData.getBeforeColumnsList());
                    } else if (eventType == EventType.INSERT) {
                        printColumn(rowData.getAfterColumnsList());
                    } else {
                        System.out.println("-------> before");
                        printColumn(rowData.getBeforeColumnsList());
                        System.out.println("-------> after");
                        printColumn(rowData.getAfterColumnsList());
                    }
                }
            }
        }
    >
        private static void printColumn(List<Column> columns) {
            for (Column column : columns) {
                System.out.println(column.getName() + " : " + column.getValue()
                        + "    update=" + column.getUpdated());
            }
        }
    }
    >
    

    4、MYSQL进行增删改查

    mysql> CREATE TABLE `xdual` (
        ->        `ID` int(11) NOT NULL AUTO_INCREMENT,
        ->        `X` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
        ->     PRIMARY KEY (`ID`)
        ->      ) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 ;
    Query OK, 0 rows affected (0.01 sec)
    mysql> insert into xdual(id,x) values(1,now());
    Query OK, 1 row affected (0.01 sec)
    

    5、eclipse控制台可以看到如下输出

     
    Paste_Image.png

    六、canal集群搭建

    1、需要安装zookeeper

    自行了解安装及其使用,此处略过

    2、修改canal配置文件

    修改两个配置

    [root@master1 canal]# vi /root/canal/conf/canal.properties
    # 用逗号隔开且不留空格 
    canal.zkServers=10.105.10.123:2181,10.105.10.124:2181,10.105.10.125:2181
    canal.instance.global.spring.xml = classpath:spring/default-instance.xml
    

    3、拷贝canal到另一台机器并解压缩(作为从节点)

    [root@master1 canal]# scp canal.deployer-1.0.23.tar.gz root@10.105.10.122:/root
    canal.deployer-1.0.23.tar.gz          100% 9586KB   9.4MB/s   00:00   
    [root@master2 ~]# mkdir canal
    [root@master2 ~]# tar -zxvf canal.deployer-1.0.23.tar.gz -C /root/canal                           
    

    4、修改slave配置文件

    [root@master2 canal]# vi /root/canal/conf/example/instance.properties
    ## mysql serverId修改与master不同即可
    canal.instance.mysql.slaveId = 1235
    # position info修改成mysql地址即可
    canal.instance.master.address = 10.105.10.121:3306
    [root@master2 canal]# vi /root/canal/conf/canal.properties
    # 用逗号隔开且不留空格 
    canal.zkServers=10.105.10.123:2181,10.105.10.124:2181,10.105.10.125:2181
    canal.instance.global.spring.xml = classpath:spring/default-instance.xml
    

    5、两台机器分别启动canal

    [root@master2 canal]#  sh /root/canal/bin/startup.sh
    

    6、查看canal在zk中的状态

    [zk: localhost:2181(CONNECTED) 11] get /otter/canal/destinations/example/running
    {"active":true,"address":"10.105.10.121:11111","cid":1}
    cZxid = 0x250000f037
    ctime = Thu Apr 13 13:42:04 CST 2017
    mZxid = 0x250000f037
    mtime = Thu Apr 13 13:42:04 CST 2017
    pZxid = 0x250000f037
    cversion = 0
    dataVersion = 0
    aclVersion = 0
    ephemeralOwner = 0x15b5b11e9cd16e1
    dataLength = 55
    numChildren = 0
    

    10.105.10.121为active,所以10.105.10.122位stangby节点

    7、测试HA是否配置成功

    stop掉10.105.10.121这台机器

    [root@master1 canal]# sh /root/canal/bin/stop.sh
    master1: stopping canal 6107 ... 
    Oook! cost:1
    
    [zk: localhost:2181(CONNECTED) 28] get /otter/canal/destinations/example/running
    {"active":true,"address":"10.105.10.122:11111","cid":1}
    cZxid = 0x250000f0a7
    ctime = Thu Apr 13 13:46:10 CST 2017
    mZxid = 0x250000f0a7
    mtime = Thu Apr 13 13:46:10 CST 2017
    pZxid = 0x250000f0a7
    cversion = 0
    dataVersion = 0
    aclVersion = 0
    ephemeralOwner = 0x35b5b11e9d816cb
    dataLength = 55
    

    很明显active已经切换到10.105.10.122上了

    8、客户端代码

    //创建connector 为集群connector 
    CanalConnector connector = CanalConnectors.newClusterConnector("10.105.10.123:2181,10.105.10.124:2181,10.105.10.125:2181","example", "", "");
    

    启动代码


     
    Paste_Image.png

    9、错误解决

    当这个配置的ip不配成127.0.0.1时(也就是本地)
    canal.instance.master.address = 10.105.10.121:3306
    查看canal配置文件,会报如下错误导致,无法读取binlog

    [root@master1 example]# tail -f example.log
    2017-04-13 14:29:23.646 [destination = example , address = master1/10.105.10.121:3306 , EventParser] ERROR c.a.o.canal.parse.inbound.mysql.dbsync.DirectLogFetcher - I/O error while reading from client socket
    java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file
        at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:95) ~[canal.parse-1.0.23.jar:na]
        at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:121) [canal.parse-1.0.23.jar:na]
        at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:209) [canal.parse-1.0.23.jar:na]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
    

    解决办法:
    登录mysql

    mysql> show variables like "%sum%";
    +---------------------------+-------+
    | Variable_name             | Value |
    +---------------------------+-------+
    | binlog_checksum           | CRC32 |
    | innodb_checksum_algorithm | crc32 |
    | innodb_checksums          | ON    |
    | innodb_log_checksums      | ON    |
    | master_verify_checksum    | OFF   |
    | slave_sql_verify_checksum | ON    |
    +---------------------------+-------+
    6 rows in set (0.00 sec)
    

    查询资料发现mysql版本为5.6时,
    这个错误一般出现在master5.6,slave在低版本的情况下。这是由于5.6使用了crc32做binlog的checksum;
    当一个event被写入binary log(二进制日志)的时候,checksum也同时写入binary log,然后在event通过网络传输到从服务器(slave)之后,再在从服务器中对其进行验证并写入从服务器的relay log.
    由于每一步都记录了event和checksum,所以我们可以很快地找出问题所在。
    在master1中设置binlog_checksum =none;

    mysql> set global binlog_checksum='NONE';
    Query OK, 0 rows affected (0.00 sec)
    mysql>  show variables like "%sum%";
    +---------------------------+-------+
    | Variable_name             | Value |
    +---------------------------+-------+
    | binlog_checksum           | NONE  |
    | innodb_checksum_algorithm | crc32 |
    | innodb_checksums          | ON    |
    | innodb_log_checksums      | ON    |
    | master_verify_checksum    | OFF   |
    | slave_sql_verify_checksum | ON    |
    +---------------------------+-------+
    6 rows in set (0.01 sec)
    

    这样错误就可以解决了

    七、properties配置文件参数解释

    properties配置分为两部分:

    canal.properties  (系统根配置文件)
    instance.properties  (instance级别的配置文件,每个instance一份)
    

    1、canal.properties

    canal.destinations  #当前server上部署的instance列表 
    canal.conf.dir  #conf/目录所在的路径   
    canal.auto.scan #开启instance自动扫描
    #如果配置为true,canal.conf.dir目录下的instance配置变化会自动触发:
    #a. instance目录新增: 触发instance配置载入,lazy为true时则自动启动
    #b. instance目录删除:卸载对应instance配置,如已启动则进行关闭
    #c. instance.properties文件变化:reload instance配置,如已启动自动进行重启操作
    canal.auto.scan.interval    #instance自动扫描的间隔时间,单位秒  
    canal.instance.global.mode  #全局配置加载方式   
    canal.instance.global.lazy  #全局lazy模式   
    canal.instance.global.manager.address   #全局的manager配置方式的链接信息    无
    canal.instance.global.spring.xml    #全局的spring配置方式的组件文件 
    canal.instance.example.mode
    canal.instance.example.lazy
    canal.instance.example.spring.xml
    #instance级别的配置定义,如有配置,会自动覆盖全局配置定义模式
    #命名规则:canal.instance.{name}.xxx 无
    

    2、instance.properties

    canal.id    #每个canal server实例的唯一标识,暂无实际意义   
    canal.ip    #canal server绑定的本地IP信息,如果不配置,默认选择一个本机IP进行启动服务   
    canal.port  #canal server提供socket服务的端口  
    canal.zkServers #canal server链接zookeeper集群的链接信息
    #例子:10.20.144.22:2181,10.20.144.51:2181 
    canal.zookeeper.flush.period    #canal持久化数据到zookeeper上的更新频率,单位毫秒    
    canal.instance.memory.batch.mode    #canal内存store中数据缓存模式
    #1. ITEMSIZE : 根据buffer.size进行限制,只限制记录的数量
    #2. MEMSIZE : 根据buffer.size  * buffer.memunit的大小,限制缓存记录的大小  
    canal.instance.memory.buffer.size   #canal内存store中可缓存buffer记录数,需要为2的指数  
    canal.instance.memory.buffer.memunit    #内存记录的单位大小,默认1KB,和buffer.size组合决定最终的内存使用大小  
    canal.instance.transactionn.size    最大事务完整解析的长度支持
    超过该长度后,一个事务可能会被拆分成多次提交到canal store中,无法保证事务的完整可见性
    canal.instance.fallbackIntervalInSeconds    #canal发生mysql切换时,在新的mysql库上查找binlog时需要往前查找的时间,单位秒
    #说明:mysql主备库可能存在解析延迟或者时钟不统一,需要回退一段时间,保证数据不丢
    canal.instance.detecting.enable #是否开启心跳检查
    canal.instance.detecting.sql    #心跳检查sql    insert into retl.xdual values(1,now()) on duplicate key update x=now()
    canal.instance.detecting.interval.time  #心跳检查频率,单位秒
    canal.instance.detecting.retry.threshold    #心跳检查失败重试次数
    canal.instance.detecting.heartbeatHaEnable  #心跳检查失败后,是否开启自动mysql自动切换
    #说明:比如心跳检查失败超过阀值后,如果该配置为true,canal就会自动链到mysql备库获取binlog数据
    canal.instance.network.receiveBufferSize    #网络链接参数,SocketOptions.SO_RCVBUF 
    canal.instance.network.sendBufferSize   #网络链接参数,SocketOptions.SO_SNDBUF 
    canal.instance.network.soTimeout    #网络链接参数,SocketOptions.SO_TIMEOUT


    作者:献给记性不好的自己
    链接:https://www.jianshu.com/p/6299048fad66
    來源:简书
    著作权归作者所有。商业转载请联系作者获得授权,非商业转载请注明出处。
  • 相关阅读:
    在TreeView控件节点中显示图片
    PAT 甲级 1146 Topological Order (25 分)
    PAT 甲级 1146 Topological Order (25 分)
    PAT 甲级 1145 Hashing
    PAT 甲级 1145 Hashing
    PAT 甲级 1144 The Missing Number (20 分)
    PAT 甲级 1144 The Missing Number (20 分)
    PAT 甲级 1151 LCA in a Binary Tree (30 分)
    PAT 甲级 1151 LCA in a Binary Tree (30 分)
    PAT 甲级 1149 Dangerous Goods Packaging
  • 原文地址:https://www.cnblogs.com/maxigang/p/9134523.html
Copyright © 2020-2023  润新知