• Canal实战


    疑惑点

    本教程使用版本

    • Canal版本:1.1.5
    • Canal-Adapter版本:1.1.5
    • Eleasticsearch版本:7.10.2
    • MySQL版本:5.7.32

    背景


    MySQL表数据达到百万级后,复杂的查询导致响应时间很慢,计划用ES充当二级缓存。

    整体架构如下:

    整体架构

    canal可以通过监听MySQL的binlog日志来实现数据库增量的日志解析,再通过ClientAdapter把数据实时同步给ES。

    Canal工作流程图:

    Canal工作流程图

    Canal工作原理:

    • canal模拟MySQL slave的交互协议,伪装自己为MySQL slave,向MySQL master发送dump协议;
    • MySQL master收到dump请求,开始推送binary log给slave(即canal);
    • canal解析binary log对象(原始为byte流)。

    操作流程


    1. 准备工作:
      1. MySQL:存放原始数据;
      2. canal:解析数据库日志,同步获取到的增量变更;
      3. canal-adater:解析转化数据到ES;
      4. ES:接收增量数据;
    2. 在ES中创建索引
      1. 要求Mapping中定义的字段名称和类型与待同步数据保持一致;
    3. 安装JDK:Canal基于Java开发,且版本≥1.8.0
    4. 安装并启动canal-server
      1. 安装canal-server,然后修改配置文件关联MySQL。canal-server模拟MySQL集群的一个slave,获取MySQL集群Master节点的二进制日志(binary log),并将日志推送到Canal-adapter。
    5. 安装并启动canal-adapter
      1. 安装canal-adapter,然后修改配置文件关联MySQL和ES,以及定义MySQL数据到ES数据的映射字段,用来将数据同步到ES。
    6. 验证增量数据同步

    配置MySQL

    查看MySQL是否开启了binlog:

    SHOW VARIABLES LIKE 'log_bin';
    

    需要先开启Binlog写入功能,配置binlog-format为ROW模式,my.cnf配置如下:

    [mysqld]
    log-bin=mysql-bin # 开启 binlog
    binlog-format=ROW # 选择 ROW 模式
    server_id=1 # 配置 MySQL replaction 需要定义,不要和 canal 的 slaveId 重复
    

    授权 canal 链接 MySQL 账号具有作为 MySQL slave 的权限, 如果已有账户可直接 grant

    CREATE USER canal IDENTIFIED BY 'canal';  
    GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
    -- GRANT ALL PRIVILEGES ON *.* TO 'canal'@'%' ;
    FLUSH PRIVILEGES;
    

    Canal安装配置

    1. 下载Canal

    https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.deployer-1.1.5.tar.gz

    canal.deployer对应的是canal的server端,负责订阅并解析MySQL-Binlog。

    解压后的文件目录

    canal1.1.5
    ├── bin
    │   │── restart.sh    # 重启
    │   │── startup.sh    # 启动canal
    │   │── stop.sh       # 停止canal
    │   └── ...
    ├── conf
    │   │── canal.properties  # 配置文件
    │   │── example
    │   │   ├── h2.mv.db
    │   │   ├── instance.properties  # 实例配置文件
    │   │   └── meta.dat
    │   │── logback.xml
    │   └── ...
    ├── lib
    ├── logs
    │   ├── canal
    │   │   ├── canal.log
    │   │   └── canal_stdout.log
    │   └── example
    │       ├── meta.log
    │       └── example.log
    └── plugin
    
    1. 配置Canal

    修改conf/example/instance.properties文件

    # position info
    canal.instance.master.address=127.0.0.1:3306    # 数据库地址
    canal.instance.master.journal.name=
    canal.instance.master.position=
    canal.instance.master.timestamp=
    canal.instance.master.gtid=
    
    canal.instance.dbUsername=root                   # 连接数据库用户名
    canal.instance.dbPassword=123456                 # 连接数据库密码
    canal.instance.connectionCharset = UTF-8
    
    canal.instance.filter.regex=blog.user
    
    1. 启动
    ./bin/startup.sh
    

    Canal-adapter安装配置

    1. 下载Canal adapter

    https://github.com/alibaba/canal/releases/download/canal-1.1.5/canal.adapter-1.1.5.tar.gz

    解压后的文件目录

    canal.adapter-1.1.5
    ├── bin
    │   │── restart.sh    # 重启
    │   │── startup.sh    # 启动canal
    │   │── stop.sh       # 停止canal
    │   └── ...
    ├── conf
    │   │── application.yml  # 配置文件
    │   │── bootstrap.yml
    │   │── es7
    │   │   ├── customer.yml
    │   │   └── ...
    │   └── ...
    ├── lib
    ├── logs
    └── plugin
    
    1. 配置adapter

    修改conf/application.yml文件

    server:
      port: 8081
    spring:
      jackson:
        date-format: yyyy-MM-dd HH:mm:ss
        time-zone: GMT+8
        default-property-inclusion: non_null
    
    canal.conf:
      mode: tcp #tcp kafka rocketMQ rabbitMQ
      flatMessage: true
      zookeeperHosts:
      syncBatchSize: 1000
      retries: 0
      timeout:
      accessKey:
      secretKey:
      consumerProperties:
      srcDataSources:
        defaultDS:
          url: jdbc:mysql://127.0.0.1:3306/blog?useUnicode=true
          username: root
          password: 123456
      canalAdapters:
        - instance: example # canal instance Name or mq topic name
          groups:
            - groupId: g1
              outerAdapters:
                - name: logger
                - name: es7
                  hosts: http://127.0.0.1:9200,http://10.192.8.204:9200,http://10.192.8.209:9200
                  properties:
                    mode: rest # or transport
                    # security.auth: test:123456 #  only used for rest mode
                    cluster.name: dev-cluster
    

    删除conf/es7/下的所有文件,并新建自己的yml文件。

    dataSourceKey: defaultDS        # 源数据源的key, 对应上面配置的srcDataSources中的值
    outerAdapterKey: exampleKey     # 对应application.yml中es配置的key 
    destination: example            # cannal的instance或者MQ的topic
    groupId:                        # 对应MQ模式下的groupId, 只会同步对应groupId的数据
    esMapping:
      _index: mytest_user           # es 的索引名称
      _type: _doc                   # es 的type名称, es7下无需配置此项
      _id: _id                      # es 的_id, 如果不配置该项必须配置下面的pk项_id则会由es自动分配
    #  pk: id                       # 如果不需要_id, 则需要指定一个属性为主键属性
      # sql映射
      sql: "select a.id as _id, a.name as _name, a.role_id as _role_id, b.role_name as _role_name,
            a.c_time as _c_time, c.labels as _labels from user a
            left join role b on b.id=a.role_id
            left join (select user_id, group_concat(label order by id desc separator ';') as labels from label
            group by user_id) c on c.user_id=a.id"
    #  objFields:
    #    _labels: array:;           # 数组或者对象属性, array:; 代表以;字段里面是以;分隔的
    #    _obj: object               # json对象
      etlCondition: "where a.c_time>='{0}'"     # etl 的条件参数
      commitBatch: 3000                         # 提交批大小
    
    1. 启动ES
    ./bin/startup.sh
    

    启动错误信息

    1. cannot be cast to com.alibaba.druid.pool.DruidDataSource


    2022-02-28 14:52:57.684 [main] INFO  c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## Start loading es mapping config ...
    2022-02-28 14:52:57.734 [main] INFO  c.a.o.c.client.adapter.es.core.config.ESSyncConfigLoader - ## ES mapping config loaded
    2022-02-28 14:52:58.037 [main] ERROR c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es7 failed
    java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassCastException: com.alibaba.druid.pool.DruidDataSource cannot be cast to com.alibaba.druid.pool.DruidDataSource
            at com.alibaba.otter.canal.client.adapter.es7x.ES7xAdapter.init(ES7xAdapter.java:54) ~[client-adapter.es7x-1.1.5-jar-with-dependencies.jar:na]
            at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.loadAdapter(CanalAdapterLoader.java:225) [client-adapter.launcher-1.1.5.jar:na]
            at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.init(CanalAdapterLoader.java:56) [client-adapter.launcher-1.1.5.jar:na]
            ...
            at com.alibaba.otter.canal.adapter.launcher.CanalAdapterApplication.main(CanalAdapterApplication.java:19) ~[client-adapter.launcher-1.1.5.jar:na]
    Caused by: java.lang.RuntimeException: java.lang.ClassCastException: com.alibaba.druid.pool.DruidDataSource cannot be cast to com.alibaba.druid.pool.DruidDataSource
            at com.alibaba.otter.canal.client.adapter.es.core.ESAdapter.init(ESAdapter.java:83) ~[client-adapter.es7x-1.1.5-jar-with-dependencies.jar:na]
            at com.alibaba.otter.canal.client.adapter.es7x.ES7xAdapter.init(ES7xAdapter.java:52) ~[client-adapter.es7x-1.1.5-jar-with-dependencies.jar:na]
            ... 42 common frames omitted
    Caused by: java.lang.ClassCastException: com.alibaba.druid.pool.DruidDataSource cannot be cast to com.alibaba.druid.pool.DruidDataSource
            at com.alibaba.otter.canal.client.adapter.es.core.ESAdapter.addSyncConfigToCache(ESAdapter.java:146) ~[client-adapter.es7x-1.1.5-jar-with-dependencies.jar:na]
            at com.alibaba.otter.canal.client.adapter.es.core.ESAdapter.init(ESAdapter.java:75) ~[client-adapter.es7x-1.1.5-jar-with-dependencies.jar:na]
            ... 43 common frames omitted
    

    在Github issues中找到#3717 解决方案

    下载源码压缩包canal-canal-1.1.5.zip,解压后用IDE打开项目,修改canal-canal-1.1.5/client-adapter/escore/pom.xml依赖

    <dependency>
        <groupId>com.alibaba</groupId>
        <artifactId>druid</artifactId>
        <scope>provided</scope>
    </dependency>
    

    然后在项目根目录下执行mvn clean package -Dmaven.test.skip=true,然后到canal-canal-1.1.5\client-adapter\es7x\target\目录下找到client-adapter.es7x-1.1.5-jar-with-dependencies.jar

    将这个文件复制到./canal_adapter/plugin/下,重启adapter。

    2. NullPointerException

    2022-03-01 09:45:29.443 [main] INFO  c.a.o.canal.adapter.launcher.loader.CanalAdapterLoader - Load canal adapter: es7 succeed
    2022-03-01 09:45:29.464 [main] INFO  c.alibaba.otter.canal.connector.core.spi.ExtensionLoader - extension classpath dir: /usr/local/canal/canal.adapter-1.1.5/plugin
    2022-03-01 09:45:29.524 [main] ERROR c.a.o.canal.adapter.launcher.loader.CanalAdapterService - ## something goes wrong when starting up the canal client adapters:
    java.lang.NullPointerException: null
            at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:782) ~[guava-22.0.jar:na]
            ...
            at com.alibaba.otter.canal.common.zookeeper.ZkClientx.getZkClient(ZkClientx.java:28) ~[canal.common-1.1.5.jar:na]
            at com.alibaba.otter.canal.connector.tcp.consumer.CanalTCPConsumer.init(CanalTCPConsumer.java:57) ~[connector.tcp-1.1.5-jar-with-dependencies.jar:na]
            at com.alibaba.otter.canal.adapter.launcher.loader.AdapterProcessor.<init>(AdapterProcessor.java:74) ~[client-adapter.launcher-1.1.5.jar:na]
            at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.lambda$init$0(CanalAdapterLoader.java:65) ~[client-adapter.launcher-1.1.5.jar:na]
            at java.util.HashMap.computeIfAbsent(HashMap.java:1127) ~[na:1.8.0_161]
            at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterLoader.init(CanalAdapterLoader.java:60) ~[client-adapter.launcher-1.1.5.jar:na]
            at com.alibaba.otter.canal.adapter.launcher.loader.CanalAdapterService.init(CanalAdapterService.java:60) ~[client-adapter.launcher-1.1.5.jar:na]
    

    Google了半天也没找到网上类似的问题,所以怀疑是自己哪里配置文件写错了。经过对比,发现是多删除了application.yml文件的配置信息。

    3. Connection refused

    2022-03-01 11:21:49.823 [Thread-4] INFO  c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - =============> Start to connect destination: example <=============
    2022-03-01 11:21:49.824 [Thread-4] ERROR c.a.otter.canal.adapter.launcher.loader.AdapterProcessor - process error!
    com.alibaba.otter.canal.protocol.exception.CanalClientException: java.net.ConnectException: Connection refused
            at com.alibaba.otter.canal.client.impl.SimpleCanalConnector.doConnect(SimpleCanalConnector.java:198) ~[na:na]
            at com.alibaba.otter.canal.client.impl.SimpleCanalConnector.connect(SimpleCanalConnector.java:115) ~[na:na]
            at com.alibaba.otter.canal.connector.tcp.consumer.CanalTCPConsumer.connect(CanalTCPConsumer.java:63) ~[na:na]
            at com.alibaba.otter.canal.adapter.launcher.loader.AdapterProcessor.process(AdapterProcessor.java:184) ~[client-adapter.launcher-1.1.5.jar:na]
            at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_161]
    Caused by: java.net.ConnectException: Connection refused
            at sun.nio.ch.Net.connect0(Native Method) ~[na:1.8.0_161]
            at sun.nio.ch.Net.connect(Net.java:454) ~[na:1.8.0_161]
            at sun.nio.ch.Net.connect(Net.java:446) ~[na:1.8.0_161]
            at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648) ~[na:1.8.0_161]
            at com.alibaba.otter.canal.client.impl.SimpleCanalConnector.doConnect(SimpleCanalConnector.java:150) ~[na:na]
    

    canal server中的canal.serverMode配置的是kafka,改为tcp后,一切正常。

    解决思路

    1. 提示拒绝连接,要分析哪个连接被拒绝了;
    2. 查看服务是否开放了该端口。命令:netstat -lntp

    Kafka模式

    配置canal.serverModel为kafka模式后,需要修改application.yml文件和

    canal.conf:
      mode: kafka #tcp kafka rocketMQ rabbitMQ
        kafka.bootstrap.servers: 10.192.1.5:9092,10.192.1.6:9092,10.192.1.7:9092,10.192.1.8:9092
      canalAdapters:
      - instance: topic_dev  # mq topic name
    

    修改映射文件,如vim es7/user.yml

    dataSourceKey: defaultDS
    destination: topic_dev      # mq topic name
    groupId: g1
    esMapping: 
    

    数据初始化导入

    全量导入:

    curl http://127.0.0.1:8081/etl/es7/crm_opportunity.yml -X POST
    

    统计导入结果:

    curl http://127.0.0.1:8081/count/es7/crm_opportunity.yml
    

    参考资料

  • 相关阅读:
    matlab练习程序(RGB2HSV、HSV2RGB)
    matlab练习程序(距离变换)
    C++生成xml(使用tinyxml)
    matlab练习程序(RGB2YUV、YUV2RGB)
    修改 Google Desktop 的缓存目录
    Access 2003 中 null 和 '' 空字符串的奇怪问题
    娃娃
    【js:片断】jQuery 设置 select 下拉框的选中状态
    由 TypeInitializationException 引起的问题
    afaca 分析报告
  • 原文地址:https://www.cnblogs.com/luyanliang/p/16041480.html
Copyright © 2020-2023  润新知