Using OGG to Sync Data to Kafka

To better support the implementation of our company's natural-person project, I built a test environment on virtual machines and used Oracle GoldenGate (OGG) to replicate data from some tables in the production database to a Kafka cluster.

1 Test environment
1) Goal
When a DML operation occurs on table t1 under the hr user in the source database, the change data is written to the Kafka cluster and displayed by a consumer.
2) Hosts
192.168.60.88 tdb1
192.168.60.89 reps
192.168.60.91 kafka01
192.168.60.92 kafka02
192.168.60.93 kafka03

tdb1 is the source Oracle database, version 10.2.0.4, SID tdb.
reps is the interface server: it runs OGG for Big Data, receives the data sent by the source-side OGG pump process, and writes that data into the Kafka cluster.
kafka01-03 form the Kafka cluster.
All of these servers run RHEL 6.8.
The JDK on tdb1 and kafka01-03 is 1.7.0_79.
The JDK on reps is 1.8.0_121, because OGG for Big Data 12.2.x requires JDK 1.8 or later; JDK 1.7 fails with a version error.
The firewall and SELinux must be turned off on all five servers; a minimal way to do this is sketched below.
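A minimal sketch for RHEL 6, run as root on each of the five hosts:

# stop the firewall now and keep it off across reboots
service iptables stop
chkconfig iptables off
# switch SELinux to permissive immediately...
setenforce 0
# ...and disable it permanently (takes effect on the next reboot)
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config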

2 Installing the Kafka cluster

1) Download the software
ZooKeeper 3.4.10, file zookeeper-3.4.10.tar.gz, from http://zookeeper.apache.org/releases.html
Kafka 2.10-0.10.2.0, file kafka_2.10-0.10.2.0.tgz, from http://kafka.apache.org/downloads
2) Preparation
Adjust /etc/hosts on all of kafka01-03:

[root@kafka01 ~]# cd /etc
[root@kafka01 etc]# cat hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.60.91 kafka01
192.168.60.92 kafka02
192.168.60.93 kafka03

Create a user named kafka on each of kafka01-03, and set up passwordless SSH trust among the three kafka users; a minimal sketch is shown below.
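Run as the kafka user on each of the three nodes (this assumes password authentication is still enabled at this point):

# generate a key pair without a passphrase
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
# push the public key to every node, including this one
for h in kafka01 kafka02 kafka03; do
    ssh-copy-id kafka@$h
done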
Example of the kafka user on kafka01:

[kafka@kafka01 ~]$ pwd
/home/kafka
[kafka@kafka01 ~]$ id
uid=500(kafka) gid=501(kafka) groups=501(kafka)
[kafka@kafka01 ~]$
[kafka@kafka01 ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH
export JAVA_HOME=/usr/java/jdk1.7.0_79
export JRE_HOME=/usr/java/jdk1.7.0_79/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export ZOOKEEPER_HOME=/home/kafka/zookeeper
export PATH=$JAVA_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
[kafka@kafka01 ~]$

3) Install and configure ZooKeeper
Do the following on all of kafka01-03; kafka01 is shown as the example.
a. Unpack
Unpack zookeeper-3.4.10.tar.gz, rename the directory to zookeeper, and move it under /home/kafka/, so that it looks like this:

[kafka@kafka01 zookeeper]$ pwd
/home/kafka/zookeeper
[kafka@kafka01 zookeeper]$ ls
bin        conf     data        docs             ivy.xml  LICENSE.txt  README_packaging.txt  recipes  zookeeper-3.4.10.jar      zookeeper-3.4.10.jar.md5   zookeeper.out
build.xml  contrib  dist-maven  ivysettings.xml  lib      NOTICE.txt   README.txt            src      zookeeper-3.4.10.jar.asc  zookeeper-3.4.10.jar.sha1
[kafka@kafka01 zookeeper]$

b. Configure zoo.cfg

cd /home/kafka/zookeeper/conf
cp zoo_sample.cfg zoo.cfg

Edit zoo.cfg and set the following parameters (a complete sample file is sketched after this block):

[kafka@kafka01 conf]$ pwd
/home/kafka/zookeeper/conf

dataDir=/home/kafka/zookeeper/data
clientPort=2181
server.1=kafka01:2888:3888
server.2=kafka02:2888:3888
server.3=kafka03:2888:3888
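For reference, the whole file can be written in one step; tickTime, initLimit and syncLimit below are the zoo_sample.cfg defaults:

cat > /home/kafka/zookeeper/conf/zoo.cfg <<'EOF'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/kafka/zookeeper/data
clientPort=2181
server.1=kafka01:2888:3888
server.2=kafka02:2888:3888
server.3=kafka03:2888:3888
EOF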

c. Set the node ID

cd /home/kafka/zookeeper
mkdir data
cd data
vi myid    # enter 1 on kafka01; use 2 on kafka02 and 3 on kafka03

[kafka@kafka01 data]$ pwd
/home/kafka/zookeeper/data
[kafka@kafka01 data]$ cat myid
1
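With the SSH trust in place, the three myid files can also be written from kafka01 in one pass; the number must match the server.N entry in zoo.cfg:

i=1
for h in kafka01 kafka02 kafka03; do
    # server.1 -> kafka01, server.2 -> kafka02, server.3 -> kafka03
    ssh kafka@$h "mkdir -p /home/kafka/zookeeper/data && echo $i > /home/kafka/zookeeper/data/myid"
    i=$((i+1))
done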

d. Start ZooKeeper

cd /home/kafka/zookeeper/bin
./zkServer.sh start

Check the status:
[kafka@kafka01 bin]$ ./zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /home/kafka/zookeeper/bin/../conf/zoo.cfg
Mode: follower
One of the three nodes reports Mode: leader; the other two report Mode: follower.

Troubleshooting:
If ZooKeeper does not start, run ./zkServer.sh start-foreground instead; the log output on screen usually shows which part went wrong.
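A quick smoke test is to list the root znode through the bundled client (a sketch; any of the three servers can be given):

/home/kafka/zookeeper/bin/zkCli.sh -server kafka01:2181 <<'EOF'
ls /
quit
EOF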

4) Install and configure Kafka
Do the following on all of kafka01-03; kafka01 is shown as the example.
a. Unpack
Unpack kafka_2.10-0.10.2.0.tgz, rename the directory to kafka, and move it under /home/kafka/, so that it looks like this:

[kafka@kafka02 kafka]$ pwd
/home/kafka/kafka
[kafka@kafka02 kafka]$ ls
bin  config  libs  LICENSE  logs  NOTICE  site-docs

b. Edit the Kafka server configuration file

cd /home/kafka/kafka/config
vi server.properties

[kafka@kafka01 config]$ cat server.properties   # entries left at their defaults are omitted
broker.id=1   # 1 on kafka01, 2 on kafka02, 3 on kafka03
host.name=kafka01   # adjust to the local hostname
listeners=PLAINTEXT://kafka01:9092   # adjust to the local hostname
advertised.listeners=PLAINTEXT://kafka01:9092   # adjust to the local hostname
log.dirs=/home/kafka/kafka/logs
num.partitions=4
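One entry left at its default here is zookeeper.connect=localhost:2181, which works only because a ZooKeeper node runs on every broker. A sketch that points each broker at the full ensemble instead (an addition, not part of the original configuration):

cat >> /home/kafka/kafka/config/server.properties <<'EOF'
zookeeper.connect=kafka01:2181,kafka02:2181,kafka03:2181
EOF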

c. Start Kafka in the background
On each of the three cluster nodes, run:

cd /home/kafka/kafka/bin
./kafka-server-start.sh /home/kafka/kafka/config/server.properties &
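Alternatively, kafka-server-start.sh accepts a -daemon flag, which detaches cleanly and survives the shell exiting:

cd /home/kafka/kafka
./bin/kafka-server-start.sh -daemon config/server.properties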

d. Test
Create a topic named oggtest with 3 partitions and a replication factor of 3.
This can be done on any node:

./kafka-topics.sh --create --zookeeper kafka01:2181,kafka02:2181,kafka03:2181 --replication-factor 3 --partitions 3 --topic oggtest

Inspect the topic just created. On any node:

./kafka-topics.sh --describe --zookeeper kafka01:2181,kafka02:2181,kafka03:2181 --topic oggtest

List all topics:
./kafka-topics.sh --describe --zookeeper kafka01:2181,kafka02:2181,kafka03:2181

The bundled bin/kafka-console-producer.sh and bin/kafka-console-consumer.sh scripts can be used to demonstrate publishing and consuming messages.
In one terminal, start a producer that writes to the oggtest topic created above:

./kafka-console-producer.sh --broker-list kafka01:9092,kafka02:9092,kafka03:9092 --topic oggtest

In another terminal, start a consumer subscribed to the same topic:

./kafka-console-consumer.sh --zookeeper kafka01:2181,kafka02:2181,kafka03:2181 --from-beginning --topic oggtest

If the cluster is configured correctly, anything typed into the producer window shows up in the consumer window after you press Enter.

3 OGG installation and configuration on the source (tdb1)
1) Preparation
a. OGG software
Oracle GoldenGate V11.2.1.0.3 for Oracle 11g on Linux x86-64.zip, downloadable from https://edelivery.oracle.com
b. The source database must be in archivelog mode, set to force logging, and have minimal supplemental logging enabled:

[oracle@tdb1 ~]$ sqlplus / as sysdba
SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /oracle/arc
Oldest online log sequence     9
Next log sequence to archive   11
Current log sequence           11
SQL>
SQL> Select
  2  SUPPLEMENTAL_LOG_DATA_MIN
  3  ,SUPPLEMENTAL_LOG_DATA_PK
  4  ,SUPPLEMENTAL_LOG_DATA_UI
  5  ,SUPPLEMENTAL_LOG_DATA_ALL
  6  , FORCE_LOGGING from v$database;

SUPPLEME SUP SUP SUP FOR
-------- --- --- --- ---
YES      NO  NO  NO  YES

SQL>
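If any of these prerequisites is missing, they can be enabled as follows (a sketch; switching to archivelog mode requires taking the database through a mount-state restart):

sqlplus / as sysdba <<'EOF'
shutdown immediate
startup mount
alter database archivelog;
alter database open;
alter database force logging;
alter database add supplemental log data;
exit
EOF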

c. It is advisable to add a dedicated ogg OS user, to avoid affecting the oracle user. Put it in the oracle user's primary group, and create a file system for it sized to the expected extract volume.
The finished ogg user looks like this:

[root@tdb1 ~]# su - ogg
[ogg@tdb1 ~]$ id
uid=501(ogg) gid=500(dba) groups=500(dba)
[ogg@tdb1 ~]$
[ogg@tdb1 ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH

umask 022
export ORACLE_BASE=/oracle/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.2.0
export ORACLE_SID=tdb
export PATH=$ORACLE_HOME/bin:$PATH:.
export NLS_LANG=AMERICAN_AMERICA.ZHS16GBK
export NLS_DATE_FORMAT=YYYYMMDDHH24MISS
export DISPLAY=192.168.60.1:0.0
#ogg
export GG_HOME=/ogg
#export PATH=$PATH:$GG_HOME
export LD_LIBRARY_PATH=/ogg:$ORACLE_HOME/lib
[ogg@tdb1 ~]$
[ogg@tdb1 ~]$ sqlplus / as sysdba

SQL*Plus: Release 10.2.0.5.0 - Production on Thu Apr 13 11:10:59 2017

Copyright (c) 1982, 2010, Oracle.  All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Data Mining and Real Application Testing options

SQL>

d. Create the ogg user in the database and grant it privileges

create tablespace ogg datafile '/oracle/oradata/tdb/ogg.dbf' size 1G;
create user ogg identified by gg_888 default tablespace ogg;
grant connect,resource to ogg;
grant dba to ogg; -- the dba grant can be skipped if no DDL trigger is used

GRANT CREATE SESSION TO ogg;
GRANT ALTER SESSION  TO ogg;
GRANT SELECT ANY DICTIONARY TO ogg;
GRANT SELECT ANY TABLE      TO ogg;
GRANT ALTER ANY TABLE       TO ogg; -- used to configure table-level supplemental logging

GRANT FLASHBACK ANY TABLE   TO ogg;
GRANT EXECUTE ON DBMS_FLASHBACK TO ogg;
GRANT EXECUTE ON utl_file TO ogg;
execute DBMS_STREAMS_AUTH.GRANT_ADMIN_PRIVILEGE('OGG');
grant execute on sys.dbms_lob to ogg;

-- the following PL/SQL block is for Oracle 11g and later; it is not needed on 10g
BEGIN
DBMS_GOLDENGATE_AUTH.GRANT_ADMIN_PRIVILEGE(
Grantee                 => 'OGG',
privilege_type          => 'CAPTURE',
grant_select_privileges => TRUE,
do_grants               => TRUE);
END;
/

e. For testing, I created an hr user and a table t1 under it:

-- Create table
create table T1
(
  id   NUMBER not null,
  name VARCHAR2(100)
)
tablespace USERS;
-- Create/Recreate primary, unique and foreign key constraints
alter table T1
  add constraint PK_T1_ID primary key (ID)
  using index
  tablespace USERS;

2) Configure OGG
a. Unpack the OGG software under $GG_HOME
The result looks similar to this:

[ogg@tdb1 ogg]$ ls -l gg*
-rwxr-x--- 1 ogg dba  6577392 Aug 24  2012 ggcmd
-rw-r----- 1 ogg dba     1841 Apr 12 15:58 gglog-defgen.dmp
-rw-r----- 1 ogg dba     1239 Apr 12 16:40 gglog-DPE_TEST-43680.dmp
-rw-r----- 1 ogg dba      962 Apr 12 16:49 gglog-DPE_TEST-43782.dmp
-rw-r----- 1 ogg dba        0 Apr 12 16:40 gglog-DPE_TEST.dmp
-rw-r----- 1 ogg dba  1280336 Aug 24  2012 ggMessage.dat
-rwxr-x--- 1 ogg dba 13899588 Aug 24  2012 ggsci
-rw-rw-rw- 1 ogg dba    21819 Apr 13 08:47 ggserr.log
[ogg@tdb1 ogg]$

b. Create the OGG working subdirectories

[ogg@tdb1 ogg]$ pwd
/ogg
[ogg@tdb1 ogg]$ ./ggsci
GGSCI>create subdirs

Handling the ggsci shared-library error:

[ogg@tdb1 ogg]$ ggsci
ggsci: error while loading shared libraries: libnnz11.so: cannot open shared object file: No such file or directory
[ogg@tdb1 ogg]$ ldd ggsci
        linux-vdso.so.1 =>  (0x00007ffd3db73000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00000035bbc00000)
        libgglog.so => /ogg/./libgglog.so (0x00007ff824130000)
        libggrepo.so => /ogg/./libggrepo.so (0x00007ff823fdc000)
        libdb-5.2.so => /ogg/./libdb-5.2.so (0x00007ff823d3b000)
        libicui18n.so.38 => /ogg/./libicui18n.so.38 (0x00007ff8239da000)
        libicuuc.so.38 => /ogg/./libicuuc.so.38 (0x00007ff8236a1000)
        libicudata.so.38 => /ogg/./libicudata.so.38 (0x00007ff8226c5000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00000035bc400000)
        libxerces-c.so.28 => /ogg/./libxerces-c.so.28 (0x00007ff8221ad000)
        libantlr3c.so => /ogg/./libantlr3c.so (0x00007ff822097000)
        libnnz11.so => not found
        libclntsh.so.11.1 => not found
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00000035c7400000)
        libm.so.6 => /lib64/libm.so.6 (0x00000035bcc00000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00000035c7000000)
        libc.so.6 => /lib64/libc.so.6 (0x00000035bc000000)
        /lib64/ld-linux-x86-64.so.2 (0x00000035bb800000)

This OGG build links against 11g client library names while the database is 10g, so create compatible symlinks in $ORACLE_HOME/lib:
[oracle@tdb1 ~]$ cd $ORACLE_HOME/lib
[oracle@tdb1 lib]$ ln -s libnnz10.so libnnz11.so
[oracle@tdb1 lib]$ ln -s libclntsh.so  libclntsh.so.11.1
[oracle@tdb1 lib]$ ls -l  libclntsh.so.11.1
lrwxrwxrwx 1 oracle dba 12 Apr 11 22:33 libclntsh.so.11.1 -> libclntsh.so
[oracle@tdb1 lib]$ ls -l  libnnz11.so
lrwxrwxrwx 1 oracle dba       11 Apr 11 22:31 libnnz11.so -> libnnz10.so

c. Enable table-level supplemental logging for hr.t1

[ogg@tdb1 ogg]$ ./ggsci
GGSCI>Dblogin userid ogg, password gg_888
GGSCI>Add trandata hr.t1
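Whether it took effect can be verified with info trandata, for example driven from the shell:

cd /ogg
./ggsci <<'EOF'
dblogin userid ogg, password gg_888
info trandata hr.t1
EOF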

d. Configure the OGG manager

[ogg@tdb1 ogg]$ ./ggsci
GGSCI>edit params mgr
Enter the following and save:
PORT 7809
DYNAMICPORTLIST 7810-7860
AUTORESTART EXTRACT *, RETRIES 5, WAITMINUTES 3, RESETMINUTES 60
PURGEOLDEXTRACTS ./dirdat/*, usecheckpoints, minkeepdays 1
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45

Start the OGG manager:
GGSCI>start mgr

Check the manager status; it should look like this:
GGSCI (tdb1) 1> info mgr
Manager is running (IP port tdb1.7809).

e. Create the extract

GGSCI>edit params ext_test
Enter the following and save:
EXTRACT ext_test
Setenv (NLS_LANG="AMERICAN_AMERICA.ZHS16GBK")
USERID ogg, PASSWORD gg_888
gettruncates
DISCARDFILE ./dirrpt/ext_test.dsc, APPEND, MEGABYTES 1024
DBOPTIONS  ALLOWUNUSEDCOLUMN
REPORTCOUNT EVERY 1 MINUTES, RATE
WARNLONGTRANS 2h,CHECKINTERVAL 3m
FETCHOPTIONS NOUSESNAPSHOT
TRANLOGOPTIONS  CONVERTUCS2CLOBS
EXTTRAIL ./dirdat/te
WILDCARDRESOLVE DYNAMIC
GETUPDATEBEFORES
NOCOMPRESSUPDATES
NOCOMPRESSDELETES
dynamicresolution
table hr.t1;

Add the extract group:
GGSCI>add extract ext_test, TRANLOG, BEGIN NOW

Define the trail file:
GGSCI> ADD EXTTRAIL ./dirdat/te, EXTRACT ext_test, MEGABYTES 200

f. Create the pump extract

GGSCI>edit param dpe_test
Enter the following and save:
EXTRACT dpe_test
PASSTHRU
RMTHOST 192.168.60.89, MGRPORT 7809
RMTTRAIL ./dirdat/te
DYNAMICRESOLUTION
TABLE hr.t1;

Add the pump group:
GGSCI> ADD EXTRACT dpe_test, EXTTRAILSOURCE ./dirdat/te

Define the pump trail file:
GGSCI> ADD RMTTRAIL ./dirdat/te, EXTRACT dpe_test, MEGABYTES 200

g. Start the capture processes

GGSCI> start extract ext_test
GGSCI> start extract dpe_test

Check the status; output like the following is correct:
GGSCI> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
EXTRACT     RUNNING     DPE_TEST    00:00:00      00:00:03
EXTRACT     RUNNING     EXT_TEST    00:00:00      00:00:01
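If a process shows ABENDED instead of RUNNING, its report file usually names the cause; for example:

cd /ogg
./ggsci <<'EOF'
info extract ext_test, detail
view report ext_test
EOF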

4 Interface server (reps) installation and configuration
1) Install OGG for Big Data
a. As on the source side, unpack the OGG for Big Data software under $GG_HOME
b. /etc/hosts file

[root@reps etc]# cd /etc
[root@reps etc]# cat hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.60.89 reps
192.168.60.91 kafka01
192.168.60.92 kafka02
192.168.60.93 kafka03
[root@reps etc]#

c. Install JDK 1.8 or later
OGG for Big Data 12.2.x requires JDK 1.8 or later; 1.8.0_121 is used here.

[ogg@reps ogg]$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

d. Create the ogg user, configure its environment, and install the Kafka software

[root@reps etc]# su - ogg
[ogg@reps ~]$ id
uid=500(ogg) gid=501(ogg) groups=501(ogg)
[ogg@reps ~]$ cat .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH
export OGG_HOME=/ogg
export JAVA_HOME=/usr/java/jdk1.8.0_121   # assumed install path for the JDK noted above; LD_LIBRARY_PATH below needs it
export PATH=$PATH:$OGG_HOME
export LD_LIBRARY_PATH=$OGG_HOME:$JAVA_HOME/jre/lib/amd64/libjsig.so:$JAVA_HOME/jre/lib/amd64/server/libjvm.so:$JAVA_HOME/jre/lib/amd64/server:$JAVA_HOME/jre/lib/amd64
[ogg@reps ~]$
[ogg@reps ~]$ ls -l
total 8
drwxrwxr-x 2 ogg ogg 4096 Apr 11 22:56 install
drwxr-xr-x 6 ogg ogg 4096 Feb 15 01:28 kafka   # the Kafka package unpacked here; it can also be copied from a kafka host
[ogg@reps ~]$
[ogg@reps ~]$ cd /ogg
[ogg@reps ogg]$ ls -l ggsci
-rwxr-x--- 1 ogg ogg 39120528 Oct 20 07:05 ggsci
[ogg@reps ogg]$

2) Configure OGG for Kafka
a. Start ggsci and create the working subdirectories

./ggsci
GGSCI>create subdirs

b. Copy the Kafka example files

cp $OGG_HOME/AdapterExamples/big-data/kafka/* $OGG_HOME/dirprm/

c. Configure the manager

GGSCI>edit params mgr
Enter the following:
PORT 7809
DYNAMICPORTLIST 7810-7860
AUTORESTART REPLICAT *, RETRIES 5, WAITMINUTES 3, RESETMINUTES 60
PURGEOLDEXTRACTS ./dirdat/*, usecheckpoints, minkeepdays 1
LAGREPORTHOURS 1
LAGINFOMINUTES 30
LAGCRITICALMINUTES 45

d. Configure kafka.props
Contents:

[ogg@reps dirprm]$ cat kafka.props

gg.handlerlist = kafkahandler
gg.handler.kafkahandler.type = kafka
gg.handler.kafkahandler.KafkaProducerConfigFile=custom_kafka_producer.properties
gg.handler.kafkahandler.TopicName =oggtest
#gg.handler.kafkahandler.format =avro_op
gg.handler.kafkahandler.format=delimitedtext
gg.handler.kafkahandler.format.fieldDelimiter=|
gg.handler.kafkahandler.SchemaTopicName=myoggtest
gg.handler.kafkahandler.BlockingSend =false
gg.handler.kafkahandler.includeTokens=false

gg.handler.kafkahandler.mode =op
#gg.handler.kafkahandler.maxGroupSize =100, 1Mb
#gg.handler.kafkahandler.minGroupSize =50, 500Kb


goldengate.userexit.timestamp=utc
goldengate.userexit.writers=javawriter
javawriter.stats.display=TRUE
javawriter.stats.full=TRUE

gg.log=log4j
gg.log.level=INFO

gg.report.time=30sec

#Sample gg.classpath for Apache Kafka
gg.classpath=dirprm/:/home/ogg/kafka/libs/*
#Sample gg.classpath for HDP
#gg.classpath=/etc/kafka/conf:/usr/hdp/current/kafka-broker/libs/*

javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=ggjava/ggjava.jar

Notes:
gg.handler.kafkahandler.TopicName must name the topic defined on the Kafka side.
gg.handler.kafkahandler.format is set to delimited text with "|" as the field separator (the avro_op line from the shipped example is commented out so it cannot override this); Kafka then receives messages in the format shown at the end of section 5.
gg.classpath must point at the Kafka client libraries.
e. Configure custom_kafka_producer.properties
Contents:

bootstrap.servers=kafka01:9092,kafka02:9092,kafka03:9092
acks=1
compression.type=gzip
reconnect.backoff.ms=1000

value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
# 100KB per partition
batch.size=102400
linger.ms=10000

3) Pass the table definitions
On the source side:

[ogg@tdb1 ogg]$ ./ggsci
GGSCI> edit param defgen
Contents:
DEFSFILE dirdef/source.def, PURGE
USERID ogg, PASSWORD gg_888
TABLE hr.t1 ;

[ogg@tdb1 ogg]$ ./defgen paramfile dirprm/defgen.prm   # run from the shell, not inside GGSCI

Copy the generated dirdef/source.def (not defgen.prm itself) to /ogg/dirdef/ on the interface server (reps); one way is sketched below.
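A sketch using scp, assuming the ogg user owns /ogg on reps:

scp /ogg/dirdef/source.def ogg@192.168.60.89:/ogg/dirdef/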
4) Define the replicat
a. Define the parameters

./ggsci
GGSCI>edit params rep_test
Enter the following:
REPLICAT rep_test
TARGETDB LIBFILE libggjava.so SET property=dirprm/kafka.props
SOURCEDEFS dirdef/source.def
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP hr.*, TARGET hr.*;

b. Attach the trail file

GGSCI> add replicat rep_test, exttrail ./dirdat/te

c. Start the replicat process and check its status

GGSCI> start replicat rep_test

Check the status; output like the following is normal:
GGSCI (reps) 1> info all

Program     Status      Group       Lag at Chkpt  Time Since Chkpt

MANAGER     RUNNING
REPLICAT    RUNNING     REP_TEST    00:00:00      00:00:05

Note: if the replicat process will not start, running it directly in the foreground makes the problem easier to diagnose:

cd $OGG_HOME
./replicat paramfile dirprm/rep_test.prm

5 Testing and verification
1) Start a Kafka console consumer
On any Kafka node:

./kafka-console-consumer.sh --zookeeper kafka01:2181,kafka02:2181,kafka03:2181 --from-beginning --topic oggtest

2) Insert a row into the test table on the source

sqlplus hr/hr
SQL> insert into t1 values(5,'shihaifeng');
1 row created.

SQL> commit;
Commit complete.
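Updates and deletes can be exercised the same way (a sketch); with the delimitedtext format, the consumer should show the corresponding records with a leading U or D operation flag instead of the I in the record shown in step 3) below:

sqlplus hr/hr <<'EOF'
update t1 set name='shihaifeng2' where id=5;
commit;
delete from t1 where id=5;
commit;
exit
EOF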

3) Check that the Kafka consumer console received the row
Mine shows the following:

I|HR.T1|2017-04-13 03:31:03.835147|2017-04-13T11:31:08.973000|00000000000000001478|5|shihaifeng