• CDH| 组件的使用-Flume| Kafka| Oozie基于Hue的任务调度


    日志采集Flume配置

     1)Flume配置分析

      

    Flume直接读log日志的数据,log日志的格式是app-yyyy-mm-dd.log。

     2)Flume的具体配置如下:

        在CM管理页面上点击Flume,

    在实例页面选择hadoop101上的Agent

      3)在CM管理页面hadoop101上Flume的配置中找到代理名称改为a1

     

      4)在配置文件如下内容(flume-kafka)

    a1.sources=r1
    a1.channels=c1 c2 
    a1.sinks=k1 k2 
    
    # configure source
    a1.sources.r1.type = TAILDIR
    a1.sources.r1.positionFile = /opt/module/flume/log_position.json
    a1.sources.r1.filegroups = f1
    a1.sources.r1.filegroups.f1 = /tmp/logs/app.+
    a1.sources.r1.fileHeader = true
    a1.sources.r1.channels = c1 c2
    
    #interceptor
    a1.sources.r1.interceptors = i1 i2
    a1.sources.r1.interceptors.i1.type = com.atguigu.flume.interceptor.LogETLInterceptor$Builder
    a1.sources.r1.interceptors.i2.type = com.atguigu.flume.interceptor.LogTypeInterceptor$Builder
    
    # selector
    a1.sources.r1.selector.type = multiplexing
    a1.sources.r1.selector.header = topic
    a1.sources.r1.selector.mapping.topic_start = c1
    a1.sources.r1.selector.mapping.topic_event = c2
    
    # configure channel
    a1.channels.c1.type = memory
    a1.channels.c1.capacity=10000
    a1.channels.c1.byteCapacityBufferPercentage=20
    
    a1.channels.c2.type = memory
    a1.channels.c2.capacity=10000
    a1.channels.c2.byteCapacityBufferPercentage=20
    
    # configure sink
    # start-sink
    a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
    a1.sinks.k1.kafka.topic = topic_start
    a1.sinks.k1.kafka.bootstrap.servers = hadoop101:9092,hadoop102:9092,hadoop103:9092
    a1.sinks.k1.kafka.flumeBatchSize = 2000
    a1.sinks.k1.kafka.producer.acks = 1
    a1.sinks.k1.channel = c1
    
    # event-sink
    a1.sinks.k2.type = org.apache.flume.sink.kafka.KafkaSink
    a1.sinks.k2.kafka.topic = topic_event
    a1.sinks.k2.kafka.bootstrap.servers = hadoop101:9092,hadoop102:9092,hadoop103:9092
    a1.sinks.k2.kafka.flumeBatchSize = 2000
    a1.sinks.k2.kafka.producer.acks = 1
    a1.sinks.k2.channel = c2
    View Code

    注意:com.xxx.flume.interceptor.LogETLInterceptor和com.xxx.flume.interceptor.LogTypeInterceptor是自定义的拦截器的全类名。需要根据用户自定义的拦截器做相应修改。

     5)修改/opt/module/flume/log_position.json的读写权限

         注意:Json文件的父目录一定要创建好,并改好权限

    [root@hadoop101 ~]# mkdir -p /opt/module/flume
    [root@hadoop101 ~]# cd /opt/module/flume/
    [root@hadoop101 flume]# touch log_position.json
    [root@hadoop101 flume]# ll
    总用量 0
    -rw-r--r-- 1 root root 0 4月   5 20:05 log_position.json
    [root@hadoop101 flume]# chmod 777 log_position.json
    [root@hadoop101 flume]# ll
    总用量 0
    -rwxrwxrwx 1 root root 0 4月   5 20:05 log_position.json
    [root@hadoop101 flume]# xsync /opt/module/flume/
    fname=flume
    pdir=/opt/module
    ------------------- hadoop102 --------------
    sending incremental file list
    flume/
    flume/log_position.json
    
    sent 108 bytes  received 35 bytes  286.00 bytes/sec
    total size is 0  speedup is 0.00
    ------------------- hadoop103 --------------
    sending incremental file list
    flume/
    flume/log_position.json
    
    sent 108 bytes  received 35 bytes  286.00 bytes/sec
    total size is 0  speedup is 0.00

    Flume拦截器

    本项目中自定义了两个拦截器,分别是:ETL拦截器、日志类型区分拦截器。

    ETL拦截器主要用于,过滤时间戳不合法和Json数据不完整的日志

    日志类型区分拦截器主要用于,将启动日志和事件日志区分开来,方便发往Kafka的不同Topic。

    打包,拦截器打包之后,只需要单独包,不需要将依赖的包上传。打包之后要放入flume的lib文件夹下面。

    注意:为什么不需要依赖包?因为依赖包在flume的lib目录下面已经存在了。

      采用root用户将flume-interceptor-1.0-SNAPSHOT.jar包放入到hadoop101的/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/flume-ng/lib/文件夹下面

    [root@hadoop101 flume]# cd /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/flume-ng/lib/
    [root@hadoop101 lib]# ll
    总用量 608
    lrwxrwxrwx 1 root root     41 4月   5 18:31 apache-log4j-extras-1.1.jar -> ../../../jars/apache-log4j-extras-1.1.jar
    lrwxrwxrwx 1 root root     29 4月   5 18:31 async-1.4.0.jar -> ../../../jars/async-1.4.0.jar
    lrwxrwxrwx 1 root root     34 4月   5 18:31 asynchbase-1.7.0.jar -> ../../../jars/asynchbase-1.7.0.jar
    lrwxrwxrwx 1 root root     30 4月   5 18:31 avro-ipc.jar -> ../../../lib/avro/avro-ipc.jar
    
    
    [root@hadoop101 lib]# ls | grep interceptor
    flume-interceptor-1.0-SNAPSHOT.jar

    分发Flume到hadoop102

    [root@hadoop101 lib]$ xsync flume-interceptor-1.0-SNAPSHOT.jar

    Flume的具体配置如下:

    1)在CM管理页面hadoop103上Flume的配置中找到代理名称

     

     

    日志生成数据传输到HDFS
    1)将log-collector-1.0-SNAPSHOT-jar-with-dependencies.jar上传都hadoop102的/opt/module目录
    2)分发log-collector-1.0-SNAPSHOT-jar-with-dependencies.jar到hadoop103
    [root@hadoop102 module]# xsync log-collector-1.0-SNAPSHOT-jar-with-dependencies.jar
    3)在/root/bin目录下创建脚本lg.sh
    [root@hadoop102 bin]$ vim lg.sh
        4)在脚本中编写如下内容
    #! /bin/bash
    
        for i in hadoop102 hadoop103 
        do
            ssh $i "java -classpath /opt/module/log-collector-1.0-SNAPSHOT-jar-with-dependencies.jar com.atguigu.appclient.AppMain $1 $2 >/opt/module/test.log &"
        done
    5)修改脚本执行权限
    [root@hadoop102 bin]$ chmod 777 lg.sh
    6)启动脚本
    [root@hadoop102 module]$ lg.sh 

    Kafka的使用

    [root@hadoop101 cloudera]# ll
    总用量 16
    drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 4月   5 20:26 csd
    drwxr-xr-x 2 root         root         4096 4月   5 20:39 parcel-cache
    drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 4月   5 20:38 parcel-repo
    drwxr-xr-x 6 cloudera-scm cloudera-scm 4096 4月   5 20:40 parcels
    [root@hadoop101 cloudera]# cd parcels/
    [root@hadoop101 parcels]# ll
    总用量 12
    lrwxrwxrwx  1 root root   27 4月   5 19:25 CDH -> CDH-5.12.1-1.cdh5.12.1.p0.3
    drwxr-xr-x 11 root root 4096 8月  25 2017 CDH-5.12.1-1.cdh5.12.1.p0.3
    drwxr-xr-x  4 root root 4096 12月 19 2014 HADOOP_LZO-0.4.15-1.gplextras.p0.123
    lrwxrwxrwx  1 root root   25 4月   5 20:40 KAFKA -> KAFKA-3.0.0-1.3.0.0.p0.40
    drwxr-xr-x  6 root root 4096 10月  6 2017 KAFKA-3.0.0-1.3.0.0.p0.40
    [root@hadoop101 parcels]# cd KAFKA
    [root@hadoop101 KAFKA]# ll
    总用量 16
    drwxr-xr-x 2 root root 4096 10月  6 2017 bin
    drwxr-xr-x 5 root root 4096 10月  6 2017 etc
    drwxr-xr-x 3 root root 4096 10月  6 2017 lib
    drwxr-xr-x 2 root root 4096 10月  6 2017 meta
    [root@hadoop101 KAFKA]# /opt/cloudera/parcels/KAFKA/bin/kafka-topics --zookeeper hadoop101:2181 --list

    创建 Kafka Topic

    进入到/opt/cloudera/parcels/KAFKA目录下分别创建:启动日志主题、事件日志主题。

    1)创建启动日志主题

    [root@hadoop101 KAFKA]# ll
    总用量 16
    drwxr-xr-x 2 root root 4096 10月  6 2017 bin
    drwxr-xr-x 5 root root 4096 10月  6 2017 etc
    drwxr-xr-x 3 root root 4096 10月  6 2017 lib
    drwxr-xr-x 2 root root 4096 10月  6 2017 meta
    [root@hadoop101 KAFKA]# bin/kafka-topics --zookeeper hadoop101:2181,hadoop102:2181,hadoop103:2181  --create --replication-factor 1 --partitions 1 --topic topic_start
    [root@hadoop101 KAFKA]# bin/kafka-topics --zookeeper hadoop101:2181,hadoop102:2181,hadoop103:2181  --create --replication-factor 1 --partitions 1 --topic topic_event

     测试Kafka 生产者:

    [root@hadoop101 KAFKA]# bin/kafka-console-producer --broker-list hadoop101:9092 --topic test
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    20/04/05 21:00:45 INFO producer.ProducerConfig: ProducerConfig values: 
        acks = 1
        batch.size = 16384
        bootstrap.servers = [hadoop101:9092]
        buffer.memory = 33554432
        client.id = console-producer
        compression.type = none
        connections.max.idle.ms = 540000
        enable.idempotence = false
        interceptor.classes = null
        key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
        linger.ms = 1000
        max.block.ms = 60000
        max.in.flight.requests.per.connection = 5
        max.request.size = 1048576
        metadata.max.age.ms = 300000
        metric.reporters = []
        metrics.num.samples = 2
        metrics.recording.level = INFO
        metrics.sample.window.ms = 30000
        partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
        receive.buffer.bytes = 32768
        reconnect.backoff.max.ms = 1000
        reconnect.backoff.ms = 50
        request.timeout.ms = 1500
        retries = 3
        retry.backoff.ms = 100
        sasl.jaas.config = null
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.min.time.before.relogin = 60000
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        sasl.kerberos.ticket.renew.window.factor = 0.8
        sasl.mechanism = GSSAPI
        security.protocol = PLAINTEXT
        send.buffer.bytes = 102400
        ssl.cipher.suites = null
        ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
        ssl.endpoint.identification.algorithm = null
        ssl.key.password = null
        ssl.keymanager.algorithm = SunX509
        ssl.keystore.location = null
        ssl.keystore.password = null
        ssl.keystore.type = JKS
        ssl.protocol = TLS
        ssl.provider = null
        ssl.secure.random.implementation = null
        ssl.trustmanager.algorithm = PKIX
        ssl.truststore.location = null
        ssl.truststore.password = null
        ssl.truststore.type = JKS
        transaction.timeout.ms = 60000
        transactional.id = null
        value.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
    
    20/04/05 21:00:45 INFO utils.AppInfoParser: Kafka version : 0.11.0-kafka-3.0.0
    20/04/05 21:00:45 INFO utils.AppInfoParser: Kafka commitId : unknown
    >Hello World
    >Hello kris
    View Code

    Kafka消费者:

    [root@hadoop101 KAFKA]# bin/kafka-console-consumer --bootstrap-server hadoop101:9092 --from-beginning --topic test
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    20/04/05 21:00:54 INFO consumer.ConsumerConfig: ConsumerConfig values: 
        auto.commit.interval.ms = 5000
        auto.offset.reset = earliest
        bootstrap.servers = [hadoop101:9092]
        check.crcs = true
        client.id = 
        connections.max.idle.ms = 540000
        enable.auto.commit = true
        exclude.internal.topics = true
        fetch.max.bytes = 52428800
        fetch.max.wait.ms = 500
        fetch.min.bytes = 1
        group.id = console-consumer-53066
        heartbeat.interval.ms = 3000
        interceptor.classes = null
        internal.leave.group.on.close = true
        isolation.level = read_uncommitted
        key.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
        max.partition.fetch.bytes = 1048576
        max.poll.interval.ms = 300000
        max.poll.records = 500
        metadata.max.age.ms = 300000
        metric.reporters = []
        metrics.num.samples = 2
        metrics.recording.level = INFO
        metrics.sample.window.ms = 30000
        partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor]
        receive.buffer.bytes = 65536
        reconnect.backoff.max.ms = 1000
        reconnect.backoff.ms = 50
        request.timeout.ms = 305000
        retry.backoff.ms = 100
        sasl.jaas.config = null
        sasl.kerberos.kinit.cmd = /usr/bin/kinit
        sasl.kerberos.min.time.before.relogin = 60000
        sasl.kerberos.service.name = null
        sasl.kerberos.ticket.renew.jitter = 0.05
        sasl.kerberos.ticket.renew.window.factor = 0.8
        sasl.mechanism = GSSAPI
        security.protocol = PLAINTEXT
        send.buffer.bytes = 131072
        session.timeout.ms = 10000
        ssl.cipher.suites = null
        ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1]
        ssl.endpoint.identification.algorithm = null
        ssl.key.password = null
        ssl.keymanager.algorithm = SunX509
        ssl.keystore.location = null
        ssl.keystore.password = null
        ssl.keystore.type = JKS
        ssl.protocol = TLS
        ssl.provider = null
        ssl.secure.random.implementation = null
        ssl.trustmanager.algorithm = PKIX
        ssl.truststore.location = null
        ssl.truststore.password = null
        ssl.truststore.type = JKS
        value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer
    
    20/04/05 21:00:54 INFO utils.AppInfoParser: Kafka version : 0.11.0-kafka-3.0.0
    20/04/05 21:00:54 INFO utils.AppInfoParser: Kafka commitId : unknown
    20/04/05 21:00:55 INFO internals.AbstractCoordinator: Discovered coordinator hadoop101:9092 (id: 2147483594 rack: null) for group console-consumer-53066.
    20/04/05 21:00:55 INFO internals.ConsumerCoordinator: Revoking previously assigned partitions [] for group console-consumer-53066
    20/04/05 21:00:55 INFO internals.AbstractCoordinator: (Re-)joining group console-consumer-53066
    20/04/05 21:00:58 INFO internals.AbstractCoordinator: Successfully joined group console-consumer-53066 with generation 1
    20/04/05 21:00:58 INFO internals.ConsumerCoordinator: Setting newly assigned partitions [test-0] for group console-consumer-53066
    Hello World
    Hello kris
    View Code

    查看某个Topic的详情

    [root@hadoop101 KAFKA]# bin/kafka-topics --zookeeper hadoop101:2181  --describe --topic test
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/lib/kafka/libs/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    20/04/05 21:02:39 INFO zkclient.ZkEventThread: Starting ZkClient event thread.
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f, built on 03/23/2017 10:13 GMT
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:host.name=hadoop101
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_144
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.home=/opt/module/jdk1.8.0_144/jre
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.class.path=:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/activation-1.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/aopalliance-1.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/aopalliance-repackaged-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/apacheds-i18n-2.0.0-M15.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/apacheds-kerberos-codec-2.0.0-M15.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/api-asn1-api-1.0.0-M20.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/api-util-1.0.0-M20.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/argparse4j-0.7.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/asm-3.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/avro-1.7.6-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/cglib-2.2.1-v20090111.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-beanutils-1.8.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-beanutils-core-1.8.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-cli-1.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-codec-1.9.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-collections-3.2.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-compress-1.4.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-configuration-1.6.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-digester-1.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-el-1.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-httpclient-3.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-io-2.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-lang-2.6.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-lang3-3.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-logging-1.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-math3-3.1.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-net-3.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/commons-pool2-2.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-api-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-file-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-json-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-runtime-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/connect-transforms-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/gson-2.2.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/guava-20.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/guice-3.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/guice-servlet-3.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-annotations-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-auth-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-common-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-mapreduce-client-common-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-mapreduce-client-core-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-mapreduce-client-jobclient-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-mapreduce-client-shuffle-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-api-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-client-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-common-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-server-common-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hadoop-yarn-server-nodemanager-2.6.0-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hk2-api-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hk2-locator-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/hk2-utils-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/htrace-core4-4.0.1-incubating.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/httpclient-4.4.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/httpcore-4.4.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-annotations-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-core-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-databind-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-jaxrs-1.8.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-jaxrs-base-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-jaxrs-json-provider-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-module-jaxb-annotations-2.8.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jackson-xc-1.8.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jasper-compiler-5.5.23.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jasper-runtime-5.5.23.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javassist-3.20.0-GA.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javassist-3.21.0-GA.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.annotation-api-1.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.inject-1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.inject-2.5.0-b05.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/java-xmlbuilder-0.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/javax.ws.rs-api-2.0.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jaxb-api-2.2.2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-client-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-common-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-container-servlet-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-container-servlet-core-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-guava-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-guice-1.9.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-media-jaxb-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jersey-server-2.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jets3t-0.9.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jettison-1.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-continuation-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-http-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-io-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-security-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-server-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-util-6.1.26.cloudera.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jetty-util-9.2.15.v20160210.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jopt-simple-5.0.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jsch-0.1.42.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jsp-api-2.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/jsr305-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka_2.11-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka_2.11-0.11.0-kafka-3.0.0-sources.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka_2.11-0.11.0-kafka-3.0.0-test-sources.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-clients-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-log4j-appender-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-streams-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-streams-examples-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/kafka-tools-0.11.0-kafka-3.0.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/leveldbjni-all-1.8.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/libthrift-0.9.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/log4j-1.2.17.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/lz4-1.3.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/maven-artifact-3.5.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/metrics-core-2.2.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/metrics-servlet-2.2.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/osgi-resource-locator-1.0.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/paranamer-2.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/plexus-utils-3.0.24.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/protobuf-java-2.5.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/reflections-0.9.11.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/rocksdbjni-5.0.1.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/scala-library-2.11.11.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/scala-parser-combinators_2.11-1.0.4.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-binding-kafka-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-core-common-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-core-model-db-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-core-model-kafka-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-policy-common-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-policy-kafka-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-provider-common-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-provider-db-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/sentry-provider-file-1.5.1-cdh5.11.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/servlet-api-2.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/shiro-core-1.2.3.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/slf4j-api-1.7.25.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/slf4j-api-1.7.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/slf4j-log4j12-1.7.25.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/slf4j-log4j12-1.7.5.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/snappy-java-1.1.2.6.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/stax-api-1.0-2.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/validation-api-1.1.0.Final.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/xmlenc-0.52.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/xz-1.0.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/zkclient-0.10.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/zookeeper-3.4.10.jar:/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40/bin/../lib/kafka/bin/../libs/zookeeper-3.4.5-cdh5.13.0.jar
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:os.version=2.6.32-642.el6.x86_64
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:user.name=root
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/cloudera/parcels/KAFKA-3.0.0-1.3.0.0.p0.40
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=hadoop101:2181 sessionTimeout=30000 watcher=org.I0Itec.zkclient.ZkClient@52e6fdee
    20/04/05 21:02:39 INFO zkclient.ZkClient: Waiting for keeper state SyncConnected
    20/04/05 21:02:39 INFO zookeeper.ClientCnxn: Opening socket connection to server hadoop101/192.168.1.101:2181. Will not attempt to authenticate using SASL (unknown error)
    20/04/05 21:02:39 INFO zookeeper.ClientCnxn: Socket connection established to hadoop101/192.168.1.101:2181, initiating session
    20/04/05 21:02:39 INFO zookeeper.ClientCnxn: Session establishment complete on server hadoop101/192.168.1.101:2181, sessionid = 0x3714a18244c0005, negotiated timeout = 30000
    20/04/05 21:02:39 INFO zkclient.ZkClient: zookeeper state changed (SyncConnected)
    Topic:test    PartitionCount:1    ReplicationFactor:1    Configs:
        Topic: test    Partition: 0    Leader: 52    Replicas: 52    Isr: 52
    20/04/05 21:02:39 INFO zkclient.ZkEventThread: Terminate ZkClient event thread.
    20/04/05 21:02:39 INFO zookeeper.ZooKeeper: Session: 0x3714a18244c0005 closed
    20/04/05 21:02:39 INFO zookeeper.ClientCnxn: EventThread shut down for session: 0x3714a18244c0005
    View Code
    删除 Kafka Topic
    1)删除启动日志主题
    [root@hadoop101 KAFKA]$ bin/kafka-topics --delete --zookeeper hadoop101:2181,hadoop102:2181,hadoop103:2181 --topic topic_start
    2)删除事件日志主题
    [root@hadoop101 KAFKA]$ bin/kafka-topics --delete --zookeeper hadoop101:2181,hadoop102:2181,hadoop103:2181 --topic topic_event

    Oozie基于Hue实现GMV指标流程调度

     1 执行前的准备

      1)添加MySQL驱动文件

    [root@hadoop101 ~]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/cloudera/parcels/CDH
    CDH/                         CDH-5.12.1-1.cdh5.12.1.p0.3/ 
    [root@hadoop101 ~]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib
    lib/     lib64/   libexec/ 
    [root@hadoop101 ~]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop/lib
    [root@hadoop101 ~]# cp /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/sqoop/lib/
    [root@hadoop101 ~]# hadoop fs -put /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /user/oozie/share/lib/lib_20190903103134/sqoop
    put: `/user/oozie/share/lib/lib_20190903103134/sqoop': No such file or directory
    [root@hadoop101 ~]# hadoop fs -put /opt/software/mysql-libs/mysql-connector-java-5.1.27/mysql-connector-java-5.1.27-bin.jar /user/oozie/share/lib/lib_20200405214946/sqoop
    [root@hadoop101 ~]# xsync /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop/lib 
    fname=lib
    pdir=/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop
    ------------------- hadoop102 --------------
    sending incremental file list
    lib/
    lib/mysql-connector-java-5.1.27-bin.jar
    
    sent 877414 bytes  received 246 bytes  1755320.00 bytes/sec
    total size is 3041243  speedup is 3.47
    ------------------- hadoop103 --------------
    sending incremental file list
    lib/
    lib/mysql-connector-java-5.1.27-bin.jar
    
    sent 877414 bytes  received 246 bytes  585106.67 bytes/sec
    total size is 3041243  speedup is 3.47
    [root@hadoop101 ~]# xsync /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/sqoop/lib/
    fname=lib
    pdir=/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/sqoop
    ------------------- hadoop102 --------------
    sending incremental file list
    lib/
    lib/mysql-connector-java-5.1.27-bin.jar
    
    sent 874734 bytes  received 134 bytes  1749736.00 bytes/sec
    total size is 873514  speedup is 1.00
    ------------------- hadoop103 --------------
    sending incremental file list
    lib/
    lib/mysql-connector-java-5.1.27-bin.jar
    
    sent 874734 bytes  received 134 bytes  1749736.00 bytes/sec
    total size is 873514  speedup is 1.00

     2)修改YARN的容器内存yarn.nodemanager.resource.memory-mb为4G

     

     

     

    Hue中创建Oozie任务GMV

    1)生成数据

    安装Oozie可视化js,复制ext-2.2.zip到/opt/cloudera/parcels/CDH/lib/oozie/libext(或/var/lib/oozie)中,解压

    [root@hadoop101 libext]# ll
    总用量 860
    drwxr-xr-x 9 root  root    4096 8月   4 2008 ext-2.2
    -rw-r--r-- 1 oozie oozie 872303 4月   5 22:24 mysql-connector-java.jar
    drwxr-xr-x 5 oozie oozie   4096 4月   5 22:24 tomcat-deployment
    [root@hadoop101 libext]# unzip /opt/software/ext-2.2.zip

    查看Hue的web页面

    选择query->scheduler->workflow

     点击My Workflow->输入gmv

    点击保存

    编写任务脚本并上传到HDFS

     1)点击workspace,查看上传情况

    2)上传要执行的脚本导HDFS路径

      [root@hadoop101 bin]# hadoop fs -put /root/bin/*.sh /user/hue/oozie/workspaces/hue-oozie-1559457755.83/lib

    3)点击左侧的->Documents->gmv

    编写任务调度

    1)点击编辑

    2)选择actions

    3)拖拽控件编写任务

    4)选择执行脚本

    5)Files再选择执行脚本

    6)定义脚本参数名称

     

    7)按顺序执行3-6步骤。

      ods_db.sh、dwd_db.sh、dws_db_wide.sh、ads_db_gmv.sh和sqoop_export.sh

        其中ods_db.sh、dwd_db.sh、dws_db_wide.sh、ads_db_gmv.sh只有一个参数do_date;

          sqoop_export.sh只有一个参数all。

    8)点击保存

    执行任务调度

    1)点击执行

    2)点击提交

     

  • 相关阅读:
    sql2000如何完美压缩.mdf文件
    SQL Server读懂语句运行的统计信息 SET STATISTICS TIME IO PROFILE ON
    Sql Server性能优化辅助指标SET STATISTICS TIME ON和SET STATISTICS IO ON
    理解性能的奥秘——应用程序中慢,SSMS中快(6)——SQL Server如何编译动态SQL
    理解性能的奥秘——应用程序中慢,SSMS中快(5)——案例:如何应对参数嗅探
    理解性能的奥秘——应用程序中慢,SSMS中快(4)——收集解决参数嗅探问题的信息
    理解性能的奥秘——应用程序中慢,SSMS中快(3)——不总是参数嗅探的错
    理解性能的奥秘——应用程序中慢,SSMS中快(2)——SQL Server如何编译存储过程
    巧用Map缓存提升"翻译"速度
    pt-archiver配置自动归档
  • 原文地址:https://www.cnblogs.com/shengyang17/p/12638186.html
Copyright © 2020-2023  润新知