-
Requirement
Use Flume to monitor all files under a given directory and forward the collected data to the Kafka messaging system.
-
一、Flume download address
-
二、Upload and extract Flume
cd /export/softwares
tar -zxvf apache-flume-1.6.0-cdh5.14.0-bin.tar.gz -C ../servers
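If your archive name differs, adjust it accordingly; a quick sanity check that the extraction produced the directory used in the following steps:
ls /export/servers/
# apache-flume-1.6.0-cdh5.14.0-bin should be listed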
-
三、Configure flume_kafka.conf
Use Flume to monitor a directory: as soon as files appear in it, their contents are sent to Kafka.
First, create the directory to be monitored:
mkdir -p /export/servers/flumedata
Then create the Flume configuration file:
cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin/conf
vim flume_kafka.conf
# Name the agent's components
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Bind the source to the channel it writes to
a1.sources.r1.channels = c1

# Source: spool the monitored directory
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /export/servers/flumedata
a1.sources.r1.deletePolicy = never
a1.sources.r1.fileSuffix = .COMPLETED
a1.sources.r1.ignorePattern = ^(.)*\.tmp$
a1.sources.r1.inputCharset = UTF-8

# Channel: memory, i.e. all events are buffered in memory
a1.channels.c1.type = memory

# Sink: Kafka sink, reading from channel c1
a1.sinks.k1.channel = c1
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = test
a1.sinks.k1.kafka.bootstrap.servers = node01:9092,node02:9092,node03:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
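The sink writes to the Kafka topic test, so that topic should exist before the agent starts (unless broker auto-creation is enabled). A sketch of creating it, run from your Kafka installation directory, assuming ZooKeeper is reachable at node01:2181; the partition and replication counts are only illustrative:
bin/kafka-topics.sh --create --zookeeper node01:2181 --replication-factor 2 --partitions 3 --topic test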
-
四、Start Flume
cd /export/servers/apache-flume-1.6.0-cdh5.14.0-bin
bin/flume-ng agent --conf conf --conf-file conf/flume_kafka.conf --name a1 -Dflume.root.logger=INFO,console
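The command above runs the agent in the foreground and logs to the console, which is convenient for debugging. If you prefer to keep it running in the background, one possible variant (log file name is just an example) is:
nohup bin/flume-ng agent --conf conf --conf-file conf/flume_kafka.conf --name a1 > flume_kafka.log 2>&1 &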
-
五、Test the integration
After Flume has started successfully, start a Kafka console consumer:
bin/kafka-console-consumer.sh --from-beginning --bootstrap-server node01:9092 --topic test
Then upload a text file to the /export/servers/flumedata directory; the consumer should print its contents.
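For a quick test you can also create a file in place instead of uploading one; a minimal example (file name and content are arbitrary):
echo "hello flume to kafka" > /export/servers/flumedata/test_1.txt
Once Flume picks the file up it is renamed with the .COMPLETED suffix, and the line should appear in the console consumer's output.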