1. Download
[linyouyi@hadoop01 software]$ wget https://mirrors.aliyun.com/apache/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz
--2019-09-05 14:39:06-- https://mirrors.aliyun.com/apache/flume/1.9.0/apache-flume-1.9.0-bin.tar.gz
Resolving mirrors.aliyun.com (mirrors.aliyun.com)... 27.148.180.227, 119.147.111.230, 119.147.111.231, ...
Connecting to mirrors.aliyun.com (mirrors.aliyun.com)|27.148.180.227|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 67938106 (65M) [application/gzip]
Saving to: ‘apache-flume-1.9.0-bin.tar.gz’
100%[=======================================================================>] 67,938,106 30.0MB/s in 2.2s
2019-09-05 14:39:08 (30.0 MB/s) - ‘apache-flume-1.9.0-bin.tar.gz’ saved [67938106/67938106]
2. Extract
[linyouyi@hadoop01 software]$ tar -zxvf apache-flume-1.9.0-bin.tar.gz -C /hadoop/module/
[linyouyi@hadoop01 software]$ cd /hadoop/module/
[linyouyi@hadoop01 module]$ cd apache-flume-1.9.0-bin/
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ ll
total 176
drwxr-xr-x  2 linyouyi linyouyi  4096 Sep  5 14:40 bin
-rw-rw-r--  1 linyouyi linyouyi 85602 Nov 29  2018 CHANGELOG
drwxr-xr-x  2 linyouyi linyouyi  4096 Sep  5 14:40 conf
-rw-r--r--  1 linyouyi linyouyi  5681 Nov 16  2017 DEVNOTES
-rw-r--r--  1 linyouyi linyouyi  2873 Nov 16  2017 doap_Flume.rdf
drwxrwxr-x 12 linyouyi linyouyi  4096 Dec 18  2018 docs
drwxrwxr-x  2 linyouyi linyouyi  4096 Sep  5 14:40 lib
-rw-rw-r--  1 linyouyi linyouyi 43405 Dec 10  2018 LICENSE
-rw-r--r--  1 linyouyi linyouyi   249 Nov 29  2018 NOTICE
-rw-r--r--  1 linyouyi linyouyi  2483 Nov 16  2017 README.md
-rw-rw-r--  1 linyouyi linyouyi  1958 Dec 10  2018 RELEASE-NOTES
drwxrwxr-x  2 linyouyi linyouyi  4096 Sep  5 14:40 tools
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ ll conf/
total 16
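3. Start the agent
The agent is started with a shell script named flume-ng, located in the bin directory of the Flume distribution. On the command line you need to specify the agent name, the config directory, and the configuration file: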
bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties.template
-n agent_name                            # agent name
-c conf                                  # configuration directory
-f conf/flume-conf.properties.template   # configuration file
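The three options above can be wrapped in a small helper. This is only a sketch: the function name flume_agent_cmd is hypothetical, the agent name used in the demo call is an assumption (it must match a name defined in the configuration file), and the function prints the command instead of running it, so it can be tried without a Flume installation.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the flume-ng command line
# for a given agent name, config directory and config file.
flume_agent_cmd() {
  agent_name=$1   # value passed to -n
  conf_dir=$2     # value passed to -c
  conf_file=$3    # value passed to -f
  printf 'bin/flume-ng agent -n %s -c %s -f %s\n' \
    "$agent_name" "$conf_dir" "$conf_file"
}

# Demo call; "agent" is an assumed agent name, not taken from the text above.
flume_agent_cmd agent conf conf/flume-conf.properties.template
```

Once the printed line looks right, it can be pasted into the shell from the Flume installation directory.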
4. A simple example
http://flume.apache.org/FlumeUserGuide.html#netcat-tcp-source
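Here is an example configuration file describing a single-node Flume deployment. This configuration lets a user generate events, which are then logged to the console.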
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ vim conf/example.conf
# example.conf: a single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
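This configuration defines a single agent named a1. a1 has a source that listens for data on port 44444, a channel that buffers event data in memory, and a sink that logs event data to the console. The configuration file names the various components, then describes their types and configuration parameters. A given configuration file may define several named agents; when a Flume process is launched, a flag is passed telling it which named agent to run.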
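Because one file may hold several named agents, it can help to list the names before choosing one to start. The sketch below assumes that every property key begins with the agent name followed by a dot; the helper name list_agents and the two-agent demo file are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list the agent names found in a Flume properties
# file, i.e. the text before the first dot of each non-comment key.
list_agents() {
  grep -v '^[[:space:]]*#' "$1" \
    | grep '=' \
    | sed 's/^[[:space:]]*//' \
    | cut -d. -f1 \
    | sort -u
}

# Demo on a throwaway file that defines two agents, a1 and a2.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
# two agents in one file
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a2.sources = r1
a2.channels = c1
EOF

list_agents "$demo_conf"
rm -f "$demo_conf"
```

Any name it prints is a candidate value for the agent name given on the flume-ng command line.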
Given this configuration file, we can start Flume as follows:
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ bin/flume-ng agent --conf conf --conf-file conf/example.conf --name a1 -Dflume.root.logger=INFO,console
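Note that a full deployment would typically include one more option: --conf=<conf-dir>. The <conf-dir> directory would contain a shell script, flume-env.sh, and potentially a log4j properties file. In this example we pass a Java option to force Flume to log to the console, and we do without a custom environment script.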
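A quick way to see which of these files a conf directory actually contains is sketched below. The helper name check_conf_dir is hypothetical, and the file name log4j.properties is an assumption based on the Flume 1.9.0 distribution; other versions may use a different logging configuration file.

```shell
#!/bin/sh
# Hypothetical helper: report which optional files a Flume conf
# directory contains. The file names are assumptions for Flume 1.9.0.
check_conf_dir() {
  dir=$1
  for f in flume-env.sh log4j.properties; do
    if [ -f "$dir/$f" ]; then
      echo "found: $f"
    else
      echo "missing: $f"
    fi
  done
}

# Run from the Flume installation directory.
check_conf_dir conf
```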
From a separate terminal, we can telnet to port 44444 and send Flume an event:
$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Hello world! <ENTER>
OK
The original Flume terminal will output the event in a log message.
12/06/19 15:32:19 INFO source.NetcatSource: Source starting
12/06/19 15:32:19 INFO source.NetcatSource: Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/127.0.0.1:44444]
12/06/19 15:32:34 INFO sink.LoggerSink: Event: { headers:{} body: 48 65 6C 6C 6F 20 77 6F 72 6C 64 21 0D Hello world!. }
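The logger sink prints the event body as hexadecimal bytes followed by a printable rendering. The trailing 0D is the carriage return that telnet sends at the end of the line, shown as a dot. A minimal sketch for turning such a hex dump back into text follows; the helper name decode_body is hypothetical, and it relies only on the octal escapes of the standard printf utility.

```shell
#!/bin/sh
# Hypothetical helper: turn a space-separated hex dump back into text.
decode_body() {
  for byte in $1; do
    # Convert each hex byte to a three-digit octal escape, the form
    # that printf accepts in its format string.
    printf "\\$(printf '%03o' "0x$byte")"
  done
  printf '\n'
}

# Bytes copied from the log line above, without the trailing 0D.
decode_body "48 65 6C 6C 6F 20 77 6F 72 6C 64 21"
```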
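Congratulations, you have successfully configured and deployed a Flume agent! Subsequent sections cover agent configuration in more detail.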
5. Collecting data with the exec source
http://flume.apache.org/FlumeUserGuide.html#exec-source
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ cp conf/example.conf conf/example-exec.conf
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ vim conf/example-exec.conf
# example-exec.conf: a single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /hadoop/module/text.log

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel that buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Start the agent
[linyouyi@hadoop01 apache-flume-1.9.0-bin]$ bin/flume-ng agent --conf conf --conf-file conf/example-exec.conf --name a1 -Dflume.root.logger=INFO,console
Open another terminal and keep appending data to /hadoop/module/text.log; the events show up in the output of the original Flume terminal.
[linyouyi@hadoop01 module]$ echo "flume" >> text.log
[linyouyi@hadoop01 module]$ echo "flume" >> text.log
[linyouyi@hadoop01 module]$ cat text.log
flume
flume
flume
[linyouyi@hadoop01 module]$ echo "flume" >> text.log
[linyouyi@hadoop01 module]$ echo "hello linyouyi" >> text.log
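Typing echo by hand works for a few events; for a steadier stream, a small loop can do the appending. The sketch below is illustrative: the helper name append_events and the message text are hypothetical, and the demo writes to a scratch file rather than the real log.

```shell
#!/bin/sh
# Hypothetical helper: append COUNT numbered lines to FILE, pausing
# DELAY seconds (default 1) after each write.
append_events() {
  file=$1
  count=$2
  delay=${3:-1}
  i=1
  while [ "$i" -le "$count" ]; do
    echo "flume event $i" >> "$file"
    sleep "$delay"
    i=$((i + 1))
  done
}

# Demo: write three lines to a scratch file and show them.
scratch=$(mktemp)
append_events "$scratch" 3 0
cat "$scratch"
rm -f "$scratch"
```

To feed the agent configured above, pass /hadoop/module/text.log as the first argument; each appended line should then appear as one event in the Flume terminal.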
Output in the original terminal
2019-09-05 15:38:29,208 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.ExecSource.start(ExecSource.java:170)] Exec source starting with command: tail -F /hadoop/module/text.log
2019-09-05 15:38:29,209 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] Monitored counter group for type: SOURCE, name: r1: Successfully registered new MBean.
2019-09-05 15:38:29,209 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: SOURCE, name: r1 started
2019-09-05 15:38:33,213 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 66 6C 75 6D 65 flume }
2019-09-05 15:38:33,213 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 66 6C 75 6D 65 flume }
2019-09-05 15:38:33,213 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 66 6C 75 6D 65 flume }
2019-09-05 15:38:35,263 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 66 6C 75 6D 65 flume }
2019-09-05 15:39:05,265 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 68 65 6C 6C 6F 20 6C 69 6E 79 6F 75 79 69 hello linyouyi }
Using environment variables in the configuration file