1. ZooKeeper configuration
Kafka depends on ZooKeeper, so ZooKeeper needs to be running first; the downloaded Kafka tar package already includes it.
dataDir needs to be changed; keeping it under /tmp is dangerous.
dataDir=/data/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0
maxClientCnxns limits the number of connections per IP (0 disables the limit); it is worth setting a sensible value.
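For reference, a minimal sketch of creating the data directory and starting the bundled ZooKeeper, assuming the default layout of the Kafka distribution and running from the unpacked Kafka directory:

mkdir -p /data/zookeeper
bin/zookeeper-server-start.sh config/zookeeper.properties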
2. Kafka configuration
There are quite a few configurable options; below is a summary of the ones that must be set.
# The address the socket server listens on. It will get the value returned from
# java.net.InetAddress.getCanonicalHostName() if not configured.
# FORMAT:
# listeners = listener_name://host_name:port
# EXAMPLE:
# listeners = PLAINTEXT://your.host.name:9092
#listeners=PLAINTEXT://:9092
# Hostname and port the broker will advertise to producers and consumers. If not set,
# it uses the value for "listeners" if configured. Otherwise, it will use the value
# returned from java.net.InetAddress.getCanonicalHostName().
#advertised.listeners=PLAINTEXT://your.host.name:9092
listeners is used in many places. If it is not configured, the hostname is used by default, so it is recommended to set it explicitly. advertised.listeners can usually stay the same as listeners; it only matters in special cases, such as when clients reach the broker through a different address than the one it binds to.
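A rough sketch of the two settings, assuming the broker binds to all interfaces but should be advertised to clients as kafka1.example.com (a placeholder hostname):

listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://kafka1.example.com:9092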
log.dirs=/tmp/kafka-logs
Don't keep this under /tmp.
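For example, point the log directories at a dedicated data disk; log.dirs also accepts a comma-separated list if there are several disks (the paths below are placeholders):

log.dirs=/data/kafka-logs
# with multiple disks:
# log.dirs=/data1/kafka-logs,/data2/kafka-logs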
# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=1
The default number of partitions per topic; worth setting explicitly.
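num.partitions is the default used when a topic is created without an explicit partition count (for example via auto-creation); when creating a topic by hand you can pass the count yourself. A sketch using the Kafka CLI (newer versions replace --zookeeper with --bootstrap-server; the topic name is a placeholder):

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 3 --topic test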
# The minimum age of a log file to be eligible for deletion due to age
log.retention.hours=168
How long data is kept. The default is 168 hours, i.e. 7 days; adjust it to your needs.
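For example, to keep data for 3 days instead of 7:

log.retention.hours=72

Note that the finer-grained settings log.retention.minutes and log.retention.ms, if set, take precedence over log.retention.hours.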
# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=localhost:2181
# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=6000
These ZooKeeper settings must match the actual ZooKeeper deployment; not much more to say here.
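A sketch of a multi-node setup using the optional chroot mentioned in the comment above (the hostnames are placeholders):

zookeeper.connect=zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181/kafka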
3. Filebeat: configuring the input and the Kafka output
https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-input-log.html
These options make it possible for Filebeat to decode logs structured as JSON messages. Filebeat processes the logs line by line, so the JSON decoding only works if there is one JSON object per line.
The JSON objects are separated by newlines, one object per line.
The decoding happens before line filtering and multiline. You can combine JSON decoding with filtering and multiline if you set the message_key option. This can be helpful in situations where the application logs are wrapped in JSON objects, as it happens for example with Docker.
json:
  keys_under_root: true
  add_error_key: true
  # message_key: log
keys_under_root
By default, the decoded JSON is placed under a "json" key in the output document. If you enable this setting, the keys are copied top level in the output document. The default is false.
add_error_key
If this setting is enabled, Filebeat adds an "error.message" and "error.type: json" key in case of JSON unmarshalling errors or when a message_key is defined in the configuration but cannot be used.
Set those two. If all of your logs go to the same output and you want to filter them, also set message_key; if the JSON logs are written to their own file, it is not needed.
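Putting it together, a minimal filebeat.yml sketch with a log input and a Kafka output, assuming a recent Filebeat version where inputs are configured under filebeat.inputs (the path, hosts, and topic name are placeholders):

filebeat.inputs:
- type: log
  paths:
    - /var/log/myapp/*.json
  json:
    keys_under_root: true
    add_error_key: true
    # message_key: log   # only when combining JSON decoding with filtering/multiline

output.kafka:
  hosts: ["localhost:9092"]   # should match the broker's advertised.listeners
  topic: "filebeat-logs"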
Filebeat is much more powerful than you might expect.