I. Download Kafka:
http://kafka.apache.org/downloads
II. Extract the archive
tar -zxvf kafka_2.10-0.10.0.1.tgz
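III. Kafka depends on ZooKeeper, which can be a single node or a ZooKeeper cluster.
(1) Single-node ZooKeeper
Kafka ships with a ZooKeeper instance for testing, which is enough for a quick trial.
1. Start the single-node ZooKeeper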
bin/zookeeper-server-start.sh config/zookeeper.properties
2. Start the Kafka server:
bin/kafka-server-start.sh config/server.properties
3. Create a topic
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
4. Start a producer, which generates data and sends it to Kafka
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
5. Start a consumer, which receives the data that the producer sent through Kafka
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
Type some text in the producer console and it appears in the consumer console.
[hadoop@master2 kafka_2.10-0.10.0.1]$ bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
java
This is a message
This is another message
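The console producer reads messages from standard input, one per line, so the manual test above can also be scripted. The following is only a sketch: the function name send_messages and the variables KAFKA_HOME and BROKER_LIST are names chosen for this example, not part of Kafka.

```shell
# Assumed variables: where Kafka is unpacked and which broker to contact.
KAFKA_HOME="${KAFKA_HOME:-.}"
BROKER_LIST="${BROKER_LIST:-localhost:9092}"

send_messages() {
    # $1: topic name. Message lines are read from stdin, one message per line.
    "$KAFKA_HOME/bin/kafka-console-producer.sh" \
        --broker-list "$BROKER_LIST" --topic "$1"
}

# Usage (requires a running broker):
#   printf 'This is a message\nThis is another message\n' | send_messages test
```

Piping the input this way makes the producer exit when the input ends, which is convenient for smoke tests.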
(2) ZooKeeper cluster mode:
1. Edit config/server.properties (vi config/server.properties): point zookeeper.connect at the ZooKeeper cluster nodes and set the Kafka data directory
#zookeeper.connect=localhost:2181
zookeeper.connect=node1:2181,node2:2181,node3:2181
# Kafka data directory
# A comma separated list of directories under which to store log files
log.dirs=/data/kafka_2.10-0.10.0.1/message-folder
delete.topic.enable=true
# Set host.name; otherwise org.apache.kafka.common.errors.TimeoutException may be thrown
# https://blog.csdn.net/lifuxiangcaohui/article/details/73350940
host.name=192.168.232.128
2. Start the ZooKeeper cluster
3. Start Kafka with the modified server.properties
bin/kafka-server-start.sh config/server.properties
Or run it in the background:
nohup bin/kafka-server-start.sh config/server.properties > kafka_run.log 2>&1 &
Startup log:
[hadoop@master2 kafka_2.10-0.10.0.1]$ bin/kafka-server-start.sh config/server.properties
[2016-10-09 01:21:38,298] INFO KafkaConfig values:
    request.timeout.ms = 30000
    log.roll.hours = 168
    inter.broker.protocol.version = 0.10.0-IV1
    log.preallocate = false
    security.inter.broker.protocol = PLAINTEXT
    .......
 (kafka.server.KafkaConfig)
[2016-10-09 01:21:38,373] INFO starting (kafka.server.KafkaServer)
[2016-10-09 01:21:38,383] INFO Connecting to zookeeper on node1:2181,node2:2181,node3:2181 (kafka.server.KafkaServer)
[2016-10-09 01:21:38,414] INFO Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,414] INFO Client environment:host.name=master2 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,415] INFO Client environment:java.version=1.7.0_79 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,428] INFO Client environment:java.vendor=Oracle Corporation (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,428] INFO Client environment:java.home=/data/jdk1.7.0_79/jre (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,429] INFO Client 
environment:java.class.path=:/data/kafka_2.10-0.10.0.1/bin/../libs/aopalliance-repackaged-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/argparse4j-0.5.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/connect-api-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/connect-file-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/connect-json-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/connect-runtime-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/guava-18.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/hk2-api-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/hk2-locator-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/hk2-utils-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-annotations-2.6.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-core-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-databind-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-jaxrs-base-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-jaxrs-json-provider-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jackson-module-jaxb-annotations-2.6.3.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javassist-3.18.2-GA.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.annotation-api-1.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.inject-1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.inject-2.4.0-b34.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.servlet-api-3.1.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/javax.ws.rs-api-2.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-client-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-common-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-container-servlet-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-container-servlet-core-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-guava-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-media-jaxb-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jersey-server-2.22.2.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-continuation-9.2.15.v201602
10.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-http-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-io-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-security-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-server-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-servlet-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-servlets-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jetty-util-9.2.15.v20160210.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/jopt-simple-4.9.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka_2.10-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka_2.10-0.10.0.1-sources.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka_2.10-0.10.0.1-test-sources.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-clients-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-log4j-appender-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-streams-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-streams-examples-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/kafka-tools-0.10.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/log4j-1.2.17.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/lz4-1.3.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/metrics-core-2.2.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/osgi-resource-locator-1.0.1.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/reflections-0.9.10.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/rocksdbjni-4.8.0.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/scala-library-2.10.6.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/slf4j-api-1.7.21.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/slf4j-log4j12-1.7.21.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/snappy-java-1.1.2.6.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/validation-api-1.1.0.Final.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/zkclient-0.8.jar:/data/kafka_2.10-0.10.0.1/bin/../libs/zookeeper-3.4.6.jar (org.apache.zookeeper.ZooKeeper) [2016-10-09 01:21:38,430] INFO Client 
environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,430] INFO Client environment:java.io.tmpdir=/tmp (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,430] INFO Client environment:java.compiler=<NA> (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,430] INFO Client environment:os.name=Linux (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:os.arch=i386 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:os.version=2.6.18-92.el5 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:user.name=hadoop (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:user.home=/home/hadoop (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,431] INFO Client environment:user.dir=/data/kafka_2.10-0.10.0.1 (org.apache.zookeeper.ZooKeeper)
[2016-10-09 01:21:38,433] INFO Starting ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
...........
[2016-10-09 01:21:39,870] INFO Kafka commitId : a7a17cdec9eaa6c5 (org.apache.kafka.common.utils.AppInfoParser)
[2016-10-09 01:21:39,872] INFO [Kafka Server 0], started (kafka.server.KafkaServer)
4. Create a topic
bin/kafka-topics.sh --create --zookeeper node1:2181,node2:2181,node3:2181 --replication-factor 1 --partitions 1 --topic test
If the topic already exists, an error is reported:
[2016-10-09 01:23:35,106] ERROR kafka.common.TopicExistsException: Topic "test" already exists.
    at kafka.admin.AdminUtils$.createOrUpdateTopicPartitionAssignmentPathInZK(AdminUtils.scala:420)
    at kafka.admin.AdminUtils$.createTopic(AdminUtils.scala:404)
    at kafka.admin.TopicCommand$.createTopic(TopicCommand.scala:110)
    at kafka.admin.TopicCommand$.main(TopicCommand.scala:61)
    at kafka.admin.TopicCommand.main(TopicCommand.scala)
 (kafka.admin.TopicCommand$)
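When topic creation is scripted, the error above can be avoided with the --if-not-exists option of kafka-topics.sh, which only creates the topic when it is missing. A minimal sketch follows; the wrapper name create_topic and the variables KAFKA_HOME and ZK_CONNECT are assumptions made for this example.

```shell
# Assumed variables: Kafka install directory and ZooKeeper connection string.
KAFKA_HOME="${KAFKA_HOME:-.}"
ZK_CONNECT="${ZK_CONNECT:-node1:2181,node2:2181,node3:2181}"

create_topic() {
    # $1: topic name, $2: number of partitions, $3: replication factor
    "$KAFKA_HOME/bin/kafka-topics.sh" --create --if-not-exists \
        --zookeeper "$ZK_CONNECT" \
        --replication-factor "$3" --partitions "$2" --topic "$1"
}

# Usage (requires ZooKeeper and at least one broker to be running):
#   create_topic test 1 1
```

Running the wrapper twice with the same topic name should then be harmless, which is useful in provisioning scripts.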
5. List the existing topics
[hadoop@master2 kafka_2.10-0.10.0.1]$ bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181
test
6. Start a data producer
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
7. Start a data consumer
bin/kafka-console-consumer.sh --zookeeper node1:2181,node2:2181,node3:2181 --topic test --from-beginning
Test:
Type some data in the producer console.
The same data appears in the consumer console:
[hadoop@master2 kafka_2.10-0.10.0.1]$ bin/kafka-console-consumer.sh --zookeeper node1:2181,node2:2181,node3:2181 --topic test --from-beginning
java
This is a message
This is another message
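IV. Install a Kafka cluster
I installed two Kafka nodes on two machines.
1. Copy Kafka to the other machine.
2. Edit config/server.properties on each node and change broker.id to a different number. It must be a non-negative integer and must not be the same as any other node's id.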
broker.id=2
3. Start Kafka on each node
bin/kafka-server-start.sh config/server.properties
4. If the Kafka data directory configured in server.properties (the log.dirs setting) already contains metadata, either clear that directory or edit broker.id in the meta.properties file under it so that it matches broker.id in server.properties.
Otherwise startup may fail with the following error:
[2016-10-12 00:09:10,898] FATAL Fatal error during KafkaServer startup. Prepare to shutdown (kafka.server.KafkaServer)
kafka.common.InconsistentBrokerIdException: Configured broker.id 1 doesn't match stored broker.id 0 in meta.properties. If you moved your data, make sure your configured broker.id matches. If you intend to create a new broker, you should remove all data in your data directories (log.dirs).
    at kafka.server.KafkaServer.getBrokerId(KafkaServer.scala:648)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:187)
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
    at kafka.Kafka$.main(Kafka.scala:67)
    at kafka.Kafka.main(Kafka.scala)
[2016-10-12 00:09:10,900] INFO shutting down (kafka.server.KafkaServer)
[2016-10-12 00:09:10,914] INFO Shutting down. (kafka.log.LogManager)
[2016-10-12 00:09:11,113] INFO Shutdown complete. (kafka.log.LogManager)
[2016-10-12 00:09:11,115] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
[2016-10-12 00:09:11,136] INFO EventThread shut down (org.apache.zookeeper.ClientCnxn)
[2016-10-12 00:09:11,136] INFO Session: 0x257b7b394f70000 closed (org.apache.zookeeper.ZooKeeper)
[2016-10-12 00:09:11,140] INFO shut down completed (kafka.server.KafkaServer)
[2016-10-12 00:09:11,142] FATAL Fatal error during KafkaServerStartable startup. Prepare to shutdown (kafka.server.KafkaServerStartable)
kafka.common.InconsistentBrokerIdException: Configured broker.id 1 doesn't match stored broker.id 0 in meta.properties. If you moved your data, make sure your configured broker.id matches. If you intend to create a new broker, you should remove all data in your data directories (log.dirs).
    at kafka.server.KafkaServer.getBrokerId(KafkaServer.scala:648)
    at kafka.server.KafkaServer.startup(KafkaServer.scala:187)
    at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:37)
    at kafka.Kafka$.main(Kafka.scala:67)
    at kafka.Kafka.main(Kafka.scala)
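The mismatch can be detected before starting the broker by comparing broker.id in server.properties with the value stored in meta.properties under each log.dirs directory. The script below is a sketch under stated assumptions: prop_get and check_broker_id are helper names invented here, broker.id is assumed to be set explicitly in server.properties, and both files are assumed to use plain key=value lines.

```shell
prop_get() {
    # $1: properties file, $2: key. Prints the last value set for the key.
    awk -v key="$2" '
        /^[ \t]*#/ { next }                 # skip comment lines
        {
            eq = index($0, "=")
            if (eq == 0) next
            name = substr($0, 1, eq - 1)
            value = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) found = value
        }
        END { if (found != "") print found }
    ' "$1"
}

check_broker_id() {
    # $1: path to server.properties. Returns 1 if any data directory
    # holds a meta.properties with a different broker.id.
    configured=$(prop_get "$1" broker.id)
    log_dirs=$(prop_get "$1" log.dirs)
    status=0
    old_ifs=$IFS
    IFS=,                                   # log.dirs is comma-separated
    for dir in $log_dirs; do
        meta="$dir/meta.properties"
        [ -f "$meta" ] || continue          # fresh directory, nothing stored yet
        stored=$(prop_get "$meta" broker.id)
        if [ "$stored" != "$configured" ]; then
            echo "MISMATCH: $meta has broker.id=$stored, config has broker.id=$configured"
            status=1
        fi
    done
    IFS=$old_ifs
    return $status
}

# Usage: check_broker_id config/server.properties && bin/kafka-server-start.sh config/server.properties
```

The check only reports the problem; whether to clear the directory or edit meta.properties remains a manual decision, because clearing the directory deletes that broker's data.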
5. Create a topic on one of the machines:
bin/kafka-topics.sh --create --zookeeper node1:2181,node2:2181,node3:2181 --replication-factor 2 --partitions 2 --topic test-3
6. List the topics to confirm that it was created:
[hadoop@master1 kafka_2.10-0.10.0.1]$ bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181
test-3
Check the data directory; the partition folders are present on both machines:
[hadoop@master2 message-folder]$ ll
total 24
-rw-rw-r-- 1 hadoop hadoop    4 Oct 12 00:51 cleaner-offset-checkpoint
-rw-rw-r-- 1 hadoop hadoop   54 Oct  9 20:55 meta.properties
-rw-rw-r-- 1 hadoop hadoop   26 Oct 12 00:52 recovery-point-offset-checkpoint
-rw-rw-r-- 1 hadoop hadoop   26 Oct 12 00:52 replication-offset-checkpoint
drwxrwxr-x 2 hadoop hadoop 4096 Oct 12 00:52 test-3-0
drwxrwxr-x 2 hadoop hadoop 4096 Oct 12 00:52 test-3-1
The Kafka cluster is installed successfully.
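V. Common server.properties settings:
# Kafka broker id; must be a non-negative integer and unique within the cluster
broker.id=0
# Number of threads that handle network requests
num.network.threads=2
# Number of I/O threads
num.io.threads=8
# Maximum number of requests queued for the I/O threads
queued.max.requests=500
# Socket send buffer size
socket.send.buffer.bytes=1048576
# Socket receive buffer size
socket.receive.buffer.bytes=1048576
# Maximum size of a socket request, in bytes
socket.request.max.bytes=104857600
# Kafka data directories; separate multiple directories with commas
log.dirs=/data/kafka_2.10-0.10.0.1/message-folder
# Default number of partitions per topic
num.partitions=2
# Data retention time in hours; the default is 7 days
log.retention.hours=168
# Maximum size of a single log segment file
log.segment.bytes=536870912
# Interval at which log segments are checked against the retention policy (log.retention.hours or log.retention.bytes)
log.retention.check.interval.ms=60000
# Whether to enable the log cleaner (log compaction)
log.cleaner.enable=false
# How long delete markers are retained in compacted logs (86400000 ms = 1 day)
log.cleaner.delete.retention.ms=86400000
# ZooKeeper connection string; separate multiple addresses with commas
zookeeper.connect=localhost:2181
# ZooKeeper connection timeout
zookeeper.connection.timeout.ms=1000000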
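VI. Common commands:
(1) The kafka-topics.sh script
1. Script options
--alter                        Change a topic's partition configuration, such as the number of partitions or the replica assignment
--config                       A topic configuration setting
--create                       Create a topic
--delete                       Delete a topic
--delete-config                Remove a topic configuration setting
--describe                     Show detailed information about a topic
--disable-rack-aware           Disable rack aware replica assignment
--help                         Print usage information
--if-exists                    When altering or deleting a topic, only act if the topic exists
--if-not-exists                When creating a topic, only act if the topic does not already exist
--list                         List all available topics
--partitions                   Set the number of partitions
--replica-assignment           A list of manual partition-to-broker assignments
--topic                        Set the topic name
--topics-with-overrides        When describing topics, only show topics that have overridden configs
--unavailable-partitions       When describing topics, only show partitions whose leader is not available
--under-replicated-partitions  When describing topics, only show under-replicated partitions
--zookeeper                    ZooKeeper connection string, in the form host:port,host:port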
Examples:
1. Create a topic
Create a topic named test-1 with a replication factor of 1 and 1 partition.
bin/kafka-topics.sh --create --zookeeper node1:2181,node2:2181,node3:2181 --replication-factor 1 --partitions 1 --topic test-1
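Note: the replication factor cannot exceed the number of brokers in the Kafka cluster, but the number of partitions can.
2. List the topics
bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181
3. Delete a topic
bin/kafka-topics.sh --delete --zookeeper node1:2181,node2:2181,node3:2181 --topic test-3
List the topics again; the topic has not actually been removed yet.
bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181
The console shows: Topic test-3 is marked for deletion.
Solutions:
A. Manual deletion:
First delete the topic's data on every broker. In the directories configured by log.dirs in server.properties, remove the folders named after the topic followed by a partition number (for example test-3-0).
Then delete the topic's data in ZooKeeper (run these in the ZooKeeper command-line client):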
rmr /brokers/topics/{topic_name}
rmr /admin/delete_topics/{topic_name}
rmr /config/topics/{topic_name}
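A plain prefix match on the topic name is risky when deleting folders by hand: a prefix match on test would also hit test-3-0, which belongs to the topic test-3. The helper below is a sketch that only lists folders of the form <topic>-<partition number>; the function name list_topic_dirs is made up for this example, and it deliberately prints paths instead of deleting them.

```shell
list_topic_dirs() {
    # $1: one directory from log.dirs, $2: topic name
    for d in "$1/$2"-*; do
        [ -d "$d" ] || continue
        suffix=${d#"$1/$2-"}            # what follows "<topic>-"
        case $suffix in
            ''|*[!0-9]*) ;;             # not a bare partition number, so another topic
            *) echo "$d" ;;
        esac
    done
}

# Usage: review the list first, then remove the folders on each broker.
#   list_topic_dirs /data/kafka_2.10-0.10.0.1/message-folder test-3
```

Keeping the listing separate from the removal leaves a chance to review the paths before anything is deleted.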
B. Let Kafka delete the topic itself:
Topic deletion must be enabled when the broker starts, by adding the following to server.properties:
delete.topic.enable=true
4. List all topics
bin/kafka-topics.sh --list --zookeeper node1:2181,node2:2181,node3:2181
5. Show detailed information about a topic
bin/kafka-topics.sh --describe --zookeeper node1:2181,node2:2181,node3:2181 --topic test
Result:
Topic:test  PartitionCount:2  ReplicationFactor:1  Configs:
    Topic: test  Partition: 0  Leader: 1002  Replicas: 1002  Isr: 1002
    Topic: test  Partition: 1  Leader: 1003  Replicas: 1003  Isr: 1003
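The describe output is easy to post-process. As an illustration, the following sketch prints every partition whose Isr list is shorter than its Replicas list, meaning some replica has fallen out of sync. It assumes the whitespace-separated "Name: value" layout shown above; the function name under_replicated is hypothetical.

```shell
under_replicated() {
    # Reads `kafka-topics.sh --describe` output on stdin and prints
    # "<topic> <partition>" for partitions with fewer ISR members than replicas.
    awk '
        $1 == "Topic:" && $3 == "Partition:" {
            replicas = ""; isr = ""
            for (i = 1; i < NF; i++) {
                if ($i == "Replicas:") replicas = $(i + 1)
                if ($i == "Isr:")      isr = $(i + 1)
            }
            n_replicas = split(replicas, r, ",")
            n_isr = split(isr, s, ",")
            if (n_isr < n_replicas) print $2, $4
        }
    '
}

# Usage:
#   bin/kafka-topics.sh --describe --zookeeper node1:2181,node2:2181,node3:2181 --topic test | under_replicated
```

The built-in --under-replicated-partitions option gives similar information directly; the script form is mainly useful when the output is already captured in a file or log.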
6. Change the number of partitions of a topic
bin/kafka-topics.sh --zookeeper node1:2181,node2:2181,node3:2181 --alter --topic test --partitions 2
Result:
WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
Adding partitions succeeded!
7. Add replicas
kafka-topics.sh cannot increase the number of replicas.
Instead, create a JSON file in which "partition" is the partition number and "replicas" lists the ids of the brokers that should hold that partition's replicas. Both can be looked up with the describe command.
For example:
bin/kafka-topics.sh --describe --zookeeper node1:2181,node2:2181,node3:2181 --topic test
Topic:test  PartitionCount:2  ReplicationFactor:1  Configs:
    Topic: test  Partition: 0  Leader: 1022  Replicas: 1022  Isr: 1022
    Topic: test  Partition: 1  Leader: 1020  Replicas: 1020  Isr: 1020
Here there are two partitions with one replica each. The JSON file:
{
  "version": 1,
  "partitions": [
    { "topic": "test", "partition": 0, "replicas": [ 1022, 1020 ] },
    { "topic": "test", "partition": 1, "replicas": [ 1022, 1020 ] }
  ]
}
Then run:
bin/kafka-reassign-partitions.sh --zookeeper node1:2181,node2:2181,node3:2181 --reassignment-json-file add.json --execute
Result:
bin/kafka-reassign-partitions.sh --zookeeper node1:2181,node2:2181,node3:2181 --reassignment-json-file add.json --execute
Current partition replica assignment
{"version":1,"partitions":[{"topic":"test","partition":1,"replicas":[1022,1020],"log_dirs":["any","any"]},{"topic":"test","partition":0,"replicas":[1022,1020],"log_dirs":["any","any"]}]}
Save this to use as the --reassignment-json-file option during rollback
Successfully started reassignment of partitions.
Check the topic details again:
bin/kafka-topics.sh --describe --zookeeper node1:2181,node2:2181,node3:2181 --topic test
Topic:test  PartitionCount:2  ReplicationFactor:2  Configs:
    Topic: test  Partition: 0  Leader: 1022  Replicas: 1022,1020  Isr: 1022,1020
    Topic: test  Partition: 1  Leader: 1020  Replicas: 1022,1020  Isr: 1020,1022
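For topics with many partitions, writing the reassignment file by hand gets tedious. The sketch below generates a file with the same structure as the JSON above, giving every partition the same replica list. The function name make_reassignment_json is an assumption of this example, and the broker ids must still be taken from the describe output.

```shell
make_reassignment_json() {
    # $1: topic, $2: number of partitions, $3: comma-separated broker ids
    printf '{"version":1,"partitions":['
    p=0
    while [ "$p" -lt "$2" ]; do
        if [ "$p" -gt 0 ]; then printf ','; fi
        printf '{"topic":"%s","partition":%d,"replicas":[%s]}' "$1" "$p" "$3"
        p=$((p + 1))
    done
    printf ']}\n'
}

# Usage:
#   make_reassignment_json test 2 1022,1020 > add.json
```

The first broker in each replica list is the preferred leader, so on larger clusters it may be worth rotating the order per partition rather than repeating the same list as this sketch does.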
Reference: http://kafka.apache.org/quickstart