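Kafka can serve as any Flume component: source, channel, or sink.
Kafka is a clustered messaging middleware (messages can be re-consumed, high throughput, temporary buffering); Flume handles collection and delivery (landing data in storage, reading source files, monitoring directories, real-time collection).
Integrating Kafka with Flume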
1. Kafka as a source
Flume pulls data from Kafka, so the agent plays the consumer role. The source type is KafkaSource.
# batch settings
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 5000
a1.sources.r1.batchDurationMillis = 2000
# broker
a1.sources.r1.kafka.bootstrap.servers = s202:9092
# topic name
a1.sources.r1.kafka.topics = topic1
# consumer group name
a1.sources.r1.kafka.consumer.group.id = g1
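The Kafka source settings above cover only the source itself. A minimal sketch of a complete agent around it follows, assuming a memory channel and a logger sink; neither is given in the notes for this case, and both are borrowed from the other examples here. The capacity values are illustrative assumptions. They are raised because the memory channel's default transaction capacity is smaller than the 5000-event source batch.

```properties
# Sketch only: the channel and sink choices below are assumptions.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Kafka source (consumer role), same settings as in the notes
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.batchSize = 5000
a1.sources.r1.batchDurationMillis = 2000
a1.sources.r1.kafka.bootstrap.servers = s202:9092
a1.sources.r1.kafka.topics = topic1
a1.sources.r1.kafka.consumer.group.id = g1

# Assumed sink: print each event to the agent log
a1.sinks.k1.type = logger

# Assumed channel: in-memory buffer, sized to hold a full source batch
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 5000

# Wire source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```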
2. Kafka as a sink
Flume pushes messages to Kafka, so the agent plays the producer role.
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 8888
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = topic1
a1.sinks.k1.kafka.bootstrap.servers = s202:9092
a1.sinks.k1.kafka.flumeBatchSize = 20
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.kafka.producer.linger.ms = 1
a1.channels.c1.type = memory
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
3. Kafka as a channel
The channel acts as both producer and consumer: it writes events from the source to the topic and reads them back for the sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 8888
a1.sinks.k1.type = logger
a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c1.parseAsFlumeEvent = false
a1.channels.c1.kafka.bootstrap.servers = s202:9092
a1.channels.c1.kafka.topic = topic1
a1.channels.c1.kafka.consumer.group.id = g5
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
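Because the Kafka channel is itself a consumer, it can also be used without any Flume source, draining a topic that other producers write to. The sketch below is one possible variant of the channel example, not a configuration from the notes. It assumes topic1 is filled by non-Flume producers, which is the case where parseAsFlumeEvent = false applies, since the messages are then raw payloads and not Avro-encoded Flume events.

```properties
# Sketch only: an agent with no Flume source.
# Assumption: other (non-Flume) producers write to topic1.
a1.channels = c1
a1.sinks = k1

a1.channels.c1.type = org.apache.flume.channel.kafka.KafkaChannel
# false: treat each Kafka message as a raw event body
a1.channels.c1.parseAsFlumeEvent = false
a1.channels.c1.kafka.bootstrap.servers = s202:9092
a1.channels.c1.kafka.topic = topic1
a1.channels.c1.kafka.consumer.group.id = g5

# The sink reads straight from the Kafka-backed channel
a1.sinks.k1.type = logger
a1.sinks.k1.channel = c1
```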