Flume更新比较慢,而elasticsearch更新非常快所以当涉及更换elasticsearch版本时会出现不兼容问题。
apache-flume-1.6.0+elasticsearch1.5.1是可以完美结合的,这里将elasticsearch版本升级到6.3.2。
低版本elasticsearch和高版本elasticsearch连接方式完全不一样所以需要重写Sink。
下载源码flume-ng-sinksflume-ng-elasticsearch-sinkElasticSearchSink.java,查看人家的源码。
我直接起个项目重写了
POM
<dependencies> <dependency> <groupId>junit</groupId> <artifactId>junit</artifactId> <version>3.8.1</version> <scope>test</scope> </dependency> <!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch --> <dependency> <groupId>org.elasticsearch</groupId> <artifactId>elasticsearch</artifactId> <version>6.4.3</version> </dependency> <!-- https://mvnrepository.com/artifact/org.elasticsearch.client/transport --> <dependency> <groupId>org.elasticsearch.client</groupId> <artifactId>transport</artifactId> <version>6.4.3</version> </dependency> <!-- <dependency> <groupId>io.netty</groupId> <artifactId>netty-all</artifactId> <version>4.1.32.Final</version> </dependency> --> <!-- https://mvnrepository.com/artifact/org.apache.flume.flume-ng-sinks/flume-ng-elasticsearch-sink --> <dependency> <groupId>org.apache.flume.flume-ng-sinks</groupId> <artifactId>flume-ng-elasticsearch-sink</artifactId> <version>1.6.0</version> </dependency> <!-- https://mvnrepository.com/artifact/com.google.code.gson/gson --> <dependency> <groupId>com.google.code.gson</groupId> <artifactId>gson</artifactId> <version>2.8.5</version> </dependency> </dependencies>
重写的Sink类
package com.jachs.sink.elasticsearch; import org.apache.flume.Channel; import org.apache.flume.Context; import org.apache.flume.Event; import org.apache.flume.EventDeliveryException; import org.apache.flume.Transaction; import org.apache.flume.conf.Configurable; import org.apache.flume.sink.AbstractSink; import org.elasticsearch.action.index.IndexRequestBuilder; import org.elasticsearch.action.index.IndexResponse; import org.elasticsearch.client.transport.TransportClient; import org.elasticsearch.common.settings.Settings; import org.elasticsearch.common.transport.TransportAddress; import org.elasticsearch.transport.client.PreBuiltTransportClient; import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLUSTER_NAME; import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME; import java.net.InetAddress; import java.net.UnknownHostException; import java.util.HashMap; import java.util.Map; import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.HOSTNAMES; public class ElasticSearchSink extends AbstractSink implements Configurable { private String hostNames; private String indexName; private String clusterName; static TransportClient client; public void configure(Context context) { hostNames = context.getString(HOSTNAMES); indexName = context.getString(INDEX_NAME); clusterName = context.getString(CLUSTER_NAME); } @Override public void start() { Settings settings = Settings.builder().put("cluster.name", clusterName).build(); try { client = new PreBuiltTransportClient(settings).addTransportAddress(new TransportAddress( InetAddress.getByName(hostNames.split(":")[0]), Integer.parseInt(hostNames.split(":")[1]))); } catch (UnknownHostException e) { e.printStackTrace(); } } @Override public void stop() { super.stop(); } public Status process() throws EventDeliveryException { Status status = null; Channel ch = getChannel(); Transaction txn = ch.getTransaction(); txn.begin(); try { Event event = ch.take(); Map<String, String> head = event.getHeaders(); Map<String, Object> map = new HashMap<String, Object>(); for (String key : head.keySet()) { map.put("topic", key); map.put("timestamp", head.get(key)); map.put("data", new String(event.getBody())); } IndexRequestBuilder create = client.prepareIndex(indexName, "text").setSource(map); IndexResponse response = create.execute().actionGet(); txn.commit(); status = Status.READY; } catch (Throwable t) { txn.rollback(); status = Status.BACKOFF; if (t instanceof Error) { throw (Error) t; } } finally { txn.close(); } return status; } }
mvn install -DskipTests
打包,然后将Flume下的flume-ng-kafka-sink.jar替换掉。
修改Flume配置文件将下面修改为自己的类位置
a1.sinks.elasticsearch.type=com.jachs.sink.elasticsearch.ElasticSearchSink
我这里使用的是FileBeat-kafka-flume-elasticsearch,所以是从kafka取数到elasticsearch,根据自己sources修改自己连接。然后将kafka和elasticsearch的jar包Copy到Flume下注意版本冲突保持JAR版本正确不要冲突。
官方参考
http://flume.apache.org/releases/content/1.9.0/FlumeDeveloperGuide.html#sink
http://flume.apache.org/releases/content/1.6.0/apidocs/index.html
Channel对象是管道,可以创建Transaction事务,采用回调方式将sources数据放进了Data,启动个Even事件,然后根据自己逻辑代码动态设置状态码最后返回状态码。