Flume更新比较慢,而elasticsearch更新非常快所以当涉及更换elasticsearch版本时会出现不兼容问题。
apache-flume-1.6.0+elasticsearch1.5.1是可以完美结合的,这里将elasticsearch版本升级到6.3.2。
低版本elasticsearch和高版本elasticsearch连接方式完全不一样所以需要重写Sink。
下载源码flume-ng-sinks\flume-ng-elasticsearch-sink\ElasticSearchSink.java,查看人家的源码。
我直接起个项目重写了
POM
<dependencies>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>6.4.</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.elasticsearch.client/transport -->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>transport</artifactId>
<version>6.4.</version>
</dependency>
<!-- <dependency> <groupId>io.netty</groupId> <artifactId>netty-all</artifactId>
<version>4.1..Final</version> </dependency> -->
<!-- https://mvnrepository.com/artifact/org.apache.flume.flume-ng-sinks/flume-ng-elasticsearch-sink -->
<dependency>
<groupId>org.apache.flume.flume-ng-sinks</groupId>
<artifactId>flume-ng-elasticsearch-sink</artifactId>
<version>1.6.</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.code.gson/gson -->
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.8.</version>
</dependency> </dependencies>
重写的Sink类
package com.jachs.sink.elasticsearch; import org.apache.flume.Channel;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDeliveryException;
import org.apache.flume.Transaction;
import org.apache.flume.conf.Configurable;
import org.apache.flume.sink.AbstractSink;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient; import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.CLUSTER_NAME;
import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.INDEX_NAME; import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map; import static org.apache.flume.sink.elasticsearch.ElasticSearchSinkConstants.HOSTNAMES; public class ElasticSearchSink extends AbstractSink implements Configurable {
private String hostNames;
private String indexName;
private String clusterName;
static TransportClient client; public void configure(Context context) {
hostNames = context.getString(HOSTNAMES);
indexName = context.getString(INDEX_NAME);
clusterName = context.getString(CLUSTER_NAME);
} @Override
public void start() {
Settings settings = Settings.builder().put("cluster.name", clusterName).build();
try {
client = new PreBuiltTransportClient(settings).addTransportAddress(new TransportAddress(
InetAddress.getByName(hostNames.split(":")[]), Integer.parseInt(hostNames.split(":")[])));
} catch (UnknownHostException e) {
e.printStackTrace();
} } @Override
public void stop() {
super.stop();
} public Status process() throws EventDeliveryException {
Status status = null;
Channel ch = getChannel();
Transaction txn = ch.getTransaction();
txn.begin();
try {
Event event = ch.take();
Map<String, String> head = event.getHeaders();
Map<String, Object> map = new HashMap<String, Object>(); for (String key : head.keySet()) {
map.put("topic", key);
map.put("timestamp", head.get(key));
map.put("data", new String(event.getBody()));
} IndexRequestBuilder create = client.prepareIndex(indexName, "text").setSource(map);
IndexResponse response = create.execute().actionGet(); txn.commit();
status = Status.READY;
} catch (Throwable t) {
txn.rollback();
status = Status.BACKOFF;
if (t instanceof Error) {
throw (Error) t;
}
} finally {
txn.close();
}
return status;
}
}
mvn install -DskipTests
打包,然后将Flume下的flume-ng-kafka-sink.jar替换掉。
修改Flume配置文件将下面修改为自己的类位置
a1.sinks.elasticsearch.type=com.jachs.sink.elasticsearch.ElasticSearchSink
我这里使用的是FileBeat-kafka-flume-elasticsearch,所以是从kafka取数到elasticsearch,根据自己sources修改自己连接。然后将kafka和elasticsearch的jar包Copy到Flume下注意版本冲突保持JAR版本正确不要冲突。
官方参考
http://flume.apache.org/releases/content/1.9.0/FlumeDeveloperGuide.html#sink
http://flume.apache.org/releases/content/1.6.0/apidocs/index.html
Channel对象是管道,可以创建Transaction事务,采用回调方式将sources数据放进了Data,启动个Even事件,然后根据自己逻辑代码动态设置状态码最后返回状态码。