Elasticsearch 2.3 Java API

Java API 2.3

1. Preface

This section describes the Java API that Elasticsearch provides. All Elasticsearch operations are executed using a Client object. All operations are completely asynchronous in nature: each one either accepts a listener or returns a future.

Additionally, operations on a client may be accumulated and executed in bulk (see the Bulk API).

Note that all the APIs are exposed through the Java API (in fact, Elasticsearch uses the Java API internally to execute them).
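The two asynchronous styles mentioned above (accepting a listener, or returning a future) can be illustrated without Elasticsearch itself. The following is a minimal sketch using only the JDK. The names `AsyncStyles`, `Listener`, and `fetch` are hypothetical and are not part of the Elasticsearch API; the sketch only shows the shape of the two calling conventions.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for an asynchronous client; not the Elasticsearch API.
class AsyncStyles {

    // Callback interface modeled on the "accepts a listener" style.
    interface Listener<T> {
        void onResponse(T response);
        void onFailure(Throwable error);
    }

    // Future style: the call returns immediately and the caller
    // decides when (or whether) to block on the result.
    static CompletableFuture<String> fetch(String id) {
        return CompletableFuture.supplyAsync(() -> "doc-" + id);
    }

    // Listener style: the caller passes a callback that is invoked
    // once the operation has completed or failed.
    static void fetch(String id, Listener<String> listener) {
        fetch(id).whenComplete((response, error) -> {
            if (error != null) {
                listener.onFailure(error);
            } else {
                listener.onResponse(response);
            }
        });
    }
}
```

With the future style the caller blocks only when it needs the value (here via `join()`); with the listener style the caller never blocks, and the result is delivered to the callback.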

2. Maven Repository

Elasticsearch is hosted on Maven Central.

For example, you can define the latest version in your pom.xml file, either through an es.version property that you define elsewhere in the POM or by pinning a specific version such as 2.3.5:

    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>${es.version}</version>
    </dependency>

    <dependency>
        <groupId>org.elasticsearch</groupId>
        <artifactId>elasticsearch</artifactId>
        <version>2.3.5</version>
    </dependency>

3. Dealing with JAR dependency conflicts

If you want to use Elasticsearch in your Java application, you may have to deal with version conflicts with third party dependencies like Guava and Joda. For instance, perhaps Elasticsearch uses Joda 2.8, while your code uses Joda 2.1.

You have two choices:

The simplest solution is to upgrade. Newer module versions are likely to have fixed old bugs. The further behind you fall, the harder it will be to upgrade later. Of course, it is possible that you are using a third party dependency that in turn depends on an outdated version of a package, which prevents you from upgrading.
The second option is to relocate the troublesome dependencies and to shade them either with your own application or with Elasticsearch and any plugins needed by the Elasticsearch client.

The "To shade or not to shade" blog post describes all the steps for doing so:

https://www.elastic.co/blog/to-shade-or-not-to-shade

4. Embedding jar with dependencies

If you want to create a single jar containing your application and all its dependencies, you should not use maven-assembly-plugin, because it cannot deal with the META-INF/services structure that the Lucene jars require.

Instead, you can use maven-shade-plugin and configure it as follows:

    <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <version>2.4.1</version>
        <executions>
            <execution>
                <phase>package</phase>
                <goals><goal>shade</goal></goals>
                <configuration>
                    <transformers>
                        <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                    </transformers>
                </configuration>
            </execution>
        </executions>
    </plugin>

Note that if you have a main class that should be called automatically when running java -jar yourjar.jar, just add it to the transformers:

    <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
        <mainClass>org.elasticsearch.demo.Generate</mainClass>
    </transformer>

5. Deploying in JBoss EAP6 module

Elasticsearch and Lucene classes need to be in the same JBoss module.

You should define a module.xml file like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <module xmlns="urn:jboss:module:1.1" name="org.elasticsearch">
        <resources>
            <!-- Elasticsearch -->
            <resource-root path="elasticsearch-2.0.0.jar"/>
            <!-- Lucene -->
            <resource-root path="lucene-core-5.1.0.jar"/>
            <resource-root path="lucene-analyzers-common-5.1.0.jar"/>
            <resource-root path="lucene-queries-5.1.0.jar"/>
            <resource-root path="lucene-memory-5.1.0.jar"/>
            <resource-root path="lucene-highlighter-5.1.0.jar"/>
            <resource-root path="lucene-queryparser-5.1.0.jar"/>
            <resource-root path="lucene-sandbox-5.1.0.jar"/>
            <resource-root path="lucene-suggest-5.1.0.jar"/>
            <resource-root path="lucene-misc-5.1.0.jar"/>
            <resource-root path="lucene-join-5.1.0.jar"/>
            <resource-root path="lucene-grouping-5.1.0.jar"/>
            <resource-root path="lucene-spatial-5.1.0.jar"/>
            <resource-root path="lucene-expressions-5.1.0.jar"/>
            <!-- Insert other resources here -->
        </resources>
        <dependencies>
            <module name="sun.jdk" export="true">
                <imports>
                    <include path="sun/misc/Unsafe"/>
                </imports>
            </module>
            <module name="org.apache.log4j"/>
            <module name="org.apache.commons.logging"/>
            <module name="javax.api"/>
        </dependencies>
    </module>

6. Client

You can use the Java client in multiple ways:

Perform standard index, get, delete and search operations on an existing cluster
Perform administrative tasks on a running cluster

Obtaining an elasticsearch Client is simple. The most common way to get a client is by creating a TransportClient that connects to a cluster.

The client must have the same major version (e.g. 2.x, or 5.x) as the nodes in the cluster. Clients may connect to clusters that have a different minor version (e.g. 2.3.x), but new functionality may then not be supported. Ideally, the client should have the same version as the cluster.
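The version rule above can be checked before a client is created. The following is a minimal JDK-only sketch, assuming version strings of the form `major.minor.patch`. `VersionCheck` and its methods are hypothetical helpers, not part of the Elasticsearch API.

```java
// Hypothetical helper illustrating the "same major version" rule.
class VersionCheck {

    // Extracts the major component, e.g. 2 from "2.3.5".
    static int major(String version) {
        return Integer.parseInt(version.split("\\.")[0]);
    }

    // True when client and node share a major version. A differing minor
    // version is still allowed, but newer features may then be unavailable.
    static boolean sameMajor(String clientVersion, String nodeVersion) {
        return major(clientVersion) == major(nodeVersion);
    }
}
```

For example, `VersionCheck.sameMajor("2.3.5", "2.4.0")` is true, while `VersionCheck.sameMajor("2.3.5", "5.0.0")` is false.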

6.1 Transport Client

The TransportClient connects remotely to an Elasticsearch cluster using the transport module. It does not join the cluster, but simply gets one or more initial transport addresses and communicates with them in round-robin fashion on each action (though most actions will probably be "two hop" operations).

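    import java.net.InetAddress;
    import org.elasticsearch.client.Client;
    import org.elasticsearch.client.transport.TransportClient;
    import org.elasticsearch.common.transport.InetSocketTransportAddress;

    // on startup: build the client, then register the initial transport addresses
    Client client = TransportClient.builder().build()
        .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("host1"), 9300))
        .addTransportAddress(new InetSocketTransportAddress(InetAddress.getByName("host2"), 9300));

    // on shutdown
    client.close();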
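The round-robin behaviour described above can be pictured with a small JDK-only sketch. `RoundRobinNodes` is a hypothetical class written for illustration; it is not how the TransportClient is implemented internally, only a model of cycling through the configured addresses on each action.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of round-robin selection over transport addresses.
class RoundRobinNodes {
    private final List<String> addresses;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinNodes(List<String> addresses) {
        if (addresses.isEmpty()) {
            throw new IllegalArgumentException("at least one address is required");
        }
        this.addresses = new ArrayList<>(addresses);
    }

    // Returns the next address in order, wrapping around at the end.
    // floorMod keeps the index non-negative even if the counter overflows.
    String next() {
        int index = Math.floorMod(counter.getAndIncrement(), addresses.size());
        return addresses.get(index);
    }
}
```

With two addresses registered, successive calls to `next()` alternate between them, which is why each action may land on a different node.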

Note that you have to set the cluster name if it is anything other than the default, "elasticsearch":

    import org.elasticsearch.common.settings.Settings;

    // the cluster name must match the one configured on the nodes
    Settings settings = Settings.settingsBuilder()
        .put("cluster.name", "myClusterName").build();
    Client client = TransportClient.builder().settings(settings).build();
    // Add transport addresses and do something with the client...

The Transport client comes with a cluster sniffing feature which allows it to dynamically add new hosts and remove old ones. When sniffing is enabled, the transport client will connect to the nodes in its internal node list, which is built via calls to addTransportAddress. After this, the client will call the internal cluster state API on those nodes to discover available data nodes. The internal node list of the client will be replaced with those data nodes only. This list is refreshed every five seconds by default. Note that the IP addresses the sniffer connects to are the ones declared as the publish address in those nodes' Elasticsearch config.

Keep in mind that the list might not include the original node the client connected to if that node is not a data node. If, for instance, you initially connect to a master node, then after sniffing no further requests will go to that master node; they will go to the data nodes instead. The transport client excludes non-data nodes in order to avoid sending search traffic to master-only nodes.
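The effect of sniffing on the node list can be modeled in a few lines of plain Java. This is a minimal sketch under stated assumptions: `SniffSketch`, `Node`, and `dataNodeAddresses` are hypothetical names, and the cluster state is represented as a simple list rather than obtained from the real cluster state API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of how sniffing rebuilds the client's node list.
class SniffSketch {

    // Simplified view of a node as reported by the cluster state.
    static final class Node {
        final String publishAddress;
        final boolean dataNode;

        Node(String publishAddress, boolean dataNode) {
            this.publishAddress = publishAddress;
            this.dataNode = dataNode;
        }
    }

    // Keeps only the publish addresses of data nodes. The node that was
    // contacted first is dropped too if it is not a data node.
    static List<String> dataNodeAddresses(List<Node> clusterState) {
        List<String> result = new ArrayList<>();
        for (Node node : clusterState) {
            if (node.dataNode) {
                result.add(node.publishAddress);
            }
        }
        return result;
    }
}
```

If the client first connects to a master-only node at one address and the cluster also has two data nodes, only the two data node addresses remain in the list after sniffing.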

To enable sniffing, set client.transport.sniff to true:

    Settings settings = Settings.settingsBuilder()
        .put("client.transport.sniff", true).build();
    TransportClient client = TransportClient.builder().settings(settings).build();

Other transport client level settings include:

Parameter	Description
`client.transport.ignore_cluster_name`	Set to `true` to ignore cluster name validation of connected nodes (since 0.19.4).
`client.transport.ping_timeout`	The time to wait for a ping response from a node. Defaults to `5s`.
`client.transport.nodes_sampler_interval`	How often to sample / ping the nodes listed and connected. Defaults to `5s`.


6.2 Connecting a Client to a Client Node

You can start a Client Node locally and then simply create a TransportClient in your application that connects to this Client Node.

This way, the client node will be able to load whatever plugins you need (discovery plugins, for example).

7. Document APIs

This section describes the following CRUD APIs:

Single document APIs

Multi-document APIs

All CRUD APIs are single-index APIs. The index parameter accepts a single index name, or an alias that points to a single index.

7.1 Index API

7.2 Get API

7.3 Delete API

7.4 Update API

7.5 Multi Get API

7.6 Bulk API

7.7 Using Bulk Processor

8. Search API

The search API allows you to execute a search query and get back the search hits that match it. It can be executed across one or more indices and across one or more types. The query can be provided using the query Java API. The body of the search request is built using the SearchSourceBuilder. Here is an example:

    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.action.search.SearchType;
    import org.elasticsearch.index.query.QueryBuilders;

    SearchResponse response = client.prepareSearch("index1", "index2")
        .setTypes("type1", "type2")
        .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
        .setQuery(QueryBuilders.termQuery("multi", "test"))              // Query
        .setPostFilter(QueryBuilders.rangeQuery("age").from(12).to(18)) // Filter
        .setFrom(0).setSize(60).setExplain(true)
        .execute()
        .actionGet();

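Note that all parameters are optional. The smallest search call you can write is `client.prepareSearch().execute().actionGet()`, which runs a match-all query on the whole cluster with all default options.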

This article comes from the WeChat official account 刍荛采葑菲; please credit the source when reposting.

Original article: https://www.cnblogs.com/churao/p/5856152.html