• Two ways to initialize an elasticsearch client, and the differences between matchQuery, termQuery, and multiMatchQuery


    Option 1: the TransportClient approach:

    public ESConfiguration()
        {
            if(EnvUtils.isOnlineEnv())
            {
                hostName = "xxxxx1";
                hostName2 = "xxxx2";
                hostName3 = "xxxx3";
                port = "9300";
                clusterName = "yyyy";
            }else {
                hostName = "vvvvv1";
                hostName2 = "vvvv2";
                hostName3 = "vvvv3";
                port = "9300";
                clusterName = "zzzz";
            }
            createTransportClient();
        }
    
        public void createTransportClient()
        {
            try {
                // Cluster settings: cluster name, sniffing, search thread pool size
                Settings esSetting = Settings.builder()
                        .put("cluster.name", clusterName)                            // name of the cluster to connect to
                        .put("client.transport.sniff", false)                        // sniffing disabled here; set true to auto-discover cluster nodes
                        .put("thread_pool.search.size", Integer.parseInt(poolSize))  // search thread pool size, e.g. 5
                        .build();
    
                client = new PreBuiltTransportClient(esSetting);
                // Register host and port for each node
                InetSocketTransportAddress inetSocketTransportAddress = new InetSocketTransportAddress(InetAddress.getByName(hostName), Integer.valueOf(port));
                InetSocketTransportAddress inetSocketTransportAddress2 = new InetSocketTransportAddress(InetAddress.getByName(hostName2), Integer.valueOf(port));
                InetSocketTransportAddress inetSocketTransportAddress3 = new InetSocketTransportAddress(InetAddress.getByName(hostName3), Integer.valueOf(port));
                client.addTransportAddresses(inetSocketTransportAddress)
                        .addTransportAddresses(inetSocketTransportAddress2)
                        .addTransportAddresses(inetSocketTransportAddress3);
    
            } catch (Exception e) {
                logger.error("elasticsearch TransportClient create error!!!", e);
            }
        }
    
        public TransportClient getInstance() {
    
            return client;
        }

    Option 2: RestHighLevelClient over HTTP

        /**
         * ES cluster addresses
         */
        private String servers = "xxxx1,xxxx2,xxxx3";
    
        /**
         * Port
         */
        private int port = 9301;
    
        private int size = 3;
    
        private String scheme = "http";
    
        private RestHighLevelClient restHighLevelClient;
    
    
    
        @PostConstruct
        public void init() {
            logger.info("init Es Client...");
            RestClientBuilder builder = getRestClientBuilder();
            restHighLevelClient = new RestHighLevelClient(builder);
            logger.info("init Es Client complete...");
        }
    
        public RestClientBuilder getRestClientBuilder() {
            String[] address = StringUtils.split(servers, ",");
            if (ArrayUtils.isNotEmpty(address) && address.length == size) {
                return RestClient.builder(new HttpHost(address[0], port, scheme), new HttpHost(address[1], port, scheme), new HttpHost(address[2], port, scheme));
            }
            return null;
        }
    
    
        public RestHighLevelClient getInstance() {
            if (restHighLevelClient == null) {
                init();
            }
            return restHighLevelClient;
        }

    Bulk-inserting documents with the RestHighLevelClient:

        public String executeBulkDocInsert(List<Map> jsonList)
        {
            try {
    
                BulkRequest request = new BulkRequest();
    
                for(Map crashInfo : jsonList) {
                    // NOTE: the hard-coded document id ("11") makes every iteration overwrite the same document
                    IndexRequest indexRequest = new IndexRequest("crash_bulk_index_2020-01-01", "crash", "11").source(crashInfo);
                  //UpdateRequest updateRequest = new UpdateRequest("twitter", "_doc", "11").doc(new IndexRequest("crash_bulk_index_2020-01-02", "_type", "11").source(jsonStr));
                  request.add(indexRequest);
                  //request.add(updateRequest);
                }
    
                request.timeout(TimeValue.timeValueMinutes(2));
                request.setRefreshPolicy(WriteRequest.RefreshPolicy.WAIT_UNTIL);
    
                BulkResponse bulkResponse = restHighLevelClient.bulk(request, RequestOptions.DEFAULT);
    
                for (BulkItemResponse bulkItemResponse : bulkResponse)
                {
                    if (bulkItemResponse.getFailure() != null) {
                        BulkItemResponse.Failure failure = bulkItemResponse.getFailure();
                        System.out.println(failure.getCause());
                        if(failure.getStatus() == RestStatus.BAD_REQUEST) {
                            System.out.println("id=" + bulkItemResponse.getId() + " was a malformed request!");
                            continue;
                        }
                    }
    
                    DocWriteResponse itemResponse = bulkItemResponse.getResponse();
    
                    if (bulkItemResponse.getOpType() == DocWriteRequest.OpType.INDEX || bulkItemResponse.getOpType() == DocWriteRequest.OpType.CREATE) {
                        if(bulkItemResponse.getFailure() != null && bulkItemResponse.getFailure().getStatus() == RestStatus.CONFLICT) {
                            System.out.println("id=" + bulkItemResponse.getId() + " conflicts with an existing document");
                            continue;
                        }
                        IndexResponse indexResponse = (IndexResponse) itemResponse;
                        System.out.println("id=" + indexResponse.getId() + " document created");
                        System.out.println("id=" + indexResponse.getId() + " operation result: " + itemResponse.getResult());
                    } else if (bulkItemResponse.getOpType() == DocWriteRequest.OpType.UPDATE) {
                        UpdateResponse updateResponse = (UpdateResponse) itemResponse;
                        System.out.println("id=" + updateResponse.getId() + " document updated");
                        System.out.println("id=" + updateResponse.getId() + " document source: " + updateResponse.getGetResult().sourceAsString());
                    } else if (bulkItemResponse.getOpType() == DocWriteRequest.OpType.DELETE) {
                        DeleteResponse deleteResponse = (DeleteResponse) itemResponse;
                        if (deleteResponse.getResult() == DocWriteResponse.Result.NOT_FOUND) {
                            System.out.println("id=" + deleteResponse.getId() + " document not found, nothing deleted!");
                        }else {
                            System.out.println("id=" + deleteResponse.getId() + " document deleted");
                        }
                    }
                }
    
            } catch (Exception e) {
                e.printStackTrace();
                return "bulk insert into index failed";
            }
    
            return null;
        }

    ************************************************************ Differences between matchQuery, termQuery, and multiMatchQuery ************************************************************

    Difference 1: matchPhraseQuery vs matchQuery. With matchQuery, the search text is run through the analyzer and split into terms before matching. With matchPhraseQuery, the search text is also analyzed, but it is matched as a phrase: every term must appear in the field, in the same order and at consecutive positions. If the indexed field value contains no such phrase, the query returns no results.
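    The same contrast is visible in the raw query DSL. A sketch, using a hypothetical text field named title:

    ```json
    { "query": { "match":        { "title": "quick brown fox" } } }
    { "query": { "match_phrase": { "title": "quick brown fox" } } }
    ```

    The match query returns any document whose title contains at least one of the terms quick, brown, or fox; the match_phrase query returns only documents in which all three terms appear adjacent and in this exact order.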

    Difference 2:

    matchQuery: the search text is analyzed into terms and matched against the target field; if any of the terms matches, the document is returned.

    termQuery: the search text is not analyzed; it is matched against the target field as a single exact term, and only an exact match returns the document.
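    The difference can be sketched without a cluster. The snippet below is a conceptual simulation, not the actual ES implementation: a crude stand-in for the standard analyzer (lowercase plus whitespace split), then match semantics (query is analyzed, any term overlap is a hit) versus term semantics (the raw query string must equal one indexed term exactly):

    ```java
    import java.util.*;

    public class MatchVsTerm {
        // Crude stand-in for the standard analyzer: lowercase + split on whitespace
        static List<String> analyze(String text) {
            return Arrays.asList(text.toLowerCase().split("\\s+"));
        }

        // match semantics: the query is analyzed; any term overlap is a hit
        static boolean matchQuery(String fieldValue, String query) {
            Set<String> indexed = new HashSet<>(analyze(fieldValue));
            for (String term : analyze(query)) {
                if (indexed.contains(term)) return true;
            }
            return false;
        }

        // term semantics: the query is NOT analyzed; it must equal one indexed term exactly
        static boolean termQuery(String fieldValue, String query) {
            return new HashSet<>(analyze(fieldValue)).contains(query);
        }

        public static void main(String[] args) {
            String doc = "Quick Brown Fox";
            System.out.println(matchQuery(doc, "quick dog"));       // true: "quick" overlaps
            System.out.println(termQuery(doc, "Quick Brown Fox"));  // false: no single term equals the whole string
            System.out.println(termQuery(doc, "quick"));            // true: exact term match
        }
    }
    ```

    This is why a term query against an analyzed text field so often returns nothing: the whole query string is compared against individual analyzed terms.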

    matchQuery query template:

        public BootstrapTablePaginationVo<String> searchMsgByParam(BasicCrashInfoSearchParam param) throws Exception {
    
            /** Validate input parameters **/
            String index = param.getIndex();
            String type = param.getType();
            String field = param.getField();
            String keyWord = param.getKeyWord();
    
            if(index == null || field == null || keyWord == null)
            {
                LOG.info("index, field or keyword is null, cannot run the query!");
                return null;
            }
    
    
            /** Check the client and the index before querying **/
            if(client == null)
            {
                LOG.info("client is null, initialization failed, cannot run the query!");
                return null;
            }
    
            // Verify the index exists
            if (!isIndexExist(index)) {
                return null;
            }
    
            // Build and run the query
            BootstrapTablePaginationVo<String> vo = new BootstrapTablePaginationVo<String>();
    
            // Response rows
            List<String> responseStrList = new ArrayList<String>();
            MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery(field, keyWord);
            SearchSourceBuilder sourceBuilder = SearchSourceBuilder.searchSource();
            sourceBuilder.query(matchQueryBuilder);
            // Deduplication field
            if (param.getDistictField() != null) {
                // Collapse results on the deduplication field
                CollapseBuilder cb = new CollapseBuilder(param.getDistictField());
                sourceBuilder.collapse(cb);
    
                // Distinct count over the deduplication field (only valid when the field is set)
                CardinalityAggregationBuilder acb = AggregationBuilders.cardinality("count_id").field(param.getDistictField());
                sourceBuilder.aggregation(acb);
            }
            sourceBuilder.from(param.getOffset()).size(param.getLimit());
    
            SearchRequest searchRequest = new SearchRequest(index).source(sourceBuilder);
            if(StringUtils.isNotBlank(type)){
                searchRequest.types(type);
            }
            // Collect hits
            SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
            SearchHits shList = response.getHits();
            for (SearchHit searchHit : shList) {
                responseStrList.add(searchHit.getSourceAsString());
            }
            vo.setRows(responseStrList);
    
            // Totals
            int count = 0;
            if (response.getAggregations() != null) {
                NumericMetricsAggregation.SingleValue responseAgg = response.getAggregations().get("count_id");
                if (responseAgg != null) {
                    double value = responseAgg.value();
                    count = getInt(value);
                }
            }
            vo.setTotal(count);
    
            return vo;
        }

    termQuery multi-condition query template:

       private Map<Long, Long> getExternalTagCountByRiskType(String id,
                                                     long startTime,
                                                     long endTime,
                                                     List<Long> tagIds,
                                                     UserStatEnum field){
            // Build the query
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
            BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
            boolQueryBuilder.must(termQuery("id", StringUtils.lowerCase(id)));
            boolQueryBuilder.must(rangeQuery("time").gte(startTime).lte(endTime));
            boolQueryBuilder.must(termsQuery("type", tagIds));
            // Aggregations only, no hits needed
            sourceBuilder.size(0);
    
    
            // Build the aggregation
            AggregationBuilder dateAggBuilder = AggregationBuilders.terms(groupByExternalTag)
                    .field("type").order(Terms.Order.count(false)).size(1000)
                    .minDocCount(0);
            String date = LocalDate.fromDateFields(new Date(startTime)).toString();
    
            Map<Long, Long> result = Maps.newHashMap();
            // Exact counts when the range is within one day
            if(endTime - startTime <= DAY){
                sourceBuilder.query(boolQueryBuilder);
                sourceBuilder.aggregation(dateAggBuilder);
    
                UserStatEnum intervalEnum = UserStatEnum.DAILY;
                SearchResponse response = esClientService.getAbnormalUserSearchResponse(sourceBuilder, field, intervalEnum, date, appId);
                Terms agg = response.getAggregations().get(groupByExternalTag);
                for (Terms.Bucket entry : agg.getBuckets()) {
                    result.put((long)entry.getKey(), entry.getDocCount());
                }
            } else {
                AggregationBuilder cardinalityAggBuilder = AggregationBuilders.cardinality("total")
                        .field(field.getDesc() + ".keyword").precisionThreshold(10000);
                dateAggBuilder.subAggregation(cardinalityAggBuilder);
                sourceBuilder.query(boolQueryBuilder);
                sourceBuilder.aggregation(dateAggBuilder);
    
                UserStatEnum intervalEnum = UserStatEnum.DAILY;
                SearchResponse response = esClientService.getAbnormalUserSearchResponse(sourceBuilder, field, intervalEnum, date, appId);
                Terms agg = response.getAggregations().get(groupByExternalTag);
                for (Terms.Bucket entry : agg.getBuckets()) {
                    Cardinality cardinality = entry.getAggregations().get("total");
                    result.put((long)entry.getKey(), cardinality.getValue());
                }
            }
    
            return result;
        }

    matchPhraseQuery multi-condition query template:

      /**
         * Multi-condition document search
         * @param param search parameters
         * @return
         * @throws Exception
         */
        public BootstrapTablePaginationVo<String> searchMsgByMultiParam(BasicCrashInfoSearchParam param) throws Exception {
            // Response rows
            List<String> responseStrList = new ArrayList<String>();
    
    
            /** Validate input parameters **/
            String index = param.getIndex();                           // index
            String type = param.getType();                             // type
            HashMap<String, String> map = param.getMultkeyWord();      // exact-match conditions
            String startTime = param.getStartTime();                   // range query start time
            String endTime = param.getEndTime();                       // range query end time
            String sortWord = param.getSortWord();                     // sort key
    
    
            if(index == null || map == null)
            {
                LOG.info("index or map is null, cannot run the query!");
                return null;
            }
    
            /** Check the client and the index before querying **/
            if(client == null)
            {
                LOG.info("client is null, initialization failed, cannot run the query!");
                return null;
            }
            // Verify the (aliased) index exists
            if (!isIndexExist(index)) {
                return null;
            }
    
            /** Build the query: exact conditions first, then the time range, then sorting **/
            BootstrapTablePaginationVo<String> vo = new BootstrapTablePaginationVo<String>();
    
            // Add each exact condition; must means AND
            BoolQueryBuilder qb = QueryBuilders.boolQuery();
            for(Map.Entry<String, String> entry : map.entrySet())
            {
                String field = entry.getKey();
                String keyWord = entry.getValue();
                MatchPhraseQueryBuilder mpq = QueryBuilders.matchPhraseQuery(field, keyWord);
                qb.must(mpq);                                                                  // must = AND, should = OR
            }
            }
    
            // Time range condition
            if(startTime != null && endTime != null)
            {
                RangeQueryBuilder rangequerybuilder = QueryBuilders.rangeQuery("xxxxxx").from(startTime).to(endTime);
                qb.must(rangequerybuilder);
            }
    
            // Build the request: index and type
            SearchRequest searchRequest = new SearchRequest(index);
            if(StringUtils.isNotBlank(type)) {
                searchRequest.types(type);
            }
    
            SearchSourceBuilder sourceBuilder = SearchSourceBuilder.searchSource();
    
    
            // Aggregation (distinct count) parameters
            CardinalityAggregationBuilder acb = null;
            if(param.getDistictField() != null)
            {
                acb = AggregationBuilders.cardinality("count_id").field(param.getDistictField()).precisionThreshold(10000);
            }
    
    
            SearchResponse response = null;
            // Sort by the sort key if one was given
            if(sortWord == null)
            {
                if(param.getDistictField() != null)
                {
                    sourceBuilder.query(qb).aggregation(acb).from(param.getOffset()).size(param.getLimit()).explain(true);
                }else {
                    sourceBuilder.query(qb).from(param.getOffset()).size(param.getLimit()).explain(true);
                }
    
            }else {
    
                if(param.getDistictField() != null) {
                    sourceBuilder.query(qb).aggregation(acb).from(param.getOffset()).size(param.getLimit())
                            .sort(sortWord, SortOrder.DESC)
                            .explain(true);
                }else {
                    sourceBuilder.query(qb).from(param.getOffset()).size(param.getLimit())
                            .sort(sortWord, SortOrder.DESC)
                            .explain(true);
                }
            }
    
            response = client.search(searchRequest.source(sourceBuilder), RequestOptions.DEFAULT);
    
            SearchHits shList = response.getHits();
    
            // Collect hits
            for (SearchHit searchHit : shList) {
                responseStrList.add(searchHit.getSourceAsString());
            }
            vo.setRows(responseStrList);
    
            // Totals
            if(param.getDistictField() != null)
            {
                NumericMetricsAggregation.SingleValue responseAgg = response.getAggregations().get("count_id");   // aggregation result
                int count = 0;
                if (responseAgg != null) {
                    double value = responseAgg.value();
                    count = getInt(value);
                }
                vo.setTotal(count);
            }
    
            return vo;
        }

    Queries with and without .keyword: what is the difference, and why does a query sometimes return nothing?

    1. ES 5.0 and later removed the string type, splitting it into two types: text and keyword. The difference is that text fields are analyzed (tokenized), while keyword fields are not.
    2. If you have not predefined a mapping for your index fields (e.g. via an index template), ES falls back to Dynamic Mapping and infers each field's type from the value in the incoming document. For example, if the field price arrives with the value 12, price is mapped as long; if addr arrives as "192.168.0.1", addr is mapped as ip. Ordinary strings that do not look like an ip or a date are handled differently: ES maps them as text, but, to preserve the ability to run exact-match queries and aggregations on them, it also adds a keyword sub-field under the field's fields property in the _mapping. For example, when ES first sees a new field "foobar": "some string", it produces the following dynamic mapping:

    {
        "foobar": {
            "type": "text",
            "fields": {
                "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                }
            }
        }
    }

    In later queries, foobar queries the field as text while foobar.keyword queries it as keyword: the former analyzes the query text before matching, the latter matches the unanalyzed value exactly.
    3. ES's term query does exact matching, not analyzed matching, so a term query against a text field will usually find nothing (unless the field's value happens to be unchanged by the analyzer). In that case, use foobar.keyword to run an exact match against the keyword version of the field.
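    Concretely, against the dynamically mapped foobar field from the example above:

    ```json
    { "query": { "term": { "foobar":         "some string" } } }
    { "query": { "term": { "foobar.keyword": "some string" } } }
    ```

    The first usually finds nothing: foobar is text, so "some string" was indexed as the two terms some and string, and neither equals the whole query string. The second matches the unanalyzed keyword value exactly.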

  • Original post: https://www.cnblogs.com/gxyandwmm/p/12101758.html