• Understanding facet and facet.pivot in Solr (two articles merged and kept for reference)


    Facet ['fæsɪt] is hard to translate and is best understood through examples. Solr's creator Yonik Seeley also offers more direct names for it: Guided Navigation and Parametric Search.

    [image: faceted search example]

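    The image above is a fairly straightforward Faceted Search example: brand (品牌), product features (产品特性) and seller (卖家) are each a Facet. Individual brands such as Apple and Lenovo are Facet values, also called Constraints, and the number attached to each Facet value is its Facet count, or Constraint count.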

    2. Using Facets

    q = 超级本              // "ultrabook"
    facet = true 
    facet.field = 产品特性  // product features
    facet.field = 品牌      // brand
    facet.field = 卖家      // seller

    http://…/select?q=超级本&facet=true&wt=json

    &facet.field=品牌&facet.field=产品特性&facet.field=卖家
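    The request above is an ordinary query string in which facet.field simply repeats once per field. As a minimal sketch (the class and method names are made up for illustration, and the ASCII field names stand in for 品牌 and 卖家), the parameters could be assembled like this:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr or SolrJ.
class FacetQueryString {
    // URL-encode one parameter value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Build "q=...&facet=true&wt=json" followed by one facet.field per field.
    static String build(String q, List<String> facetFields) {
        StringBuilder sb = new StringBuilder();
        sb.append("q=").append(enc(q)).append("&facet=true&wt=json");
        for (String field : facetFields) {
            sb.append("&facet.field=").append(enc(field));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("ultrabook", Arrays.asList("brand", "seller")));
    }
}
```

    Non-ASCII values such as 超级本 are percent-encoded by enc, which is why the helper encodes every value instead of concatenating raw strings.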

    You can also restrict the results by setting fq (filter query).

    q = 超级本 
    facet = true 
    fq = 价格:[8000 TO *]   // 价格 = price
    facet.mincount = 1      // without this, constraints whose count drops to 0 after the fq filter are still listed
    facet.field = 产品特性 
    facet.field = 品牌 
    facet.field = 卖家

    http://…/select?q=超级本&facet=true&wt=json

    &fq=价格:[8000 TO *]&facet.mincount=1

    &facet.field=品牌&facet.field=产品特性&facet.field=卖家

    "facet_counts": {
    "facet_fields": {
      "品牌": [
        "Apple", 4,
        "Lenovo", 39
          …]
      "产品特性": [
        "显卡", 42,
        "酷睿", 38
          …]
     
      …}}
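    In this JSON response each field maps to a flat array that alternates value, count, value, count. A small sketch of turning such an array into an ordered map (the class name is hypothetical, and the input is assumed to be already parsed into a List by whatever JSON library is in use):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for the flat [value, count, value, count, ...] layout.
class FlatFacetList {
    static Map<String, Integer> toCounts(List<Object> flat) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("expected value/count pairs");
        }
        // LinkedHashMap keeps the order in which Solr returned the constraints.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            String value = String.valueOf(flat.get(i));
            int count = ((Number) flat.get(i + 1)).intValue();
            counts.put(value, count);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Object> brand = Arrays.<Object>asList("Apple", 4, "Lenovo", 39);
        System.out.println(toCounts(brand)); // prints {Apple=4, Lenovo=39}
    }
}
```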

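    If the user selects the Apple category, add another fq condition to the query and drop the facet.field that Apple belongs to (品牌), since that facet has already been chosen.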

    http://…/select?q=超级本&facet=true&wt=json

    &fq=价格:[8000 TO *]&fq=品牌:Apple&facet.mincount=1

    &facet.field=产品特性&facet.field=卖家
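    The drill-down step can be sketched as a small function: every selected constraint becomes an fq, and the facet it came from is no longer requested. The class, method and ASCII field names below are illustrative assumptions, and the returned strings are left unencoded for readability:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Solr API.
class DrillDown {
    // selected maps a facet field to the constraint the user clicked.
    static List<String> params(List<String> facetFields, Map<String, String> selected) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : selected.entrySet()) {
            out.add("fq=" + e.getKey() + ":" + e.getValue());
        }
        for (String field : facetFields) {
            if (!selected.containsKey(field)) {
                out.add("facet.field=" + field); // still open for further narrowing
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> selected = new LinkedHashMap<>();
        selected.put("brand", "Apple");
        System.out.println(params(Arrays.asList("brand", "feature", "seller"), selected));
    }
}
```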

    3. Facet Parameters

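    facet.prefix  –   only return constraints that start with the given prefix

    facet.mincount=0 –  minimum count a constraint must have to be returned; defaults to 0

    facet.sort=count –  sort order of the constraints, either count or index

    facet.offset=0  –   offset into the constraint list under the current sort order; can be used for paging

    facet.limit=100 –  maximum number of constraints returned

    facet.missing=false –  whether to also return a count of documents that have no value in the field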

    facet.date –  Deprecated, use facet.range
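    To make the interplay of these parameters concrete, here is an in-memory simulation applied to a map of constraint counts. It is only a sketch of the semantics described above, not how Solr implements them; the class name is invented, and breaking count ties by index order is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of facet.prefix, facet.mincount, facet.sort,
// facet.offset and facet.limit on already-computed counts.
class FacetParamsDemo {
    static List<Map.Entry<String, Integer>> apply(Map<String, Integer> counts, String prefix,
            int mincount, boolean sortByCount, int offset, int limit) {
        List<Map.Entry<String, Integer>> rows = new ArrayList<>();
        // TreeMap iterates in lexicographic order, standing in for "index" order.
        for (Map.Entry<String, Integer> e : new TreeMap<>(counts).entrySet()) {
            if (e.getKey().startsWith(prefix) && e.getValue() >= mincount) {
                rows.add(e);
            }
        }
        if (sortByCount) {
            // Highest count first; List.sort is stable, so ties keep index order.
            rows.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        }
        int from = Math.min(offset, rows.size());
        int to = Math.min(from + limit, rows.size());
        return new ArrayList<>(rows.subList(from, to));
    }
}
```

    Paging through constraints then amounts to keeping limit fixed and increasing offset by limit for each page.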

    facet.query

    Specifies a query string to be used as a Facet Constraint; each facet.query returns the number of matching documents.

    facet.query = rank:[* TO 20]

    facet.query = rank:[21 TO *]
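    Conceptually, each facet.query is an arbitrary predicate whose matches are counted within the current result set. A sketch with plain Java predicates standing in for the two rank queries (the class name and the sample ranks are invented; square brackets in the range syntax are inclusive, hence <= and >=):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntPredicate;

// Hypothetical in-memory stand-in for facet.query counting.
class FacetQueryDemo {
    static Map<String, Integer> count(int[] ranks, Map<String, IntPredicate> queries) {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, IntPredicate> q : queries.entrySet()) {
            int n = 0;
            for (int r : ranks) {
                if (q.getValue().test(r)) {
                    n++;
                }
            }
            out.put(q.getKey(), n); // one count per query string
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, IntPredicate> queries = new LinkedHashMap<>();
        queries.put("rank:[* TO 20]", r -> r <= 20);
        queries.put("rank:[21 TO *]", r -> r >= 21);
        System.out.println(count(new int[] {5, 20, 21, 40}, queries));
    }
}
```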

    "facet_counts": {
    "facet_fields": {
      "品牌": [
        "Apple", 4,
        "Lenovo", 10
          …]
      "产品特性": [
        "显卡", 11,
        "酷睿", 20
          …]
     
      …}}

    facet.range

    http://…/select?&facet=true

    &facet.range=price

    &facet.range.start=5000

    &facet.range.end=8000

    &facet.range.gap=1000

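    For comparison, the two facet.query parameters shown earlier come back under facet_queries in the XML response:

    <result numFound="27" ... />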
     ...
     <lst name="facet_counts">
     <lst name="facet_queries">
       <int name="rank:[* TO 20]">2</int>
       <int name="rank:[21 TO *]">15</int>
     </lst>
    ...

    WARNING:  each range is closed on the left and open on the right: [start, end)
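    The [start, end) rule is easiest to see by bucketing a few values by hand. The sketch below mimics facet.range with start=5000, end=8000 and gap=1000; it is a simulation under the assumption that end minus start is a multiple of gap, and the class name and sample prices are invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for facet.range bucketing.
class RangeFacetDemo {
    // Returns a map from each bucket's lower bound to its count.
    static Map<Integer, Integer> buckets(int[] values, int start, int end, int gap) {
        Map<Integer, Integer> out = new LinkedHashMap<>();
        for (int lower = start; lower < end; lower += gap) {
            out.put(lower, 0);
        }
        for (int v : values) {
            if (v < start || v >= end) {
                continue; // outside [start, end): the upper bound is excluded
            }
            int lower = start + ((v - start) / gap) * gap;
            out.put(lower, out.get(lower) + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] prices = {4999, 5000, 5999, 6000, 7999, 8000};
        System.out.println(buckets(prices, 5000, 8000, 1000)); // prints {5000=2, 6000=1, 7000=1}
    }
}
```

    Here 5000 lands in the first bucket while 8000 is not counted at all, which is exactly the left-closed, right-open behaviour.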

    facet.pivot

    This is a new feature in Solr 4.0. Pivot is as hard to grasp as facet itself, so again an example works best.

    Syntax:  facet.pivot=field1,field2,field3...

    e.g.  facet.pivot=comment_user,grade

                      #docs   #docs grade:好   #docs grade:中   #docs grade:差
    comment_user:1    10      8                1                1
    comment_user:2    20      18               2                0
    comment_user:3    15      12               2                1
    comment_user:4    18      15               2                1

    (好 = good, 中 = medium, 差 = poor)

    "facet_counts":{
    "facet_pivot":{
     "comment_user,grade":[{
       "field":"comment_user",
       "value":"1",
       "count":10,
       "pivot":[{
         "field":"grade",
         "value":"好",
         "count":8}, {
         "field":"grade",
         "value":"中",
         "count":1}, {
         "field":"grade",
         "value":"差",
         "count":1}]
       }, {
         "field":"comment_user",
         "value":"2",
         "count":20,
         "pivot":[{
          …

    Without the pivot mechanism, producing the result above would likely take several queries, one per grade:

    http://...q=comment&fq=grade:好&facet=true&facet.field=comment_user

    http://...q=comment&fq=grade:中&facet=true&facet.field=comment_user

    http://...q=comment&fq=grade:差&facet=true&facet.field=comment_user

    Facet.pivot -  Computes a Matrix of Constraint Counts across multiple Facet Fields. by Yonik Seeley.

    That explanation is a good one, though it is easier to understand than to translate.
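    The "matrix of constraint counts" can be sketched in memory as a nested map: the outer key is the first pivot field and the inner map holds counts for the second. This is only an illustration of the result shape, with an invented class name and with "good", "medium" and "poor" as ASCII stand-ins for 好, 中 and 差.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for facet.pivot=comment_user,grade.
class PivotDemo {
    // Each doc is {comment_user, grade}.
    static Map<String, Map<String, Integer>> pivot(String[][] docs) {
        Map<String, Map<String, Integer>> out = new TreeMap<>();
        for (String[] d : docs) {
            out.computeIfAbsent(d[0], k -> new TreeMap<>())
               .merge(d[1], 1, Integer::sum); // count one doc for (user, grade)
        }
        return out;
    }

    public static void main(String[] args) {
        String[][] docs = {
            {"1", "good"}, {"1", "good"}, {"1", "poor"}, {"2", "medium"}
        };
        System.out.println(pivot(docs)); // prints {1={good=2, poor=1}, 2={medium=1}}
    }
}
```

    A single pass yields every (comment_user, grade) count, which is what one facet.pivot request returns and what the three separate fq queries above would otherwise have to assemble.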

     

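    My own understanding of facet.pivot is that it performs a grouped count along several dimensions at once. Below is code from a real project of mine that counts along two dimensions, newsType and property: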

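    import java.util.ArrayList;
    import java.util.List;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrRequest;
    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.PivotField;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.util.NamedList;

    // SolrServer, ReportQuery, ReportNewsTypeDTO, News and log come from the author's own project.
    public List<ReportNewsTypeDTO> queryNewsType(ReportQuery reportQuery) {
        HttpSolrServer solrServer = SolrServer.getInstance().getServer();
        List<ReportNewsTypeDTO> list = new ArrayList<ReportNewsTypeDTO>();
        try {
            SolrQuery sQuery = new SolrQuery();
            sQuery.setQuery(this.initReportQueryPara(reportQuery, 0));
            sQuery.setFacet(true);
            sQuery.add("facet.pivot", "newsType,property"); // group by these two dimensions
            QueryResponse response = solrServer.query(sQuery, SolrRequest.METHOD.POST);
            NamedList<List<PivotField>> namedList = response.getFacetPivot();
            System.out.println(namedList); // print it once to see why the loops below are shaped this way
            if (namedList == null) {
                return list;
            }
            // One entry per facet.pivot parameter; here only "newsType,property".
            for (int i = 0; i < namedList.size(); i++) {
                List<PivotField> pivotList = namedList.getVal(i);
                if (pivotList == null) {
                    continue;
                }
                for (PivotField pivot : pivotList) { // first level: newsType
                    Integer newsTypeId = (Integer) pivot.getValue();
                    ReportNewsTypeDTO dto = new ReportNewsTypeDTO();
                    dto.setNewsTypeId(newsTypeId);
                    dto.setNewsTypeName(News.newsTypeMap.get(newsTypeId));
                    int pos = 0;
                    int neg = 0;
                    List<PivotField> fieldList = pivot.getPivot();
                    if (fieldList != null) {
                        for (PivotField field : fieldList) { // second level: property
                            int proValue = (Integer) field.getValue();
                            // property 1 is counted as positive, anything else as negative
                            if (proValue == 1) {
                                pos = field.getCount();
                            } else {
                                neg = field.getCount();
                            }
                        }
                    }
                    dto.setPositiveCount(pos);
                    dto.setNegativeCount(neg);
                    list.add(dto);
                }
            }
        } catch (SolrServerException e) {
            log.error("Solr query failed", e);
        } finally {
            // Kept from the original; this is only safe if getServer() does not hand out a shared client.
            solrServer.shutdown();
        }
        return list;
    }
    Printed output of namedList: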
    {newsType,property=
    [
    newsType:8 [4260] [property:1 [3698] null, property:0 [562] null], 
    newsType:1 [1507] [property:1 [1389] null, property:0 [118] null], 
    newsType:2 [1054] [property:1 [909] null, property:0 [145] null], 
    newsType:6 [715] [property:1 [581] null, property:0 [134] null], 
    newsType:4 [675] [property:1 [466] null, property:0 [209] null], 
    newsType:3 [486] [property:1 [397] null, property:0 [89] null], 
    newsType:7 [458] [property:1 [395] null, property:0 [63] null], 
    newsType:5 [289] [property:1 [263] null, property:0 [26] null], 
    newsType:9 [143] [property:1 [138] null, property:0 [5] null]
    ]
    }
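    That should make it clear. While writing this it occurred to me that any grouped count, whether over one dimension or two, can be computed with facet.pivot. A handy feature.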
  • Original article: https://www.cnblogs.com/cuihongyu3503319/p/10765999.html