Solr -- Solr Facet 2

Date: 2022-10-04 16:33:24

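Solr uses the term facet for query results that are organized for navigation. Faceting does not change the result set itself; it only adds per-category count information on top of it, which users can then use to refine the search. Taobao, for example, shows above its result list how many results fall under each category.
When searching for digital cameras, for instance, the results page lists dimensions such as manufacturer and resolution; each of these dimensions is a facet.
Under the manufacturer facet there are brands such as Nikon, Canon, and Sony; these values are called constraints.
Finally, the current navigation path, built from the selections made so far, is called the breadcrumb.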
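To make the terms concrete, here is a minimal in-memory sketch of what a facet count is: for one field (the facet), count how many result documents carry each value (the constraints). The class name, the method name, and the sample documents are illustrative assumptions; Solr computes these counts from its index, not with code like this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: counts the constraint values of one facet over an
// in-memory result set. This is not how Solr implements faceting.
class FacetCountSketch {

    // Returns constraint -> count, ordered by descending count, then by name.
    static Map<String, Integer> countFacet(List<Map<String, String>> docs, String field) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, String> doc : docs) {
            String value = doc.get(field);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        Map<String, Integer> ordered = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            ordered.put(e.getKey(), e.getValue());
        }
        return ordered;
    }

    public static void main(String[] args) {
        // Hypothetical search results for "digital camera".
        List<Map<String, String>> docs = List.of(
                Map.of("manu", "Canon", "resolution", "12MP"),
                Map.of("manu", "Nikon", "resolution", "12MP"),
                Map.of("manu", "Canon", "resolution", "16MP"));
        System.out.println(countFacet(docs, "manu"));       // {Canon=2, Nikon=1}
        System.out.println(countFacet(docs, "resolution")); // {12MP=2, 16MP=1}
    }
}
```

The documents themselves are never modified; the counts are extra information returned alongside them.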
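Solr has several kinds of facets:
Field facets (the ordinary kind), for example a facet built on the manufacturer brand.
Query facets, for example when browsing by price, where several price ranges are defined, such as 0-10, 10-20, and 20-30.
Date facets, which are also a special kind of range query, for example faceting by month.
The main benefit of faceting is that search conditions can be combined freely while dead-end searches are avoided, which improves the search experience.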
Facets are always specified through parameters at query time. For example,
with the HTTP API:

&facet=true&facet.field=manu
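As a rough sketch of how such a request could be assembled by hand, the snippet below builds the query string with URL encoding. The class and method names are hypothetical, and the base URL is the `http://localhost:8983/solr/select` address used in the examples further down; a real deployment may use a different path.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative helper (not part of Solr or SolrJ): builds the query string
// for a field-facet request.
class FacetUrlSketch {

    static String facetQueryString(String q, String... facetFields) {
        StringBuilder sb = new StringBuilder();
        sb.append("q=").append(encode(q));
        sb.append("&facet=true");
        for (String field : facetFields) {
            sb.append("&facet.field=").append(encode(field));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        // The Charset overload requires Java 10 or later.
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumed base URL, taken from the drill-down examples in this article.
        String base = "http://localhost:8983/solr/select?";
        System.out.println(base + facetQueryString("*:*", "manu"));
        // http://localhost:8983/solr/select?q=*%3A*&facet=true&facet.field=manu
    }
}
```

Passing several field names adds one `facet.field` parameter per field, which is how multiple facets are requested in a single query.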
In Java:

import org.apache.solr.client.solrj.SolrQuery;

SolrQuery solrQuery = new SolrQuery("*:*").setFacet(true).addFacetField("manu");

The XML response looks like this:

<lst name="facet_fields">
  <lst name="manu">
    <int name="Canon USA">17</int>
    <int name="Olympus">12</int>
    <int name="Sony">12</int>
    <int name="Panasonic">9</int>
    <int name="Nikon">4</int>
  </lst>
</lst>
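If the raw XML is consumed directly, without SolrJ, the counts can be pulled out with the JDK's DOM parser. The following is a sketch under that assumption: the class and method names are hypothetical, and it only handles the `facet_fields` structure shown above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Illustrative parser for the facet_fields fragment of a Solr XML response.
class FacetXmlSketch {

    // Returns constraint -> count for one facet field, in response order.
    static Map<String, Long> parseFacetField(String xml, String field) {
        Map<String, Long> counts = new LinkedHashMap<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList lists = doc.getElementsByTagName("lst");
            for (int i = 0; i < lists.getLength(); i++) {
                Element lst = (Element) lists.item(i);
                Node parent = lst.getParentNode();
                // Only accept <lst name="field"> directly under <lst name="facet_fields">.
                boolean underFacetFields = parent instanceof Element
                        && "facet_fields".equals(((Element) parent).getAttribute("name"));
                if (!underFacetFields || !field.equals(lst.getAttribute("name"))) {
                    continue;
                }
                NodeList ints = lst.getElementsByTagName("int");
                for (int j = 0; j < ints.getLength(); j++) {
                    Element e = (Element) ints.item(j);
                    counts.put(e.getAttribute("name"), Long.parseLong(e.getTextContent().trim()));
                }
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse facet XML", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        String xml = "<lst name=\"facet_fields\"><lst name=\"manu\">"
                + "<int name=\"Canon USA\">17</int><int name=\"Olympus\">12</int>"
                + "<int name=\"Sony\">12</int><int name=\"Panasonic\">9</int>"
                + "<int name=\"Nikon\">4</int></lst></lst>";
        System.out.println(parseFacetField(xml, "manu"));
        // {Canon USA=17, Olympus=12, Sony=12, Panasonic=9, Nikon=4}
    }
}
```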

In Java, the facet results can be read like this:

import org.apache.solr.client.solrj.response.FacetField;

List<FacetField> facetFields = queryResponse.getFacetFields();

To add a facet query on top of an existing query:

solrQuery.addFacetQuery("quality:[* TO 10]");

To facet on price over specified ranges, for example, append facet parameters like this:

&facet=true&facet.query=price:[* TO 100]
&facet.query=price:[100 TO 200]&facet.query=price:[200 TO 300]
&facet.query=price:[300 TO 400]&facet.query=price:[400 TO 500]
&facet.query=price:[500 TO *]
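Writing one `facet.query` per range by hand is error-prone, so the list can be generated from the bucket boundaries. This is a small sketch with hypothetical names; it reproduces the inclusive `[a TO b]` form used above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative generator for range facet queries on a numeric field.
class RangeFacetSketch {

    // For boundaries 100, 200 this yields:
    // field:[* TO 100], field:[100 TO 200], field:[200 TO *]
    static List<String> rangeQueries(String field, int... boundaries) {
        List<String> queries = new ArrayList<>();
        String lower = "*";
        for (int boundary : boundaries) {
            queries.add(field + ":[" + lower + " TO " + boundary + "]");
            lower = String.valueOf(boundary);
        }
        queries.add(field + ":[" + lower + " TO *]");
        return queries;
    }

    public static void main(String[] args) {
        for (String fq : rangeQueries("price", 100, 200, 300, 400, 500)) {
            System.out.println("&facet.query=" + fq);
        }
    }
}
```

Each generated string could also be passed to `solrQuery.addFacetQuery(...)`. Note that square brackets are inclusive on both ends, so a product priced at exactly 100 is counted in two adjacent buckets; curly braces denote exclusive bounds in Solr's range syntax, and whether they can be mixed with square brackets in one range is worth checking against the Solr version in use.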

To search further within products priced between 400 and 500, use a Solr filter query (fq):

http://localhost:8983/solr/select?q=camera&facet=on&facet.field=manu&facet.field=camera_type&fq=price:[400 TO 500]

Note that price is no longer among the facets requested here.
To narrow the results further by camera type, the query can be written like this:

http://localhost:8983/solr/select?q=camera&facet=on&facet.field=manu&fq=price:[400 TO 500]&fq=camera_type:SLR
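The two drill-down requests above follow one pattern: every selection in the breadcrumb becomes an `fq` parameter, and a field that has already been selected is dropped from the facets that are requested. A minimal sketch of that pattern follows; the class and method names are hypothetical, and the values are left unencoded for readability, so a real request would still need URL encoding.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: turns a breadcrumb (field -> selected constraint)
// into the parameters of the next drill-down request.
class DrillDownSketch {

    static String drillDown(String q, List<String> facetFields, Map<String, String> breadcrumb) {
        StringBuilder sb = new StringBuilder("q=").append(q).append("&facet=on");
        for (String field : facetFields) {
            // Stop faceting on fields the user has already constrained.
            if (!breadcrumb.containsKey(field)) {
                sb.append("&facet.field=").append(field);
            }
        }
        for (Map.Entry<String, String> selection : breadcrumb.entrySet()) {
            sb.append("&fq=").append(selection.getKey()).append(':').append(selection.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> facetFields = List.of("manu", "camera_type");
        Map<String, String> breadcrumb = new LinkedHashMap<>();

        breadcrumb.put("price", "[400 TO 500]");
        System.out.println(drillDown("camera", facetFields, breadcrumb));
        // q=camera&facet=on&facet.field=manu&facet.field=camera_type&fq=price:[400 TO 500]

        breadcrumb.put("camera_type", "SLR");
        System.out.println(drillDown("camera", facetFields, breadcrumb));
        // q=camera&facet=on&facet.field=manu&fq=price:[400 TO 500]&fq=camera_type:SLR
    }
}
```

A `LinkedHashMap` keeps the selections in the order the user made them, which is also the order a breadcrumb would be displayed in.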

Use cases for faceting:
1. Category navigation.
2. Auto-suggest, which relies on a multi-valued tag field.
3. Ranking of popular keywords, which also relies on a tag field.
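Use cases 2 and 3 both come down to counting the values of a tag field: auto-suggest keeps only the tags that start with what the user has typed, and a popularity ranking keeps the top tags overall. The text does not say which Solr parameters are used for this; `facet.prefix` and `facet.limit` are the usual candidates. The sketch below only imitates the idea in memory, with hypothetical names and sample data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: prefix-filtered, count-ordered tag facet computed in memory.
class TagSuggestSketch {

    // Returns up to `limit` tags starting with `prefix`, most frequent first.
    static List<String> suggest(List<List<String>> docTags, String prefix, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> tags : docTags) {      // the tag field is multi-valued
            for (String tag : tags) {
                if (tag.startsWith(prefix)) {
                    counts.merge(tag, 1, Integer::sum);
                }
            }
        }
        List<String> ranked = new ArrayList<>(counts.keySet());
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    public static void main(String[] args) {
        List<List<String>> docTags = List.of(
                List.of("camera", "canon"),
                List.of("camera", "case"),
                List.of("canon", "lens"),
                List.of("camera"));
        System.out.println(suggest(docTags, "ca", 2)); // auto-suggest: [camera, canon]
        System.out.println(suggest(docTags, "", 10));  // ranking: [camera, canon, case, lens]
    }
}
```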

 

I've gone through the related questions on this site but haven't found a relevant solution.

When querying my Solr4 index using an HTTP request of the form

&facet=true&facet.field=country

The response contains all the different countries along with counts per country.

How can I get this information using SolrJ? I have tried the following but it only returns total counts across all countries, not per country:

solrQuery.setFacet(true);
solrQuery.addFacetField("country");

The following does seem to work, but I do not want to have to explicitly set all the groupings beforehand:

solrQuery.addFacetQuery("country:usa");
solrQuery.addFacetQuery("country:canada");

Secondly, I'm not sure how to extract the facet data from the QueryResponse object.

So two questions:

1) Using SolrJ how can I facet on a field and return the groupings without explicitly specifying the groups?

2) Using SolrJ how can I extract the facet data from the QueryResponse object?

Thanks.

Update:

I also tried something similar to Sergey's response (below).

List<FacetField> ffList = resp.getFacetFields();
log.info("size of ffList:" + ffList.size());
for (FacetField ff : ffList) {
    String ffname = ff.getName();
    int ffcount = ff.getValueCount();
    log.info("ffname:" + ffname + "|ffcount:" + ffcount);
}

The above code shows ffList with size=1 and the loop goes through 1 iteration. In the output ffname="country" and ffcount is the total number of rows that match the original query.

There is no per-country breakdown here.

I should mention that on the same solrQuery object I am also calling addField and addFilterQuery. Not sure if this impacts faceting:

solrQuery.addField("user-name");
solrQuery.addField("user-bio");
solrQuery.addField("country");
solrQuery.addFilterQuery("user-bio:" + "(Apple OR Google OR Facebook)");

Update 2:

I think I got it, again based on what Sergey said below. I extracted the List object using FacetField.getValues().

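List<FacetField> fflist = resp.getFacetFields();
for (FacetField ff : fflist) {
    String ffname = ff.getName();
    int ffcount = ff.getValueCount();
    // Count is a nested class of FacetField, so qualify it (or import FacetField.Count).
    List<FacetField.Count> counts = ff.getValues();
    for (FacetField.Count c : counts) {
        String facetLabel = c.getName();   // the facet value, e.g. a country
        long facetCount = c.getCount();    // number of matching documents
    }
}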

In the above code, facetLabel holds the name of each facet group and facetCount is the corresponding count for that grouping.