I'm having a problem with the SpanNearQuery in Lucene 4.3. I'm trying to do a query like this:
SpanTermQuery fleeceQ = new SpanTermQuery(new Term("content", "golden fleece"));
SpanTermQuery blackQ = new SpanTermQuery(new Term("content", "black"));
SpanQuery[] clauses = {fleeceQ, blackQ};
SpanNearQuery nearQ = new SpanNearQuery(clauses, 10, false);
In the field "content" of my document I have: "History looks fondly upon the black story of the golden fleece, but most people don't agree"
What happens is that the query returns nothing. But if I change "golden fleece" to "fleece" it works, so I guess the problem is with multi-word terms.
I'm using the SpanNearQuery because I have to do a proximity search and I need to know how many times it occurs.
Anyone know how to fix this?
1 Solution
#1
The problem is that "golden fleece" is not a term. It's two terms, golden and fleece. When you construct the term yourself, though, with:
new Term("content", "golden fleece")
Lucene will take your word for it and treat it as a single term. There are no matches, because the single term "golden fleece" doesn't exist in your index.
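To make that concrete, here is a rough sketch in plain Java of why the exact lookup comes back empty. It is a toy model, not Lucene code: the class `ToyTermIndex`, its `analyze` method, and the "lowercase and split on non-letters" rule are all assumptions made up for illustration, and a real analyzer does more than this. The point is only that the index holds one entry per token, while a hand-built term is looked up exactly as typed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Toy model only: this is NOT Lucene and does not reproduce a real analyzer.
final class ToyTermIndex {
    // Maps each indexed term to the token positions where it occurs.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    // Assumption: "analysis" is just lowercasing and splitting on anything
    // that is not a letter or an apostrophe.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Index time: the text is analyzed, so each word becomes its own term.
    ToyTermIndex(String text) {
        List<String> tokens = analyze(text);
        for (int pos = 0; pos < tokens.size(); pos++) {
            postings.computeIfAbsent(tokens.get(pos), k -> new ArrayList<>()).add(pos);
        }
    }

    // Query time: the term is looked up exactly as given, with no analysis,
    // which mirrors what happens when you build the term by hand.
    List<Integer> positionsOf(String exactTerm) {
        return postings.getOrDefault(exactTerm, Collections.emptyList());
    }

    public static void main(String[] args) {
        ToyTermIndex index = new ToyTermIndex(
            "History looks fondly upon the black story of the golden fleece, "
            + "but most people don't agree");
        System.out.println(index.positionsOf("golden fleece")); // []
        System.out.println(index.positionsOf("golden"));        // [9]
        System.out.println(index.positionsOf("fleece"));        // [10]
    }
}
```

The two words sit at adjacent positions (9 and 10), which is what a phrase-style query needs to check; a single term containing a space never gets indexed in this model.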
There isn't a clear way to incorporate a PhraseQuery into a SpanNearQuery, so I think it makes sense to create another, nested SpanNearQuery to get the behavior you are looking for:
SpanTermQuery goldenQ = new SpanTermQuery(new Term("content", "golden"));
SpanTermQuery fleeceQ = new SpanTermQuery(new Term("content", "fleece"));
SpanTermQuery blackQ = new SpanTermQuery(new Term("content", "black"));
SpanQuery[] subclauses = {goldenQ, fleeceQ};
SpanNearQuery goldfleeceQ = new SpanNearQuery(subclauses, 0, true); //No slop, in order!
SpanQuery[] mainclauses = {goldfleeceQ, blackQ};
SpanNearQuery finalQ = new SpanNearQuery(mainclauses, 10, false); //As before, 10 slop, any order
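Since you also need to know how many times the match occurs, here is a hedged sketch of what the nested query is checking, again in plain Java rather than Lucene. The class `ToySpanNear` and its methods are hypothetical names, the tokenizer is the same simplified assumption (lowercase, split on non-letters), and the slop rule used here (width of the combined span minus the total length of the two sub-spans) is my understanding of how unordered span matching measures distance. Lucene's real enumeration of matches may differ, so treat the count as an illustration, not a prediction of what Lucene reports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model only: plain Java, no Lucene classes, simplified matching rules.
final class ToySpanNear {
    // Assumption: tokens are lowercased words split on anything that is not
    // a letter or an apostrophe.
    static List<String> tokens(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z']+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Spans [start, end) where the words occur adjacent and in order,
    // i.e. the inner query with slop 0 and inOrder = true.
    static List<int[]> phraseSpans(List<String> tokens, String... words) {
        List<int[]> spans = new ArrayList<>();
        for (int i = 0; i + words.length <= tokens.size(); i++) {
            boolean match = true;
            for (int j = 0; j < words.length; j++) {
                if (!tokens.get(i + j).equals(words[j])) {
                    match = false;
                    break;
                }
            }
            if (match) {
                spans.add(new int[] {i, i + words.length});
            }
        }
        return spans;
    }

    // Counts pairs of spans, in any order, whose combined width minus the
    // length of both spans fits within the slop (the outer query).
    static int countNear(List<int[]> a, List<int[]> b, int slop) {
        int count = 0;
        for (int[] x : a) {
            for (int[] y : b) {
                int width = Math.max(x[1], y[1]) - Math.min(x[0], y[0]);
                int totalLength = (x[1] - x[0]) + (y[1] - y[0]);
                if (width - totalLength <= slop) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> t = tokens(
            "History looks fondly upon the black story of the golden fleece, "
            + "but most people don't agree");
        List<int[]> goldenFleece = phraseSpans(t, "golden", "fleece"); // one span: [9, 11)
        List<int[]> black = phraseSpans(t, "black");                   // one span: [5, 6)
        System.out.println(countNear(goldenFleece, black, 10)); // 1
        System.out.println(countNear(goldenFleece, black, 2));  // 0
    }
}
```

In your sample sentence there are three tokens between "black" and "golden fleece" ("story of the"), so a slop of 10 matches once and a slop of 2 does not.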