Making Lucene exactly match a term that contains a space

Time: 2021-07-05 03:08:09

I want my Lucene query to contain something similar to:

companyName:mercedes trucks

It should do an exact match for the string "mercedes trucks" in the companyName field.
companyName is an untokenized field, but any value containing a space returns no results.

new TermQuery(new Term("companyName", "mercedes trucks"));

This always returns 0 results if the value contains a space. Otherwise my program works fine.

7 answers

#1


Use a PhraseQuery like this:

//create the query objects
BooleanQuery query = new BooleanQuery();
PhraseQuery q2 = new PhraseQuery();
//grab the search terms from the query string
string[] str = Sitecore.Context.Request.QueryString[BRAND_TERM].Split(' ');
//build the query
foreach(string word in str)
{
  //brand is the field I'm searching in
  q2.Add(new Term("brand", word.ToLower()));
}

//finally, add it to the BooleanQuery object
query.Add(q2, BooleanClause.Occur.MUST);

//Don't forget to run the query
Hits hits = searcher.Search(query);

Hope this helps!

#2


Maybe replace:

mercedes trucks 

with

mercedes?trucks

Works for me.
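
A minimal sketch of that substitution, assuming the resulting string is later handed to Lucene's classic query parser, where `?` is a single-character wildcard and a backslash escapes the character that follows it. The class and method names are hypothetical and nothing here calls Lucene itself. The second method shows an alternative that escapes the space instead of replacing it; it is only useful if the analyzer given to the parser does not split the term again.

```java
// Hypothetical helper for illustration; not part of Lucene.
final class SpaceInTermQuery {

    // Replace each whitespace character with '?', the single-character wildcard.
    static String toWildcard(String field, String value) {
        return field + ":" + value.trim().replaceAll("\\s", "?");
    }

    // Put a backslash in front of each whitespace character so the parser
    // reads the value as one term (assumes classic query parser escaping).
    static String toEscaped(String field, String value) {
        return field + ":" + value.trim().replaceAll("(\\s)", "\\\\$1");
    }

    public static void main(String[] args) {
        System.out.println(toWildcard("companyName", "mercedes trucks")); // companyName:mercedes?trucks
        System.out.println(toEscaped("companyName", "mercedes trucks"));  // companyName:mercedes\ trucks
    }
}
```

Keep in mind that `?` matches any one character, so the wildcard form would also match a value such as "mercedes-trucks".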

#3


You may be using a different analyzer at search time than the one you used to create the index.

Try using KeywordAnalyzer when searching. It turns the whole search string into a single token, which is probably what you are looking for.
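
To see why the analyzer matters, here is a toy model in plain Java. It does not use Lucene and every name in it is hypothetical; it only mimics the idea that an untokenized field is stored as one term, and compares what a whitespace-splitting analysis and a keyword-style analysis would look up.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Toy model only: illustrates term lookup, not Lucene's implementation.
final class AnalyzerMismatchDemo {

    // Whitespace-style analysis: one token per word.
    static List<String> splitOnWhitespace(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Keyword-style analysis: the whole input is a single token.
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    // A document matches only if every query token is an indexed term.
    static boolean matches(Set<String> indexedTerms, List<String> queryTokens) {
        return indexedTerms.containsAll(queryTokens);
    }

    public static void main(String[] args) {
        // Untokenized field: the whole value is indexed as one term.
        Set<String> index = Collections.singleton("mercedes trucks");
        System.out.println(matches(index, splitOnWhitespace("mercedes trucks"))); // false
        System.out.println(matches(index, keyword("mercedes trucks")));           // true
    }
}
```

The split query looks for "mercedes" and "trucks" separately, and neither exists as a term in the index, which is consistent with the 0 results described in the question.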

#4


I'm guessing here - does exactMask add quotes around the string? You should simply use the string "mercedes trucks" as it is, without manipulating it.

new TermQuery(new Term("companyName", "mercedes trucks"));

#5


Have you considered using a PhraseQuery? Does the field have to be untokenized? I believe untokenized is for ids etc. and not for fields having several words as their content.

#6


The approach that worked best for me is to parse the query "mercedes?trucks" using the keyword analyzer.

#7


I am facing the same issue too. To work around it, do the following:
1) When adding the field value to the document, remove the spaces in it.
2) Convert the field value to lowercase.
3) Convert the search text to lowercase.
4) Remove the whitespace from the search text.
Regards ~shef
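
The steps above boil down to running the same normalization on the value before indexing and on the search text before querying. A minimal sketch in plain Java, with a hypothetical class and method name and no Lucene calls:

```java
import java.util.Locale;

// Hypothetical helper: apply it to the field value at index time
// and to the search text at query time.
final class FieldNormalizer {

    // Lowercase the value and strip every run of whitespace.
    static String normalize(String value) {
        return value.toLowerCase(Locale.ROOT).replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        String indexed = normalize("Mercedes Trucks");
        String searched = normalize("mercedes  trucks");
        System.out.println(indexed);                  // mercedestrucks
        System.out.println(indexed.equals(searched)); // true
    }
}
```

The trade-off is that the match is no longer strictly exact: "Mercedes Trucks" and "mercedestrucks" become the same term.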
