Need help with Lucene indexing/querying

Time: 2021-06-18 03:09:53

I want to have a "citystate" field in a Lucene index that stores city/state values such as:

  • Chicago, IL
  • Boston, MA
  • San Diego, CA

How do I store these values in Lucene (should the field be tokenized or non-tokenized?), and how do I generate a query (should it be a PhraseQuery, a TermQuery, or something else?) that gets me all records whose citystate contains: Chicago, IL OR Boston, MA OR San Diego, CA?

I would appreciate help with the code as well.

Thanks.

2 answers

#1


Shouldn't city and state be normalized further into two separate fields?

#2


It depends. Will you ever want to search by city alone or by state alone? If so, you need to tokenize the field. If not, do not tokenize it. Check out the KeywordAnalyzer, though; it may suit you.
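
To make the trade-off concrete, here is a rough, dependency-free sketch of which terms end up in the index in each case. It is a toy model, not Lucene code: the class and method names are made up for illustration, and the "tokenized" version only approximates what a typical analyzer does (split on punctuation and whitespace, lowercase).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy model only (hypothetical names, no Lucene classes involved).
// It shows which terms a single field value would contribute to the index.
final class CityStateTerms {

    // Non-tokenized: the whole value is stored as one exact term.
    static List<String> untokenized(String value) {
        return Collections.singletonList(value);
    }

    // Tokenized (simplified): split on anything that is not a letter or
    // digit, then lowercase each piece.
    static List<String> tokenized(String value) {
        List<String> terms = new ArrayList<>();
        for (String part : value.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                terms.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(untokenized("San Diego, CA")); // [San Diego, CA]
        System.out.println(tokenized("San Diego, CA"));   // [san, diego, ca]
    }
}
```

With the tokenized form, a search for "diego" or "ca" alone can match, but matching the exact pair needs a phrase-style query. With the non-tokenized form, only the full string "San Diego, CA" matches, including its case and punctuation.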

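As to your second question: suppose you call the field 'citystate'. You can then use a query such as: citystate:"Chicago, IL" OR citystate:"Boston, MA" OR citystate:"San Diego, CA". The quotes matter, because otherwise the query parser applies the field name only to the first word after the colon.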

The programmatic version is a BooleanQuery composed of several TermQuery clauses.
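
The logic behind that combination can be sketched without Lucene. The toy class below (hypothetical names, not the Lucene API) keeps one posting list per exact field value: `term` plays the role of a single TermQuery, and `anyOf` plays the role of a BooleanQuery whose clauses are all optional, i.e. an OR over the terms.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only (hypothetical names): an exact-value index for one
// non-tokenized field, mapping each full value to the ids of matching docs.
final class ExactTermOrSearch {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index one document's citystate value as a single, unmodified term.
    void add(int docId, String cityState) {
        postings.computeIfAbsent(cityState, k -> new TreeSet<>()).add(docId);
    }

    // Exact lookup of one term (the role a single term query plays).
    Set<Integer> term(String value) {
        return postings.getOrDefault(value, Collections.emptySet());
    }

    // Union of several exact lookups (the role of OR-ing term queries).
    Set<Integer> anyOf(String... values) {
        Set<Integer> hits = new TreeSet<>();
        for (String value : values) {
            hits.addAll(term(value));
        }
        return hits;
    }

    public static void main(String[] args) {
        ExactTermOrSearch index = new ExactTermOrSearch();
        index.add(1, "Chicago, IL");
        index.add(2, "Boston, MA");
        index.add(3, "San Diego, CA");
        index.add(4, "Austin, TX");
        // Docs 1, 2 and 3 match; doc 4 does not.
        System.out.println(index.anyOf("Chicago, IL", "Boston, MA", "San Diego, CA"));
    }
}
```

Note that a lookup for "Chicago" alone finds nothing here, which is the price of not tokenizing.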
