I want to have a "citystate" field in a Lucene index that stores city/state values such as:
- Chicago, IL
- Boston, MA
- San Diego, CA
How do I store these values in Lucene (should the field be tokenized or non-tokenized?), and how do I generate a query (should it be a PhraseQuery, a TermQuery, or something else?) that returns all records whose citystate contains Chicago, IL OR Boston, MA OR San Diego, CA?
I would appreciate help with the code as well.
Thanks.
2 solutions
#1
Shouldn't city and state be normalized further into two separate fields?
#2
It depends. Will you ever want to search by city alone or by state alone? If so, you need to tokenize. If not, do not tokenize. Check out the KeywordAnalyzer, though. It may suit you.
As to your second question: suppose you call the field 'citystate'. You can then use a query such as: citystate:"Chicago, IL" OR citystate:"Boston, MA" OR citystate:"San Diego, CA". The quotes matter. Without them, the query parser splits each value at the space, so only the first word is searched in citystate and the rest goes to the default field.
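If the list of values comes from user input or a database, the query string above can be assembled rather than typed by hand. The following is a minimal sketch using only the Java standard library. The class name `CityStateQueryString` and its helper methods are hypothetical, introduced just for illustration. It only builds the text. Parsing it still requires a Lucene query parser, and for an untokenized field the query-time analyzer must keep each quoted value whole (KeywordAnalyzer does this).

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds a query string of the form
//   field:"v1" OR field:"v2" OR ...
public class CityStateQueryString {

    // Wraps a value in double quotes, escaping backslashes and
    // embedded double quotes so the value stays one quoted phrase.
    public static String quote(String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Joins one quoted clause per value with OR.
    public static String build(String field, List<String> values) {
        return values.stream()
                .map(v -> field + ":" + quote(v))
                .collect(Collectors.joining(" OR "));
    }

    public static void main(String[] args) {
        System.out.println(
                build("citystate", List.of("Chicago, IL", "Boston, MA", "San Diego, CA")));
    }
}
```

Running `main` prints the same quoted query shown in the answer.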
The programmatic version is a BooleanQuery composed of several TermQuery clauses.
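Since the question asked for code, here is a rough sketch of that approach, assuming Lucene 8 or later on the classpath (it uses `ByteBuffersDirectory` and `BooleanQuery.Builder`). It takes the "do not tokenize" route by indexing each value with `StringField`, which stores the whole string as a single term, and then ORs together one `TermQuery` per value. The class name `CityStateSearch`, the helper `anyOf`, and the extra "Austin, TX" record are made up for illustration.

```java
import java.io.IOException;

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class CityStateSearch {

    // Builds field:v1 OR field:v2 OR ... as SHOULD clauses of exact TermQuery matches.
    public static Query anyOf(String field, String... values) {
        BooleanQuery.Builder builder = new BooleanQuery.Builder();
        for (String value : values) {
            builder.add(new TermQuery(new Term(field, value)), BooleanClause.Occur.SHOULD);
        }
        return builder.build();
    }

    // Indexes four sample records in memory and counts those matching the query.
    public static int countMatches() throws IOException {
        try (Directory dir = new ByteBuffersDirectory()) {
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig())) {
                String[] samples = {"Chicago, IL", "Boston, MA", "San Diego, CA", "Austin, TX"};
                for (String cityState : samples) {
                    Document doc = new Document();
                    // StringField is indexed as one untokenized term.
                    doc.add(new StringField("citystate", cityState, Field.Store.YES));
                    writer.addDocument(doc);
                }
            } // closing the writer commits the documents
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                return searcher.count(anyOf("citystate", "Chicago, IL", "Boston, MA", "San Diego, CA"));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Matching records: " + countMatches());
    }
}
```

Note that a TermQuery against an untokenized field is an exact, case-sensitive match: "chicago, il" or "Chicago,IL" would not match "Chicago, IL". If you also need to search by city or state alone, the two-field suggestion in #1 (or an additional tokenized field) is probably the better fit.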