In Solr, I want to index a string field as a list by splitting it.
Below is my indexing query in the data_config.xml file:
<document name="Example">
  <entity dataSource="example_table" name="Example"
          query="select id, text from example_table"
          pk="id"
          transformer="RegexTransformer">
    <field column="id" name="id" />
    <field column="text" name="text" />
  </entity>
</document>
The field text is a comma-separated string. Example: "A, B, C"
Below is the field definition in the schema.xml file:
<field name="text" type="string" indexed="true" stored="true" required="false" multiValued="true" />
When I query Solr, the output is:
"text":["A, B, C"]
Could someone explain how I can get the result below?
"text":["A","B","C"]
1 solution
To do it in your DataImportHandler definition (since you've already added the RegexTransformer):
<field column="text" name="text" splitBy=", " />
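For context, here is a sketch of how that field line might sit in the full entity from the question. Since splitBy is treated as a regular expression, this variant uses \s*,\s* instead of ", " so that values such as "A,B, C" are split as well. That pattern is my own assumption and is not part of the original suggestion. The target field must remain multiValued="true" in schema.xml.

```xml
<document name="Example">
  <entity dataSource="example_table" name="Example"
          query="select id, text from example_table"
          pk="id"
          transformer="RegexTransformer">
    <field column="id" name="id" />
    <!-- splitBy is a regex; \s*,\s* (an assumed variant of ", ") also
         accepts commas with no space or with extra spaces around them -->
    <field column="text" name="text" splitBy="\s*,\s*" />
  </entity>
</document>
```

After changing the configuration, a full re-import is needed before existing documents show the split values.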
Or do it in your field definition by using a TextField with a Regular Expression Pattern Tokenizer:
<analyzer>
  <tokenizer class="solr.PatternTokenizerFactory" pattern=","/>
  <filter class="solr.TrimFilterFactory"/>
</analyzer>
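An analyzer has to live inside a fieldType, and the string type used in the question does not run analysis, so the field would need to point at a TextField-based type. A minimal sketch of what that could look like in schema.xml follows. The type name text_comma_split is hypothetical, chosen only for this example. One caveat: as far as I know, analysis only affects the indexed terms used for searching, so the stored value returned in query results would likely still be the original "A, B, C" string. If the goal is the "text":["A","B","C"] response shown in the question, the splitBy approach is probably the better fit.

```xml
<!-- "text_comma_split" is a hypothetical type name used only in this sketch -->
<fieldType name="text_comma_split" class="solr.TextField">
  <analyzer>
    <!-- split the input on commas -->
    <tokenizer class="solr.PatternTokenizerFactory" pattern=","/>
    <!-- strip leading and trailing whitespace from each token -->
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

<!-- same field as in the question, pointed at the new type -->
<field name="text" type="text_comma_split" indexed="true" stored="true" required="false" multiValued="true" />
```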