Solr indexing: splitting a string field into a list

Date: 2022-12-27 04:15:35

In Solr, I want to index a string field as a list by splitting it.

Below is the indexing configuration from my data_config.xml file:

<document name="Example">
  <entity dataSource="example_table" name="Example"
      query="select id, text from example_table"
      pk="id"
      transformer="RegexTransformer">
    <field column="id" name="id" />
    <field column="text" name="text" />
  </entity>
</document>

The text field is a comma-separated string, for example: "A, B, C"

Below is the field definition in the schema.xml file:

<field name="text" type="string" indexed="true" stored="true" required="false" multiValued="true" />

When I query Solr, the output is:

"text":["A, B, C"]

Could someone explain how I can get the result below?

"text":["A","B","C"]

1 Solution

#1


To do it in your DataImportHandler definition (since you've already added the RegexTransformer):

<field column="text" name="text" splitBy=", " />
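
For context, here is a sketch of how the whole entity from the question might look with splitBy applied. It assumes splitBy is treated as a regular expression by RegexTransformer, so the pattern below (my substitution, not from the original answer) tolerates any amount of whitespace around the commas; the dataConfig wrapper is also an assumption about the rest of the file.

```xml
<dataConfig>
  <!-- dataSource definition omitted; assumed to exist elsewhere in this file -->
  <document name="Example">
    <entity dataSource="example_table" name="Example"
        query="select id, text from example_table"
        pk="id"
        transformer="RegexTransformer">
      <field column="id" name="id" />
      <!-- splitBy turns "A, B, C" into separate values for the multiValued field -->
      <field column="text" name="text" splitBy="\s*,\s*" />
    </entity>
  </document>
</dataConfig>
```

After changing the configuration, a full re-import is needed before existing documents show the split values.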

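Or do it in your field definition by using a TextField with a Regular Expression Pattern Tokenizer (this changes how the value is tokenized for searching; the stored value returned in results remains the original string):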

<analyzer>
  <tokenizer class="solr.PatternTokenizerFactory" pattern=","/>
  <filter class="solr.TrimFilterFactory"/>
</analyzer>
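
To show where that analyzer would live, here is a minimal schema.xml sketch. The field type name comma_separated is hypothetical, and the field keeps the attributes from the question except for its type, which has to change from string to a TextField-based type for the analyzer to apply.

```xml
<!-- Hypothetical field type: splits on commas, then trims surrounding whitespace -->
<fieldType name="comma_separated" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern=","/>
    <filter class="solr.TrimFilterFactory"/>
  </analyzer>
</fieldType>

<!-- The field from the question, now using the new type -->
<field name="text" type="comma_separated" indexed="true" stored="true" required="false" multiValued="true" />
```

With this approach a query such as text:B should match the document, but the response would still show the original unsplit string, so the splitBy option is the better fit if the goal is the list-shaped output.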
