What is a fulltext index, and when should I use it?

Time: 2022-04-05 17:21:54

As the title states, what is a fulltext index and when should I use it?

2 answers

#1


16  

In databases, indices are usually used to speed up lookups on the columns named in your WHERE clause. However, when it comes to filtering text, e.g. with something like WHERE TextColumn LIKE '%searchstring%', searches are slow, because regular database indices are optimized for matching the whole content of a column (or a prefix of it) and not an arbitrary part of it. Specifically, a LIKE search with a leading wildcard cannot make use of a regular (B-tree) index, so the database has to scan every row.
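
To make the difference concrete, here is a minimal Python sketch, not how any database actually implements it. It assumes a regular index behaves roughly like a sorted list: a prefix lookup can binary-search to the right spot, while a substring match has to inspect every value. The function names and sample titles are made up for illustration.

```python
from bisect import bisect_left

def prefix_lookup(sorted_values, prefix):
    """Binary-search a sorted list for the values that start with `prefix`."""
    start = bisect_left(sorted_values, prefix)
    matches = []
    for value in sorted_values[start:]:
        if not value.startswith(prefix):
            break  # sorted order: no later value can match either
        matches.append(value)
    return matches

def substring_scan(values, needle):
    """Sorted order does not help here: every value has to be inspected."""
    return [value for value in values if needle in value]

# Hypothetical column contents, kept sorted like an ordered index.
titles = sorted(["database tuning", "full text search", "index basics", "text mining"])

print(prefix_lookup(titles, "text"))   # like LIKE 'text%'
print(substring_scan(titles, "text"))  # like LIKE '%text%'
```

The first call only touches the tail of the list, the second one walks all of it, which is the full scan you want to avoid on a large table.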

As mentioned in the comment below, MySQL needs the MATCH() ... AGAINST() syntax to search within a fulltext index. This varies by database vendor: in MS SQL Server you can use CONTAINS, so keep this in mind if you plan to support other databases too.

Fulltext indices work better for regular text because they are optimized for this type of column. Very simplified: they split the text into words and build an index over the words rather than over the whole text. This makes text searches for specific words a lot faster.
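
The "index over the words" idea can be sketched in a few lines of Python. This is a toy inverted index under simplifying assumptions (lowercasing, splitting on non-word characters, no stopwords or stemming), not what MySQL does internally; the names `build_inverted_index` and `search_word` and the sample rows are invented for the example.

```python
import re
from collections import defaultdict

def build_inverted_index(documents):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index

def search_word(index, word):
    """Look the word up directly instead of scanning every document."""
    return sorted(index.get(word.lower(), set()))

# Hypothetical rows: id -> text column.
docs = {
    1: "MySQL supports fulltext indexes",
    2: "A regular index matches the whole column value",
    3: "Fulltext search looks for words inside the text",
}
index = build_inverted_index(docs)

print(search_word(index, "fulltext"))  # [1, 3]
print(search_word(index, "index"))     # [2]
```

Note that "index" does not find row 1, which only contains "indexes": this sketch matches whole words exactly.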

#2


11  

A full text index is an index you apply in a MySQL database to text fields that you plan to run a full text search on. A full text search uses the MATCH(field) AGAINST('text') syntax. If you want to run a full text search, you must have a full text index on the columns you'll be running it against.

There are three types of Full Text searches. I'll quote the manual, because I think it says it best:

  • A boolean search interprets the search string using the rules of a special query language. The string contains the words to search for. It can also contain operators that specify requirements such that a word must be present or absent in matching rows, or that it should be weighted higher or lower than usual. Common words such as “some” or “then” are stopwords and do not match if present in the search string. The IN BOOLEAN MODE modifier specifies a boolean search. For more information, see Section 11.9.2, “Boolean Full-Text Searches”.

  • A natural language search interprets the search string as a phrase in natural human language (a phrase in free text). There are no special operators. The stopword list applies. In addition, words that are present in 50% or more of the rows are considered common and do not match. Full-text searches are natural language searches if no modifier is given.

  • A query expansion search is a modification of a natural language search. The search string is used to perform a natural language search. Then words from the most relevant rows returned by the search are added to the search string and the search is done again. The query returns the rows from the second search. The WITH QUERY EXPANSION modifier specifies a query expansion search. For more information, see Section 11.9.3, “Full-Text Searches with Query Expansion”.
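
To give a feel for the boolean mode described above, here is a rough Python sketch of how "must be present" and "must be absent" operators could be evaluated. It is a toy under stated assumptions: it only handles a leading `+` (required) or `-` (excluded) on each term, treats bare words as optional, and uses a made-up score. Stopwords, the 50% threshold, and MySQL's real relevance ranking are not modeled, and `boolean_search` is a hypothetical helper, not a MySQL API.

```python
import re

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"\w+", text.lower()))

def boolean_search(rows, query):
    """Toy evaluation of '+word' (required), '-word' (excluded), 'word' (optional)."""
    required, excluded, optional = set(), set(), set()
    for term in query.lower().split():
        if term.startswith("+"):
            required.add(term[1:])
        elif term.startswith("-"):
            excluded.add(term[1:])
        else:
            optional.add(term)

    results = []
    for row_id, text in rows.items():
        words = tokenize(text)
        # Drop rows missing a required word or containing an excluded one.
        if not required <= words or excluded & words:
            continue
        # Without required words, at least one optional word must match.
        if not required and not (optional & words):
            continue
        # Made-up score: optional matches only push a row higher.
        score = len(required) + len(optional & words)
        results.append((score, row_id))
    # Highest score first, ties broken by row id.
    return [row_id for score, row_id in sorted(results, key=lambda r: (-r[0], r[1]))]

# Hypothetical rows: id -> text column.
rows = {
    1: "MySQL tutorial for beginners",
    2: "MySQL versus YourSQL comparison",
    3: "Tuning a database server",
    4: "Optimizing MySQL queries",
}

print(boolean_search(rows, "+mysql -yoursql tutorial"))  # [1, 4]
print(boolean_search(rows, "+mysql tutorial"))           # [1, 2, 4]
```

Row 1 comes first in both queries because it also contains the optional word "tutorial".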

For more information take a gander at the Full Text Search Reference Page.
