How do I search for rows containing a substring?

Date: 2022-06-30 15:29:54

If I store an HTML TEXTAREA in my ODBC database each time the user submits a form, what's the SELECT statement to retrieve 1) all rows which contain a given sub-string 2) all rows which don't (and is the search case sensitive?)

Edit: if LIKE "%SUBSTRING%" is going to be slow, would it be better to get everything & sort it out in PHP?

2 answers

#1


31  

Well, you can always try WHERE textcolumn LIKE "%SUBSTRING%", but expect it to be slow on a large table: because the pattern starts with a wildcard, the database can't use an index on the column and has to examine every row.

It also depends on the field type: a textarea usually won't be saved as VARCHAR, but rather as (a kind of) TEXT field. If you are on MySQL and the column has a FULLTEXT index, you can use the MATCH ... AGAINST operator instead. Note that it matches whole words rather than arbitrary substrings.

To get the rows that don't match, simply put a NOT in front of the LIKE: WHERE textcolumn NOT LIKE "%SUBSTRING%".
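
A minimal sketch of both queries, for anyone who wants to try them out. It uses Python's built-in sqlite3 module as a stand-in for the ODBC database, and the table name submissions and the sample rows are made up for illustration; only the LIKE / NOT LIKE clauses come from the answer.

```python
import sqlite3

# In-memory SQLite database standing in for the real ODBC database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (id INTEGER PRIMARY KEY, textcolumn TEXT)")
conn.executemany(
    "INSERT INTO submissions (textcolumn) VALUES (?)",
    [("first SUBSTRING here",), ("nothing to see",), ("SUBSTRING at start",)],
)


def rows_containing(needle):
    """Return ids of rows whose text contains needle."""
    # The pattern is bound as a parameter rather than pasted into the SQL.
    # Note: % and _ inside needle would still act as wildcards.
    pattern = f"%{needle}%"
    cur = conn.execute(
        "SELECT id FROM submissions WHERE textcolumn LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in cur]


def rows_not_containing(needle):
    """Return ids of rows whose text does not contain needle."""
    pattern = f"%{needle}%"
    cur = conn.execute(
        "SELECT id FROM submissions WHERE textcolumn NOT LIKE ? ORDER BY id",
        (pattern,),
    )
    return [row[0] for row in cur]


matching = rows_containing("SUBSTRING")
non_matching = rows_not_containing("SUBSTRING")
```

One thing to keep in mind: a row whose textcolumn is NULL is returned by neither query, since both LIKE and NOT LIKE evaluate to NULL for it.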

Whether the search is case-sensitive or not depends on how you store the data, especially which COLLATION the column uses. With MySQL's default collations, the search will be case-insensitive.
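
If you don't want to rely on the collation, you can force the behaviour in the query itself. The sketch below again uses Python's sqlite3 as a stand-in database (an assumption, as are the table name and sample rows). Folding both sides with LOWER() is fairly portable; instr() is SQLite's own function and is used here only to show a case-sensitive match, so on MySQL you would reach for a case-sensitive or binary collation instead.

```python
import sqlite3

# In-memory SQLite database standing in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (id INTEGER PRIMARY KEY, textcolumn TEXT)")
conn.executemany(
    "INSERT INTO submissions (textcolumn) VALUES (?)",
    [("Hello World",), ("hello world",), ("goodbye",)],
)


def ids(sql, *params):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql, params)]


# Case-insensitive: lower-case both the column and the pattern before comparing.
insensitive = ids(
    "SELECT id FROM submissions "
    "WHERE LOWER(textcolumn) LIKE LOWER(?) ORDER BY id",
    "%HELLO%",
)

# Case-sensitive: SQLite's instr() does an exact comparison.
sensitive = ids(
    "SELECT id FROM submissions WHERE instr(textcolumn, ?) > 0 ORDER BY id",
    "Hello",
)
```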

Updated answer to reflect question update:

Doing a WHERE field LIKE "%value%" is slower than WHERE field LIKE "value%" if the column field has an index, since only the second form can use it, but it is still considerably faster than fetching all values and having your application filter them. The two scenarios:

1/ If you do SELECT field FROM table WHERE field LIKE "%value%", MySQL will scan the entire table, and only send the rows containing "value".

2/ If you do SELECT field FROM table and then have your application (in your case PHP) keep only the rows with "value" in them, MySQL will also scan the entire table, but send every row to PHP, which then has to do additional work. This is much slower than case #1.

Solution: Please do use the WHERE clause, and use EXPLAIN to see the performance.
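
The two scenarios can be sketched side by side. This is only an illustration under stated assumptions: Python and an in-memory SQLite database stand in for PHP and MySQL, and the table name submissions and the 100 generated rows are made up. It shows that both approaches return the same rows, but the second one has to pull every row out of the database first.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (id INTEGER PRIMARY KEY, field TEXT)")
# 100 sample rows; every tenth one contains the word "value".
conn.executemany(
    "INSERT INTO submissions (field) VALUES (?)",
    [(f"row {i} value" if i % 10 == 0 else f"row {i}",) for i in range(1, 101)],
)

# Scenario 1: the database filters, only matching rows reach the application.
db_side = conn.execute(
    "SELECT field FROM submissions WHERE field LIKE ? ORDER BY id",
    ("%value%",),
).fetchall()

# Scenario 2: the application fetches everything and filters afterwards.
all_rows = conn.execute("SELECT field FROM submissions ORDER BY id").fetchall()
app_side = [row for row in all_rows if "value" in row[0]]

rows_transferred_1 = len(db_side)   # rows sent to the application in scenario 1
rows_transferred_2 = len(all_rows)  # rows sent to the application in scenario 2
```

On a real server the difference is mostly the cost of shipping and then discarding the non-matching rows, which grows with the table.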

#2


5  

Info on MySQL's full-text search. In MySQL 5.0, the version documented at the link below, this is restricted to MyISAM tables, so it may not be suitable if you want to use a different table type (InnoDB gained FULLTEXT indexes in MySQL 5.6).

http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html

Even if WHERE textcolumn LIKE "%SUBSTRING%" is going to be slow, I think it is probably better to let the database handle it rather than have PHP handle it. If it is possible to restrict searches by some other criteria (date range, user, etc.), then you may find the substring search is OK (ish).
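
As a rough sketch of that idea, the query below narrows the rows by user and date range before the substring test is applied. Everything except the LIKE clause is hypothetical: the user_id and submitted_on columns, the index, and the sample data are assumptions, and Python's sqlite3 stands in for the real database.

```python
import sqlite3

# In-memory SQLite database standing in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE submissions ("
    "id INTEGER PRIMARY KEY, user_id INTEGER, submitted_on TEXT, textcolumn TEXT)"
)
# An index on the restricting columns lets the database skip most rows
# before it has to run the slow substring comparison.
conn.execute("CREATE INDEX idx_user_date ON submissions (user_id, submitted_on)")
conn.executemany(
    "INSERT INTO submissions (user_id, submitted_on, textcolumn) VALUES (?, ?, ?)",
    [
        (1, "2022-06-01", "contains SUBSTRING"),
        (1, "2022-07-15", "contains SUBSTRING too"),
        (2, "2022-06-10", "other user's SUBSTRING"),
        (1, "2022-06-20", "no match"),
    ],
)

# Dates are stored as ISO 8601 strings, so BETWEEN compares them correctly.
june_hits = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM submissions "
        "WHERE user_id = ? AND submitted_on BETWEEN ? AND ? "
        "AND textcolumn LIKE ? ORDER BY id",
        (1, "2022-06-01", "2022-06-30", "%SUBSTRING%"),
    )
]
```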

If you are searching for whole words, you could pull out all the individual words into a separate table and use that to restrict the substring search. (So when searching for "my search string", you look up the longest word, "search", and only do the substring search on records containing the word "search".)
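
Here is one way the word-table idea might look. It is a sketch under assumptions rather than a finished design: the submission_words table, the helper names add_submission and search, and the simple \w+ word splitting are all made up for illustration, and Python's sqlite3 stands in for the real database.

```python
import re
import sqlite3

# In-memory SQLite database standing in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE submissions (id INTEGER PRIMARY KEY, textcolumn TEXT)")
# One row per distinct word per submission; the primary key doubles as
# an index on word, so the word lookup does not need a table scan.
conn.execute(
    "CREATE TABLE submission_words ("
    "word TEXT, submission_id INTEGER, PRIMARY KEY (word, submission_id))"
)


def add_submission(text):
    """Store the text and index each distinct lower-cased word in it."""
    cur = conn.execute("INSERT INTO submissions (textcolumn) VALUES (?)", (text,))
    submission_id = cur.lastrowid
    words = set(re.findall(r"\w+", text.lower()))
    conn.executemany(
        "INSERT INTO submission_words (word, submission_id) VALUES (?, ?)",
        [(word, submission_id) for word in words],
    )
    return submission_id


def search(phrase):
    """Return ids of submissions containing phrase, pre-filtered by its longest word."""
    words = re.findall(r"\w+", phrase.lower())
    if not words:
        return []
    longest = max(words, key=len)  # on a tie, the first longest word wins
    cur = conn.execute(
        "SELECT s.id FROM submissions s "
        "JOIN submission_words w ON w.submission_id = s.id "
        "WHERE w.word = ? AND s.textcolumn LIKE ? ORDER BY s.id",
        (longest, f"%{phrase}%"),
    )
    return [row[0] for row in cur]


add_submission("this is my search string indeed")
add_submission("search for something else")
add_submission("no relevant words")
add_submission("my research string")
hits = search("my search string")
```

The exact-match lookup on word cuts the candidates down to rows 1 and 2 here, and the LIKE test then only has to run on those.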