Performance of RegEx vs LIKE in MySQL queries

Time: 2022-02-02 23:34:57

Rumour has it that this:

SELECT * FROM lineage_string WHERE lineage LIKE '%179%' AND lineage REGEXP '(^|/)179(/|$)'

Would be faster than this:

SELECT * FROM lineage_string WHERE lineage REGEXP '(^|/)179(/|$)'

Can anyone confirm this, or suggest a decent way to test the speed of such queries? Thanks.

2 answers

#1


22  

It could be faster, because the LIKE condition can be evaluated more quickly than the regular expression: if most rows fail the LIKE test, the regex never has to run for them. However, it will be slower if most rows pass, because two tests must then be run for each matching row instead of just one. It also depends on which expression the optimizer chooses to evaluate first.
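The trade-off can be illustrated outside the database. The following is only a rough Python sketch of the idea, not how MySQL evaluates the query: a cheap substring test stands in for LIKE '%179%', Python's `re` stands in for REGEXP, and the sample rows are made up.

```python
import re

# Same pattern as in the question: 179 as a whole path segment.
PATTERN = re.compile(r"(^|/)179(/|$)")

def match_regex_only(rows):
    """Run the regex against every row."""
    return [r for r in rows if PATTERN.search(r)]

def match_with_prefilter(rows):
    """Cheap substring test first; `and` short-circuits, so the regex
    only runs on rows that contain "179" somewhere."""
    return [r for r in rows if "179" in r and PATTERN.search(r)]

def regex_calls_with_prefilter(rows):
    """How many rows survive the cheap test and still need the regex."""
    return sum(1 for r in rows if "179" in r)

# Made-up sample data (an assumption, not from the question).
rows = ["1/179/20", "179", "5/1790/3", "4/8/15", "16/23/42", "7/179"]
```

Both functions return the same rows, but with the prefilter the regex runs on only four of the six sample rows. If nearly every row contained "179", the prefilter would be pure overhead, which is the case the answer warns about.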

An even bigger speedup can be witnessed if you have something like this:

SELECT * FROM (
   SELECT * FROM lineage_string
   WHERE lineage LIKE '179%'
) AS candidates WHERE lineage REGEXP '^179(/|$)'

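Now an index on lineage can be used to find likely rows, because LIKE '179%' is sargable. Many rows won't need to be checked at all. Note that this pattern only matches lineages that begin with 179, unlike the pattern in the question.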
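To see why a prefix pattern is sargable, it may help to mimic an index with a sorted list. This is only an illustrative Python sketch, assuming plain code-point ordering and a non-empty prefix; the helper names and sample values are hypothetical, and a real MySQL index depends on the column's collation.

```python
import bisect
import re

ROOT_179 = re.compile(r"^179(/|$)")

def prefix_range(sorted_values, prefix):
    """Return (lo, hi) so that sorted_values[lo:hi] are exactly the
    entries starting with `prefix`. Assumes a non-empty prefix."""
    lo = bisect.bisect_left(sorted_values, prefix)
    # Smallest string that sorts after every string with this prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect.bisect_left(sorted_values, upper)
    return lo, hi

def find_root_179(sorted_values):
    """Binary-search the prefix range, then regex-check only that slice."""
    lo, hi = prefix_range(sorted_values, "179")
    return [v for v in sorted_values[lo:hi] if ROOT_179.search(v)]

# Made-up sample data, sorted the way an index would keep it.
lineages = sorted(["1/179", "179", "179/4", "1790/2", "18/179", "2/3", "17/9"])
```

Here the two binary searches narrow seven entries to three candidates, and the regex then rejects "1790/2". A pattern with a leading wildcard such as '%179%' gives no such range to search, so every entry has to be examined.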

As always the best way to be sure is to measure it for yourself on your actual data.
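One possible way to do the measuring from a client script is a small timing helper like the one below. It is a generic sketch: `median_seconds` is a hypothetical name, and `run_query` is any callable you supply that executes the statement and fetches all rows through whatever database driver you use.

```python
import statistics
import time

def median_seconds(run_query, repeats=5):
    """Call run_query() `repeats` times and return the median wall-clock
    duration in seconds. The median dampens one-off outliers such as a
    cold cache on the first run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)

# Usage sketch (assumes an open DB-API cursor named `cursor`):
#
#   def with_like():
#       cursor.execute(QUERY_WITH_LIKE)
#       cursor.fetchall()
#
#   print(median_seconds(with_like))
```

Timing from the client includes network and fetch overhead, so compare both queries under the same conditions and on realistic data volumes.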

#2


13  

Yeah, it probably would be a tiny bit faster because standard-SQL LIKE is a simpler comparison operation than a full-on regex parser.

However, in real terms both are really slow, because neither can use indices. (LIKE can use an index if the match string doesn't start with a wildcard, but that's not the case here.)

If you are concerned about speed, you should change your schema so that you can put the 179 directly in a column and index it, rather than having to check through a string manually on every row.
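The answer does not spell out a schema, so the following is just one way it might look: one row per (lineage, node) pair with an index on the node. The sketch uses Python's built-in sqlite3 as a stand-in for MySQL, and the table, column, and index names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lineage_node (
        lineage_id INTEGER NOT NULL,
        node_id    INTEGER NOT NULL
    );
    CREATE INDEX idx_lineage_node_node ON lineage_node (node_id);
""")

def add_lineage(lineage_id, lineage):
    """Split a '1/179/20' style string into one row per node."""
    nodes = [(lineage_id, int(part)) for part in lineage.split("/")]
    conn.executemany(
        "INSERT INTO lineage_node (lineage_id, node_id) VALUES (?, ?)", nodes
    )

# Made-up sample lineages, numbered from 1.
for lineage_id, lineage in enumerate(["1/179/20", "5/1790/3", "179", "4/8/15"], start=1):
    add_lineage(lineage_id, lineage)

def lineages_containing(node_id):
    """Indexed equality lookup instead of scanning strings."""
    rows = conn.execute(
        "SELECT DISTINCT lineage_id FROM lineage_node "
        "WHERE node_id = ? ORDER BY lineage_id",
        (node_id,),
    )
    return [row[0] for row in rows]
```

With this layout, finding every lineage that contains 179 becomes an equality lookup on an indexed integer column, and 1790 can no longer be confused with 179.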
