正则表达式的最坏情况分析

时间:2021-07-28 03:47:10

Are there any tools that will take a particular regular expression and return the worst case scenario in terms of the number of operations required for a certain number of characters that the regular expression is matched against?

有什么工具可以使用特定的正则表达式,并根据正则表达式匹配的特定字符所需的操作数返回最坏的情况吗?

So for example, given a (f|a)oo.*[ ]baz, how many steps might the engine possibly go though through to match 100 characters?

例如,给定一个(f|a)oo。*[]baz,引擎有多少步骤可以匹配100个字符?

I would also be interested if there is a tool that can take a bunch of text samples and show the average operations for each run.

如果有一个工具可以提取一堆文本样本并显示每个运行的平均操作,我也会感兴趣。

I realize this will depend a lot on the engine used and the implementation -- but I am ignorant as to how common this is. So if it is common for many languages (making my question too vague) I would be particularly interested in Perl and Python.

我意识到这将很大程度上取决于所使用的引擎和实现——但我不知道这有多普遍。因此,如果对许多语言(使我的问题过于模糊)很常见,我将对Perl和Python特别感兴趣。

3 个解决方案

#1


22  

Regexbuddy's debugger shows how many steps engine would take to conclude match or not on a given sample. More information on catastrophic backtracking and debugging regular expressions.

Regexbuddy的调试器显示了在给定的示例上,要判断匹配与否,引擎需要执行多少步骤。更多关于灾难性回溯和调试正则表达式的信息。

正则表达式的最坏情况分析

PS: It is not free but they offer a 3-month money-back guarantee.

这不是免费的,但是他们提供3个月的退款保证。

#2


11  

Note that it depends on the engine. While regex theory is based on straight automata theory, most of the engines are not strict translations of those theories. For this reason, for instance, some engines incur in exponential time while strict NFA processing would not.

注意,这取决于引擎。虽然regex理论基于直自动机理论,但大多数引擎并不是这些理论的严格翻译。由于这个原因,例如,一些引擎在指数时间发生,而严格的NFA处理不会发生。

#3


7  

You might get what you're looking for something like using re.compile with re.DEBUG. See this excellent answer from the Python Hidden Features Community wiki for an extensive explanation.

您可能会得到您想要的东西,比如使用re.compile和re.DEBUG。请参阅Python Hidden Features Community wiki的这个优秀答案,以获得详细的解释。

#1


22  

Regexbuddy's debugger shows how many steps engine would take to conclude match or not on a given sample. More information on catastrophic backtracking and debugging regular expressions.

Regexbuddy的调试器显示了在给定的示例上,要判断匹配与否,引擎需要执行多少步骤。更多关于灾难性回溯和调试正则表达式的信息。

正则表达式的最坏情况分析

PS: It is not free but they offer a 3-month money-back guarantee.

这不是免费的,但是他们提供3个月的退款保证。

#2


11  

Note that it depends on the engine. While regex theory is based on straight automata theory, most of the engines are not strict translations of those theories. For this reason, for instance, some engines incur in exponential time while strict NFA processing would not.

注意,这取决于引擎。虽然regex理论基于直自动机理论,但大多数引擎并不是这些理论的严格翻译。由于这个原因,例如,一些引擎在指数时间发生,而严格的NFA处理不会发生。

#3


7  

You might get what you're looking for something like using re.compile with re.DEBUG. See this excellent answer from the Python Hidden Features Community wiki for an extensive explanation.

您可能会得到您想要的东西,比如使用re.compile和re.DEBUG。请参阅Python Hidden Features Community wiki的这个优秀答案,以获得详细的解释。