For the Boyer-Moore algorithm to be worst-case linear, the computation of the mis-match table must be O(m). However, a naive implementation would loop through all suffixs O(m) and all positions in that that suffix could go and check for equality... which is O(m3)!
对于Boyer-Moore算法是最坏情况线性的,不匹配表的计算必须是O(m)。然而,一个天真的实现将循环遍历所有后缀O(m)和后缀可以去的所有位置并检查相等...即O(m3)!
Below is the naive implementation of table building algorithm. So this question becomes: How can I improve this algorithm's runtime to O(m)?
下面是表构建算法的简单实现。所以这个问题就变成了:如何将该算法的运行时间改进为O(m)?
def find(s, sub, no):
n = len(s)
m = len(sub)
for i in range(n, 0, -1):
if s[max(i-m, 0): i] == sub[max(0, m-i):] and \
(i-m < 1 or s[i-m-1] != no):
return n-i
return n
def table(s):
m = len(s)
b = [0]*m
for i in range(m):
b[i] = find(s, s[m-i:], s[m-i-1])
return b
print(table('anpanman'))
To put minds at rest, this isn't homework. I'll add revisions when anyone posts ideas for improvements.
为了让人们安心,这不是功课。当有人发表改进意见时,我会添加修订版。
1 个解决方案
#1
The code under "Preprocessing for the good-suffix heuristics" on this page builds the good-suffix table in O(n) time. It also explains how the code works.
此页面上的“预处理良好后缀启发式”下的代码在O(n)时间内构建了良好后缀表。它还解释了代码的工作原理。
#1
The code under "Preprocessing for the good-suffix heuristics" on this page builds the good-suffix table in O(n) time. It also explains how the code works.
此页面上的“预处理良好后缀启发式”下的代码在O(n)时间内构建了良好后缀表。它还解释了代码的工作原理。