I have the following string of anchors (where I want to change the contents of each href) and a Lua table of replacements, which maps each word to the word that should replace it:
s1 = '<a href="word1"></a><a href="word2"></a><a href="word3"></a><a href="word1"></a><a href="word5"></a><a href="word2"></a><a href="word3"><a href="word7"></a>'
replacementTable = {}
replacementTable["word1"] = "potato1"
replacementTable["word2"] = "potato2"
replacementTable["word3"] = "potato3"
replacementTable["word4"] = "potato4"
replacementTable["word5"] = "potato5"
The expected result should be:
<a href="potato1"></a><a href="potato2"></a><a href="potato3"></a><a href="potato1"></a><a href="potato5"></a><a href="potato2"></a><a href="potato3"><a href="word7"></a>
I know I could do this by iterating over each element in the replacementTable and processing the string each time, but my gut feeling is that if the string is very big and/or the replacement table grows, this approach is going to perform poorly.
So I thought it would be best to do the following: apply the regular expression to find all the matches, iterate over them, and replace each match with its value in the replacementTable.
Something like this would be great (written in JavaScript because I don't yet know how to write lambdas in Lua):
var newString = patternReplacement(s1, '<a[^>]* href="([^"]*)"', function(match) { return replacementTable[match] })
Here the first parameter is the string, the second is the regular expression, and the third is a function that is executed for each match to produce its replacement. This way, I think s1 gets scanned only once, which should be more efficient.
Is there any way to do this in Lua?
2 Solutions
#1
2
In your example, this simple code works:
print((s1:gsub("%w+",replacementTable)))
The point is that gsub already accepts a table of replacements.
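As a minimal sketch of how the table form behaves (using a shortened version of s1 and a smaller replacementTable, both chosen here only for illustration): each `%w+` match is looked up in the table, and a match with no entry is left unchanged. Because `%w+` also matches words outside the href, such as `a` and `href`, a function callback can be used instead when only the href value should be looked up:

```lua
-- Shortened sample data, assumed for illustration only.
local s1 = '<a href="word1"></a><a href="word2"></a><a href="word7"></a>'
local replacementTable = { word1 = "potato1", word2 = "potato2" }

-- Table form: every alphanumeric run is looked up in the table.
-- Misses ("a", "href", "word7") yield nil, so gsub keeps the original text.
local viaTable = s1:gsub("%w+", replacementTable)

-- Function form: only the text inside href="..." is looked up.
local viaFunction = s1:gsub('(href=")([^"]*)(")', function(prefix, value, suffix)
  return prefix .. (replacementTable[value] or value) .. suffix
end)

print(viaTable)
print(viaFunction)
```

Both calls scan the string once. The function form is the safer choice if a key in the table could also appear as ordinary text elsewhere in the string.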
#2
0
In the end, the solution that worked for me was the following one:
local updatedBody = string.gsub(body, '(<a[^>]* href=")(/[^"%?]*)([^"]*")', function(leftSide, url, rightSide)
  local replacedUrl = url
  if (urlsToReplace[url]) then replacedUrl = urlsToReplace[url] end
  return leftSide .. replacedUrl .. rightSide
end)
The pattern keeps any query string out of the captured URL, so the lookup uses only the path. I know it's a bad idea to parse HTML bodies with regular expressions, but in my case, where I needed a lot of performance, this ran a lot faster and just did the job.
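To illustrate how the three captures split a link, here is a small runnable sketch. The sample `body` and the contents of `urlsToReplace` are made up for this example; only the pattern and the callback come from the code above.

```lua
-- Hypothetical sample input and replacement map, for illustration only.
local body = '<a class="x" href="/old/page?ref=1">link</a><a href="/keep">k</a>'
local urlsToReplace = { ["/old/page"] = "/new/page" }

-- Capture 1: everything up to and including href="
-- Capture 2: the path, stopping before any "?" or closing quote
-- Capture 3: the optional query string plus the closing quote
local updatedBody, count = string.gsub(body, '(<a[^>]* href=")(/[^"%?]*)([^"]*")',
  function(leftSide, url, rightSide)
    local replacedUrl = urlsToReplace[url] or url -- keep the URL when there is no entry
    return leftSide .. replacedUrl .. rightSide
  end)

print(updatedBody)
print(count)
```

The query string `?ref=1` survives because it lands in the third capture, which is appended back untouched. Note that the pattern only matches hrefs that start with `/`, so absolute URLs are skipped.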