I'm trying to create a split() function in Lua with a delimiter of choice, where the default is a space. The default works fine. The problem starts when I pass a delimiter to the function: for some reason it doesn't return the last substring. The function:
function split(str,sep)
if sep == nil then
words = {}
for word in str:gmatch("%w+") do table.insert(words, word) end
return words
end
return {str:match((str:gsub("[^"..sep.."]*"..sep, "([^"..sep.."]*)"..sep)))} -- BUG!! doesnt return last value
end
I try to run this:
local str = "a,b,c,d,e,f,g"
local sep = ","
t = split(str,sep)
for i,j in ipairs(t) do
print(i,j)
end
and I get:
1 a
2 b
3 c
4 d
5 e
6 f
Can't figure out where the bug is...
3 Solutions
#1
5
When splitting strings, the easiest way to avoid corner cases is to append the delimiter to the string, provided you know the string cannot already end with the delimiter:
str = "a,b,c,d,e,f,g"
str = str .. ','
for w in str:gmatch("(.-),") do print(w) end
Alternatively, you can use a pattern with an optional delimiter:
str = "a,b,c,d,e,f,g"
for w in str:gmatch("([^,]+),?") do print(w) end
Actually, we don't need the optional delimiter since we're capturing non-delimiters:
str = "a,b,c,d,e,f,g"
for w in str:gmatch("([^,]+)") do print(w) end
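One difference between these variants is worth seeing in action: the append-the-delimiter form keeps empty fields, while the `[^,]+` form silently drops them. A minimal sketch, assuming a hypothetical input with an empty field in the middle:

```lua
-- Hypothetical sample input with an empty field between "b" and "d".
local str = "a,b,,d"

-- Variant 1: append the delimiter, then capture lazily up to each delimiter.
local with_empty = {}
for w in (str .. ","):gmatch("(.-),") do
  with_empty[#with_empty + 1] = w
end

-- Variant 2: capture runs of non-delimiter characters.
local without_empty = {}
for w in str:gmatch("([^,]+)") do
  without_empty[#without_empty + 1] = w
end

-- Variant 1 yields 4 fields (the third is ""), variant 2 yields 3.
print(#with_empty, #without_empty)
```

Which behaviour is right depends on whether empty fields carry meaning in your data.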
#2
0
Here's the split function I usually use for all my "splitting" needs:
function split(s, sep)
local fields = {}
local sep = sep or " "
local pattern = string.format("([^%s]+)", sep)
string.gsub(s, pattern, function(c) fields[#fields + 1] = c end)
return fields
end
t = split("a,b,c,d,e,f,g",",")
for i,j in pairs(t) do
print(i,j)
end
#3
0
"[^"..sep.."]*"..sep
This is what causes the problem. You are matching a run of characters that are not the separator, followed by the separator. However, the last substring you want to match (g) is not followed by the separator character.
The quickest way to fix this is to make sure the last substring is also followed by the separator, for example by appending sep to str before matching. Adding \0 to the character class ("[^"..sep.."\0]*"..sep) does not help: Lua patterns do not treat \0 as a marker for the beginning or end of the string, and the pattern still requires a separator after g.
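As a rough illustration, the question's function could be kept almost as-is with the separator appended first. This is only a sketch: it assumes `sep` is a single character with no special meaning in Lua patterns, and it inherits the original approach's limit of one capture per field (Lua's default build allows 32 captures per pattern).

```lua
-- Sketch: same gsub/match idea as the question, with the separator appended
-- so that the last field is also followed by one.
-- Assumes sep is a single, non-magic character and at most 32 fields.
local function split(str, sep)
  if sep == nil then
    local words = {}
    for word in str:gmatch("%w+") do table.insert(words, word) end
    return words
  end
  str = str .. sep
  local field = "([^" .. sep .. "]*)" .. sep
  -- Build one pattern with a capture per field (only gsub's first result is kept).
  local pattern = str:gsub("[^" .. sep .. "]*" .. sep, field)
  return { str:match(pattern) }
end

local t = split("a,b,c,d,e,f,g", ",")
for i, j in ipairs(t) do
  print(i, j)
end
```

With the appended separator the loop prints all seven fields, including g.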
I'd say your approach is overly complicated in general. First, you can just match individual substrings that do not contain the separator; second, you can do this in a for-loop using the gmatch function:
local result = {}
-- gmatch (not gsub) returns an iterator over successive matches
for field in your_string:gmatch(("[^%s]+"):format(your_separator)) do
  table.insert(result, field)
end
return result
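EDIT: The above code made a bit simpler:
-- the leading % escapes a punctuation separator; it does not work for
-- letters or digits, where e.g. %a would mean "any letter"
local pattern = "[^%" .. your_separator .. "]+"
for field in string.gmatch(your_string, pattern) do
-- ...and so on (The rest should be easy enough to understand)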
EDIT2: Keep in mind that you should also escape your separators. A separator like % could cause problems if you don't escape it as %%:
function escape(str)
  -- the extra parentheses drop gsub's second return value (the match count)
  return (str:gsub("([%^%$%(%)%%%.%[%]%*%+%-%?])", "%%%1"))
end
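Putting the pieces together, a split built on gmatch with an escaped separator might look like the sketch below. The `split` helper here is illustrative rather than the answer's exact code, and it assumes a single-character separator (a longer `sep` would be treated as a set of alternative separator characters). Like any `[^sep]+` pattern, it skips empty fields.

```lua
-- Escape every character that has a special meaning in a Lua pattern.
local function escape(str)
  return (str:gsub("([%^%$%(%)%%%.%[%]%*%+%-%?])", "%%%1"))
end

-- Hypothetical helper: split on a single-character separator (default: space).
local function split(str, sep)
  sep = escape(sep or " ")
  local result = {}
  for field in str:gmatch("[^" .. sep .. "]+") do
    result[#result + 1] = field
  end
  return result
end

-- "%" is a magic character, so it only works as a separator once escaped.
local t = split("50%25%10", "%")
print(#t, t[1], t[2], t[3])
```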