I'm trying to write a regex to replace strings if not surrounded by single quotes. For example I want to replace FOO with XXX in the following string:
string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"
The desired output is:
output = "' FOO ' abc 123 ' def FOO ghi 345 ' XXX '' XXX ' lmno 678 FOO '"
My current regex is:
myregex = re.compile("(?<!')+( FOO )(?!')+", re.IGNORECASE)
I think I have to use look-around operators, but I don't understand how... regexes are too complicated for me :D
Can you help me?
2 solutions
#1
2
Here's how it could be done:
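import re

def replace_FOO(m):
    # Group 1 is set only for text outside quotes; None means a quoted section.
    if m.group(1) is None:
        return m.group()
    return m.group().replace("FOO", "XXX")

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"
# Match either a whole quoted section or a run of unquoted text.
output = re.sub(r"'[^']*'|([^']*)", replace_FOO, string)
print(string)
print(output)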
[EDIT]
The re.sub function accepts either a string template or a function as the replacement. If the replacement is a function, re.sub calls it every time it finds a match, passing the match object, and uses the returned value (which must be a string) as the replacement string.
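To make the two kinds of replacement concrete, here is a small sketch. The sample text and the `shout` helper are made up for illustration and are not part of the answer above.

```python
import re

text = "alpha beta gamma"

# String template: \1 refers to the first capture group of each match.
templated = re.sub(r"(\w+)", r"<\1>", text)

# Function: called once per match with the match object; must return a string.
def shout(m):
    return m.group().upper()

called = re.sub(r"\w+", shout, text)

print(templated)
print(called)
```

The function form is what makes the quoted/unquoted decision possible, since a plain template cannot choose a different replacement per match.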
As for the pattern itself: as it searches, if there's a ' at the current position it'll match up to and including the next '; otherwise it'll match up to but excluding the next ' or the end of the string.
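One way to see this is to list the pieces the pattern produces on the question's string. This is only an illustrative sketch: the `segments` list and the "quoted"/"unquoted" labels are names chosen here, and empty matches (which the `*` in the second alternative can produce at the end of the string) are skipped.

```python
import re

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"
pattern = re.compile(r"'[^']*'|([^']*)")

segments = []
for m in pattern.finditer(string):
    if m.group() == "":
        continue  # ignore empty matches, e.g. at the end of the string
    # Group 1 only participates when the second alternative matched.
    kind = "quoted" if m.group(1) is None else "unquoted"
    segments.append((kind, m.group()))
    print(kind.ljust(8), repr(m.group()))
```

Only the pieces labelled unquoted are candidates for the FOO to XXX replacement.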
The replacement function will be called on each match and return the appropriate result.
Actually, now that I think about it, I don't need to use a group at all. I could do this instead:
def replace_FOO(m):
    # A match that starts with ' is a quoted section: leave it unchanged.
    if m.group().startswith("'"):
        return m.group()
    return m.group().replace("FOO", "XXX")

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"
output = re.sub(r"'[^']*'|[^']+", replace_FOO, string)
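If the same trick is needed for other words, it could be wrapped in a small helper. The function name `replace_outside_quotes` and its parameters are hypothetical, and the sketch assumes single quotes are balanced and never escaped.

```python
import re

def replace_outside_quotes(text, old, new):
    """Replace `old` with `new` only in the parts of `text` outside '...' pairs."""
    def fix(m):
        segment = m.group()
        if segment.startswith("'"):
            return segment  # quoted section: leave untouched
        return segment.replace(old, new)
    # Either a complete quoted section or a run of characters with no quote.
    return re.sub(r"'[^']*'|[^']+", fix, text)

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"
expected = "' FOO ' abc 123 ' def FOO ghi 345 ' XXX '' XXX ' lmno 678 FOO '"
result = replace_outside_quotes(string, "FOO", "XXX")
print(result == expected)
```

Using str.replace inside the callback keeps the regex simple: the pattern only has to split the text into quoted and unquoted runs.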
#2
1
This is hard to do without variable-length lookbehind, which Python's built-in re module does not support (its lookbehind patterns must be fixed-width). Anyway, a simple solution is the following:
Use this regex: (?:[^'\s]\s*)(FOO)(?:\s*[^'\s])
The first capture group should return the right result.
In case this is always a quote with a single space after it, as in your example, you can use a fixed-length lookbehind: (?<=[^'\s]\ )FOO(?=\s*[^'\s]), which will match exactly the one you want.
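A quick way to check which occurrences these two patterns pick up is to run them on the question's string and print a little context around each hit. The `contexts` helper and variable names below are illustrative only. On this input both patterns appear to select only the FOO between def and ghi, which sits inside a quoted section, so it is worth comparing against the desired output before relying on them.

```python
import re

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"

grouped = re.compile(r"(?:[^'\s]\s*)(FOO)(?:\s*[^'\s])")
lookaround = re.compile(r"(?<=[^'\s]\ )FOO(?=\s*[^'\s])")

def contexts(starts, width=4):
    # Show each matched FOO with `width` characters on either side.
    return [string[s - width:s + 3 + width] for s in starts]

grouped_hits = contexts(m.start(1) for m in grouped.finditer(string))
lookaround_hits = contexts(m.start() for m in lookaround.finditer(string))

print(grouped_hits)
print(lookaround_hits)
```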