Python regex: replace a string if it is not surrounded by single quotes

I'm trying to write a regex that replaces a string only when it is not surrounded by single quotes. For example, I want to replace FOO with XXX in the following string:

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"

The desired output is:

output = "' FOO ' abc 123 ' def FOO ghi 345 ' XXX '' XXX ' lmno 678 FOO '"

My current regex is:

myregex = re.compile("(?<!')+( FOO )(?!')+", re.IGNORECASE)

I think I have to use look-around operators, but I don't understand how... regexes are too complicated for me :D

Can you help me?

2 Solutions

#1

Here's how it could be done:

import re

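def replace_FOO(m):
    # Group 1 is None when the quoted alternative matched: leave it unchanged.
    if m.group(1) is None:
        return m.group()

    # Otherwise the match is unquoted text, so replace FOO there.
    return m.group().replace("FOO", "XXX")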

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"

output = re.sub(r"'[^']*'|([^']*)", replace_FOO, string)

print(string)
print(output)

[EDIT]

The re.sub function will accept as a replacement either a string template or a function. If the replacement is a function, every time it finds a match it'll call the function, passing the match object, and then use the returned value (which must be a string) as the replacement string.

As for the pattern itself, as it searches, if there's a ' at the current position it'll match up to and including the next ', otherwise it'll match up to but excluding the next ' or the end of the string.

The replacement function will be called on each match and return the appropriate result.
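
To make the matching behaviour concrete, here is a small sketch that lists the chunks such a pattern produces on the sample string from the question. It assumes the same alternation written with `+` instead of a group, so it never matches an empty string; the names `text`, `pattern` and `chunks` are only for illustration.

```python
import re

# Sample string from the question.
text = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"

# Either a complete quoted run, or a run of characters containing no quote.
pattern = re.compile(r"'[^']*'|[^']+")

# With no capture groups, findall returns the full text of each match.
chunks = pattern.findall(text)

for chunk in chunks:
    kind = "quoted" if chunk.startswith("'") else "unquoted"
    print(f"{kind:>8}: {chunk!r}")
```

The chunks alternate between quoted and unquoted runs and join back into the original string, which is why a replacement function only has to decide, chunk by chunk, whether to touch the text.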

Actually, now I think about it, I don't need to use a group at all. I could do this instead:

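def replace_FOO(m):
    # A match that starts with a quote is a quoted run: leave it unchanged.
    if m.group().startswith("'"):
        return m.group()

    # Anything else is unquoted text, so do the replacement there.
    return m.group().replace("FOO", "XXX")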

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"

output = re.sub(r"'[^']*'|[^']+", replace_FOO, string)
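
If the search and replacement strings vary, the same idea can be wrapped in a helper. This is only a sketch: the function name `replace_outside_quotes` and its parameters are my own, not part of the `re` module, and it does a plain case-sensitive `str.replace` on each unquoted chunk.

```python
import re

def replace_outside_quotes(text, old, new):
    """Replace `old` with `new` only in the parts of `text` outside '...'."""
    def fix_chunk(m):
        chunk = m.group()
        # Quoted chunks start with a quote and are returned unchanged.
        if chunk.startswith("'"):
            return chunk
        return chunk.replace(old, new)

    return re.sub(r"'[^']*'|[^']+", fix_chunk, text)

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"
output = replace_outside_quotes(string, "FOO", "XXX")
print(output)
```

On the sample string this gives the desired output from the question. Note that text following an unmatched opening quote is treated as unquoted, because the quoted alternative needs a closing quote to match.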

#2

This is hard to do without variable-length lookbehind, which Python's built-in re module does not support (a lookbehind there must match a fixed-width string). Anyway, a simple solution is the following:

Use this regex: (?:[^'\s]\s*)(FOO)(?:\s*[^'\s])

The first capture group should return the right result.

If the quote is always followed by a single space, as in your example, you can use a fixed-length lookbehind: (?<=[^'\s]\ )FOO(?=\s*[^'\s]) which matches FOO only where the nearest non-space character on each side is not a quote.
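
A quick way to see what the lookbehind pattern selects is to run it against the sample string from the question. This is just a check, not a recommended solution; the variable names are mine.

```python
import re

string = "' FOO ' abc 123 ' def FOO ghi 345 ' FOO '' FOO ' lmno 678 FOO '"

# Fixed-width lookbehind: one non-quote, non-space character plus one space.
pattern = re.compile(r"(?<=[^'\s]\ )FOO(?=\s*[^'\s])")

result = pattern.sub("XXX", string)
print(result)
```

On this input the pattern rewrites only the FOO in `def FOO ghi`, which is not what the desired output in the question shows, so it is worth comparing the printed result with that output before relying on this approach.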
