My code doesn't seem to be working the way it's supposed to:
import re
x = "engniu4nwi5u"
print(re.sub(r"\D(\d)\D", r"\1abc", x))
My desired output is: engniuabcnwiabcu
But the actual output is: engni4abcw5abc
4 solutions
#1
You are grouping the wrong characters. It should be written as:
>>> x = "engniu4nwi5u"
>>> re.sub(r"(\D)\d(\D)", r"\1abc\2", x)
'engniuabcnwiabcu'
- (\D) matches a non-digit and captures it in \1
- \d matches the digit
- (\D) matches the following non-digit and captures it in \2
How does it match?
engniu4nwi5u
     |
     \D => \1
engniu4nwi5u
      |
      \d
engniu4nwi5u
       |
       \D => \2
Another solution
You can also use lookarounds to do the same thing:
>>> x = "engniu4nwi5u"
>>> re.sub(r"(?<=\D)\d(?=\D)", r"abc", x)
'engniuabcnwiabcu'
- (?<=\D) is a lookbehind assertion. It checks that the digit is preceded by a non-digit, without capturing or consuming it.
- \d matches the digit.
- (?=\D) is a lookahead assertion. It checks that the digit is followed by a non-digit, again without capturing or consuming it.
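One practical difference between the two patterns shows up when two digits share a single neighbouring character. The following is a minimal sketch, assuming a made-up input "a1b2c" that is not from the question: the capturing pattern consumes the non-digit after the first digit, so that character is no longer available in front of the next digit, whereas the lookarounds consume nothing but the digit.

```python
import re

# Hypothetical input, chosen so that two digits share one neighbouring letter.
sample = "a1b2c"

# Capturing version: the match "a1b" consumes "b", so "2" has no unconsumed
# non-digit left in front of it and is skipped.
with_groups = re.sub(r"(\D)\d(\D)", r"\1abc\2", sample)

# Lookaround version: only the digit itself is consumed, so both digits match.
with_lookarounds = re.sub(r"(?<=\D)\d(?=\D)", "abc", sample)

print(with_groups)       # aabcb2c
print(with_lookarounds)  # aabcbabcc
```

For the string in the question the two versions agree, because its digits are separated by more than one character.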
#2
This is because you replaced the wrong part.
Consider the first match. \D\d\D matches the following:
engniu4nwi5u
     ^^^
4 is captured as \1. You then replace the whole match, u4n, with \1abc, which becomes 4abc.
You have a couple of solutions here:
- Capture what you want to keep: (\D)\d(\D) and replace it with \1abc\2
- Use lookarounds: (?<=\D)\d(?=\D) and replace it with abc
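To see this for yourself, it can help to print what each match actually covers. A small sketch using re.finditer on the string from the question (the variable names matches and result are only for illustration):

```python
import re

x = "engniu4nwi5u"

# For each match of the original pattern, collect the full matched text
# (group 0, which re.sub replaces) and the captured digit (group 1).
matches = [(m.group(0), m.group(1)) for m in re.finditer(r"\D(\d)\D", x)]
print(matches)  # [('u4n', '4'), ('i5u', '5')]

# Replacing each full match with "\1abc" keeps the digit and drops
# the letters on both sides of it.
result = re.sub(r"\D(\d)\D", r"\1abc", x)
print(result)  # engni4abcw5abc
```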
#3
Based on your regexp:
>>> re.sub(r"(\D)\d", r"\1abc", x)
'engniuabcnwiabcu'
Although I would do this instead:
>>> re.sub(r"\d", "abc", x)
'engniuabcnwiabcu'
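These two patterns give the same result for the string in the question, but they are not equivalent in general. A rough sketch, assuming a made-up input "7ab12c" with a leading digit and two adjacent digits (not from the question):

```python
import re

# Hypothetical input: a digit at the start and two digits in a row.
sample = "7ab12c"

# (\D)\d needs a non-digit right before the digit, so the leading "7" and
# the "2" (which follows another digit) are left alone.
after_non_digit = re.sub(r"(\D)\d", r"\1abc", sample)

# \d on its own replaces every digit, wherever it appears.
every_digit = re.sub(r"\d", "abc", sample)

print(after_non_digit)  # 7ababc2c
print(every_digit)      # abcababcabcc
```

Which one is right depends on whether digits at the edges of the string, or next to other digits, should be replaced too.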
#4
If you also want to match digits at the beginning and end of the string, you need to add ^ and $ to the regex:
(\D|^)\d(?=$|\D)
And replace with \1abc.
Sample code from the IDEONE demo:
import re
p = re.compile(r'(\D|^)\d(?=$|\D)')
test_str = "1engniu4nwi5u"
subst = r"\1abc"  # raw string, so \1 is a group reference and not the character \x01
print(re.sub(p, subst, test_str))
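As an alternative that is not part of this answer, the same edge cases can probably be handled with negative lookarounds, which say "not next to another digit" instead of "next to a non-digit or a string boundary". A minimal sketch comparing the two on the demo string (the names anchored and negative are only for illustration):

```python
import re

test_str = "1engniu4nwi5u"

# The pattern from the answer: a non-digit or start of string before the
# digit, and end of string or a non-digit after it.
anchored = re.sub(r"(\D|^)\d(?=$|\D)", r"\1abc", test_str)

# Swapped-in technique: negative lookbehind and lookahead. A digit at either
# end of the string passes because there is simply no digit next to it.
negative = re.sub(r"(?<!\d)\d(?!\d)", "abc", test_str)

print(anchored)  # abcengniuabcnwiabcu
print(negative)  # abcengniuabcnwiabcu
```

Both versions leave runs of two or more digits untouched, and the negative form needs no capture group in the replacement.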