I found the following regex substitution example in the documentation for Regex. I'm a little confused: what does the prefix r before the string do?
>>> import re
>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
...        r'static PyObject*\npy_\1(void)\n{',
...        'def myfunc():')
'static PyObject*\npy_myfunc(void)\n{'
2 Answers
#1
10
Placing r or R before a string literal creates what is known as a raw string literal. Raw strings do not process escape sequences (\n, \b, etc.) and are thus commonly used for regex patterns, which often contain a lot of \ characters.
Below is a demonstration:
>>> print('\n') # Prints a newline character
>>> print(r'\n') # Escape sequence is not processed
\n
>>> print('\b') # Prints a backspace character
>>> print(r'\b') # Escape sequence is not processed
\b
>>>
The only other option would be to double every backslash:
>>> re.sub('def\\s+([a-zA-Z_][a-zA-Z_0-9]*)\\s*\\(\\s*\\):',
...        'static PyObject*\\npy_\\1(void)\\n{',
...        'def myfunc():')
which is just tedious.
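To make the equivalence concrete, here is a minimal sketch that compares the raw-string form with the doubled-backslash form. The variable names (raw_pattern, doubled_pattern, and so on) are only illustrative and do not come from the documentation.

```python
import re

# Raw-string form, as in the documentation example.
raw_pattern = r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):'
raw_repl = r'static PyObject*\npy_\1(void)\n{'

# The same two strings written with every backslash doubled.
doubled_pattern = 'def\\s+([a-zA-Z_][a-zA-Z_0-9]*)\\s*\\(\\s*\\):'
doubled_repl = 'static PyObject*\\npy_\\1(void)\\n{'

# Both spellings produce identical string objects,
# so re.sub cannot tell them apart.
same_pattern = raw_pattern == doubled_pattern
same_repl = raw_repl == doubled_repl

result = re.sub(raw_pattern, raw_repl, 'def myfunc():')

print(same_pattern, same_repl)
print(repr(result))
```

The r prefix only changes how the source code is read; the resulting string is an ordinary str. The \n and \1 in the replacement are then interpreted by re.sub itself, not by the Python parser.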
#2
2
The r means that the string is to be treated as a raw string, which means backslash escape sequences are not interpreted: each backslash is kept as a literal character.
The Python documentation states this precisely:
"String literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and use different rules for interpreting backslash escape sequences. "
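As a rough illustration of those "different rules", the sketch below compares a normal literal with a raw one. The variable names are hypothetical, chosen only for this example.

```python
# In a normal literal, \n is one character (a newline).
normal = '\n'

# In a raw literal, \n stays as two characters: a backslash and the letter n.
raw = r'\n'

# A backslash before a quote still keeps the literal from ending,
# but in a raw string the backslash itself remains part of the value.
quoted = r'\''

print(len(normal))
print(len(raw))
print(len(quoted))
```

One consequence of the last rule is that a raw string literal cannot end in a single backslash: r'\' is a syntax error, because the backslash keeps the closing quote from terminating the literal.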