What does 'r' mean before a regex pattern?

Time: 2022-06-05 00:37:29

I found the following regex substitution example in the Python documentation for the re module. I'm a little confused: what does the r prefix before each string do?

>>> import re
>>> re.sub(r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):',
...        r'static PyObject*\npy_\1(void)\n{',
...        'def myfunc():')

2 solutions

#1


10  

Placing r or R before a string literal creates what is known as a raw string literal. Raw strings do not process escape sequences (\n, \b, etc.), so they are commonly used for regex patterns, which often contain many \ characters.
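
One way to check this without relying on printed output is to compare the lengths and contents of the two kinds of literal. This is only a small sketch; the variable names are made up for illustration.

```python
# Compare a normal string literal with its raw counterpart.
normal = '\n'   # escape sequence processed: a single newline character
raw = r'\n'     # escape sequence not processed: a backslash followed by 'n'

print(len(normal))    # 1
print(len(raw))       # 2
print(raw == '\\n')   # True: the raw literal equals the doubled-backslash form
print(list(raw))      # ['\\', 'n']
```

The prefix only affects how the source code is read. The resulting object is an ordinary str, so there is no separate "raw string" type at runtime.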

Below is a demonstration:

>>> print('\n') # Prints a newline character


>>> print(r'\n') # Escape sequence is not processed
\n
>>> print('\b') # Prints a backspace character

>>> print(r'\b') # Escape sequence is not processed
\b
>>>

The only other option would be to double every backslash:

>>> re.sub('def\\s+([a-zA-Z_][a-zA-Z_0-9]*)\\s*\\(\\s*\\):',
...        'static PyObject*\\npy_\\1(void)\\n{',
...        'def myfunc():')

which is just tedious.
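
If you want to convince yourself that the two spellings really are interchangeable, a quick sketch like the following should do it. It reuses the pattern and replacement from the question; the variable names are just illustrative.

```python
import re

# Raw-string spelling, as in the documentation example.
raw_pattern = r'def\s+([a-zA-Z_][a-zA-Z_0-9]*)\s*\(\s*\):'
raw_repl = r'static PyObject*\npy_\1(void)\n{'

# The same strings written with every backslash doubled.
escaped_pattern = 'def\\s+([a-zA-Z_][a-zA-Z_0-9]*)\\s*\\(\\s*\\):'
escaped_repl = 'static PyObject*\\npy_\\1(void)\\n{'

# Both spellings produce identical str objects...
same_literals = raw_pattern == escaped_pattern and raw_repl == escaped_repl
print(same_literals)  # True

# ...so re.sub behaves identically with either one.
result_raw = re.sub(raw_pattern, raw_repl, 'def myfunc():')
result_escaped = re.sub(escaped_pattern, escaped_repl, 'def myfunc():')
print(result_raw == result_escaped)  # True
print(repr(result_raw))
```

Note that the \n and \1 in the replacement reach re.sub as a literal backslash plus a character; it is the re module, not the Python parser, that then turns them into a newline and the first captured group.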

#2


2  

The r means that the string is treated as a raw string: backslash escape sequences are not interpreted, so each backslash stays in the string as a literal character.
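
Raw literals still have a couple of quirks around quotes and trailing backslashes. The sketch below shows the two I am fairly sure about; the names `quoted` and `path` are made up for the example.

```python
# A backslash before a quote keeps the quote from ending the literal,
# but the backslash itself stays in the resulting string.
quoted = r'\''
print(len(quoted))       # 2
print(quoted == "\\'")   # True

# A raw literal cannot end in an odd number of backslashes, so writing
# r'C:\dir\' is a syntax error. One workaround is to append the final
# backslash from a normal literal.
path = r'C:\dir' + '\\'
print(path)              # C:\dir\
```

So "ignored" is best read as "left in the string untouched" rather than "has no effect on parsing at all".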

The Python documentation puts it precisely:

"String literals may optionally be prefixed with a letter 'r' or 'R'; such strings are called raw strings and use different rules for interpreting backslash escape sequences. "
