What are the rules for the escape character \ in string literals? Is there a list of all the characters that are escaped?
In particular, when I use \ in a string literal in gedit and follow it with any three digits, it colors them differently.
I was trying to create a std::string constructed from a literal with the character 0 followed by the null character (\0), followed by the character 0. However, the syntax highlighting alerted me that this might instead create the character 0 followed by the null character (\00, aka \0), which is to say, only two characters.
For the solution to just this one problem, is this the best way to do it:
std::string ("0\0" "0", 3) // String concatenation
And is there some reference for what the escape character does in string literals in general? What is '\a', for instance?
5 Answers
#1
48
Control characters:
(Hex codes assume an ASCII-compatible character encoding.)
- \a = \x07 = alert (bell)
- \b = \x08 = backspace
- \t = \x09 = horizontal tab
- \n = \x0A = newline (or line feed)
- \v = \x0B = vertical tab
- \f = \x0C = form feed
- \r = \x0D = carriage return
- \e = \x1B = escape (non-standard GCC extension)
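The hex values in the list above can be checked by the compiler. This is only a minimal sketch, assuming an ASCII-compatible execution character set; the helper name `check_control_escapes` is made up for illustration, and \e is left out because it is a compiler extension.

```cpp
#include <cassert>

// Each simple escape should equal the hex escape listed above.
// Assumes an ASCII-compatible execution character set.
static_assert('\a' == '\x07', "alert (bell)");
static_assert('\b' == '\x08', "backspace");
static_assert('\t' == '\x09', "horizontal tab");
static_assert('\n' == '\x0A', "newline");
static_assert('\v' == '\x0B', "vertical tab");
static_assert('\f' == '\x0C', "form feed");
static_assert('\r' == '\x0D', "carriage return");

// Hypothetical helper: the same comparison repeated at run time.
bool check_control_escapes() {
    return '\n' == 0x0A && '\t' == 0x09 && '\r' == 0x0D;
}
```

If any of the static_assert lines fails to compile, the platform is not using an ASCII-compatible encoding.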
Punctuation characters:
- \" = quotation mark (backslash not required for '"')
- \' = apostrophe (backslash not required for "'")
- \? = question mark (used to avoid trigraphs)
- \\ = backslash
Numeric character references:
- \ + up to 3 octal digits
- \x + any number of hex digits
- \u + 4 hex digits (Unicode BMP, new in C++11)
- \U + 8 hex digits (Unicode astral planes, new in C++11)
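To make the digit-count rules for the octal and hex forms concrete, here is a small sketch. The variable names are my own, and the character values assume ASCII. An octal escape stops after three digits, while a hex escape keeps consuming hex digits, so the literal has to be split to end it.

```cpp
#include <cassert>
#include <cstddef>

// sizeof on a string literal counts every char, including the
// terminating null the compiler appends, hence the "- 1".

// "\1234" is the octal escape \123 followed by the digit '4'.
constexpr std::size_t octal_len = sizeof("\1234") - 1;

// "\x41" "BC" is 'A', 'B', 'C'. Written as one literal, \x41BC would
// be a single (out-of-range) hex escape, so the literal is split.
constexpr std::size_t hex_len = sizeof("\x41" "BC") - 1;

static_assert(octal_len == 2, "octal escapes take at most 3 digits");
static_assert(hex_len == 3, "splitting the literal ends the hex escape");
static_assert('\101' == '\x41', "both spell 'A' in ASCII");
```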
\0 = \00 = \000 = octal escape for null character
If you do want an actual digit character after a \0, then yes, I recommend string concatenation. Note that the whitespace between the parts of the literal is optional, so you can write "\0""0".
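The difference between the two spellings from the question can be seen in a short sketch like the following. The function name `make_zero_null_zero` is just illustrative.

```cpp
#include <cassert>
#include <string>

// "0\00": the octal escape absorbs the second zero, giving '0', '\0'.
static_assert(sizeof("0\00") - 1 == 2, "two characters");

// "0\0" "0": the break between literals ends the escape,
// giving '0', '\0', '0'.
static_assert(sizeof("0\0" "0") - 1 == 3, "three characters");

// Hypothetical helper: builds the three-character string.
// The explicit length keeps the embedded null and what follows it.
std::string make_zero_null_zero() {
    return std::string("0\0" "0", 3);
}
```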
#2
4
\0 will be interpreted as an octal escape sequence if it is followed by other digits, so \00 will be interpreted as a single character. (\0 is technically an octal escape sequence as well, at least in C).
The way you're doing it:
std::string ("0\0" "0", 3) // String concatenation
works because this version of the constructor takes a pointer to a char array together with an explicit length; if you just pass "0\0" "0" as a const char* without the length, it is treated as a C string and only the characters before the first null character are copied.
Here is a list of escape sequences.
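A quick sketch of the difference between the two constructors, with function names made up for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pointer only: the constructor looks for the first null character,
// so only the leading '0' is copied.
std::size_t length_without_count() {
    return std::string("0\0" "0").size();
}

// Pointer plus count: exactly 3 characters are copied,
// embedded null included.
std::size_t length_with_count() {
    return std::string("0\0" "0", 3).size();
}
```

Under these assumptions, the first function should return 1 and the second 3.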
#3
4
\a is the bell/alert character, which on some systems triggers a sound. \nnn represents an arbitrary ASCII character given in octal. \0 is the most common case: it represents the null character, provided it is not followed by further octal digits.
To answer your original question, you could escape your '0' characters as well:
std::string ("\060\000\060", 3);
(since an ASCII '0' is 60 in octal)
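As a sanity check, the fully escaped form should produce the same three characters as the concatenated form. This is a sketch that assumes ASCII; `escaped_digits` is a name I made up.

```cpp
#include <cassert>
#include <string>

// In ASCII the digit '0' has code 48 decimal, which is 60 octal.
static_assert('\060' == '0', "assumes an ASCII-compatible encoding");

// Every escape uses the full three octal digits, so none of them
// can absorb a following digit.
std::string escaped_digits() {
    return std::string("\060\000\060", 3);
}
```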
The MSDN documentation has a pretty detailed article on this, as does cppreference.
#4
1
I left something like this as a comment, but I feel it probably needs more visibility as none of the answers mention this method:
The method I now prefer for initializing a std::string with non-printing characters in general (and embedded null characters in particular) is to use the C++11 feature of initializer lists.
std::string const str({'\0', '6', '\a', 'H', '\t'});
I am not required to perform error-prone manual counting of the number of characters that I am using, so that if later on I want to insert a '\013' in the middle somewhere, I can and all of my code will still work. It also completely sidesteps any issues of using the wrong escape sequence by accident.
The only downside is all of those extra ' and , characters.
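To illustrate the "no manual counting" point, here is a small sketch (requires C++11; the function name is only for illustration). The size comes from the list itself, so adding or removing an element needs no other change.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the length is taken from the initializer list,
// so the embedded null is kept and nothing is counted by hand.
std::string make_from_list() {
    return std::string({'\0', '6', '\a', 'H', '\t'});
}
```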
#5
0
With the magic of user-defined literals, we have yet another solution to this. C++14 added a std::string literal operator.
using namespace std::string_literals;
auto const x = "\0" "0"s;
Constructs a string of length 2, with a '\0' character (null) followed by a '0' character (the digit zero). I am not sure if it is more or less clear than the initializer_list<char> constructor approach, but it at least gets rid of the ' and , characters.
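For comparison, a sketch of the literal operator next to the plain const char* constructor (requires C++14; both function names are made up). The s suffix sees the full length of the concatenated literal, while the plain constructor stops at the first null.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// With the s suffix the operator receives the literal's length,
// so the embedded null is kept.
std::string make_with_literal() {
    using namespace std::string_literals;
    return "\0" "0"s;
}

// Without the suffix the const char* constructor stops at the
// leading null, giving an empty string.
std::size_t plain_length() {
    return std::string("\0" "0").size();
}
```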