Rules for escape characters in C++ string literals

What are the rules for the escape character \ in string literals? Is there a list of all the characters that are escaped?

In particular, when I use \ in a string literal in gedit and follow it with any three digits, gedit colors them differently.

I was trying to create a std::string constructed from a literal with the character 0 followed by the null character (\0), followed by the character 0. However, the syntax highlighting alerted me that maybe this would create something like the character 0 followed by the null character (\00, aka \0), which is to say, only two characters.

For the solution to just this one problem, is this the best way to do it:

std::string ("0\0" "0", 3)  // String concatenation

And is there some reference for what the escape character does in string literals in general? What is '\a', for instance?

5 answers

#1

Control characters:

(Hex codes assume an ASCII-compatible character encoding.)

\a = \x07 = alert (bell)
\b = \x08 = backspace
\t = \x09 = horizontal tab
\n = \x0A = newline (or line feed)
\v = \x0B = vertical tab
\f = \x0C = form feed
\r = \x0D = carriage return
\e = \x1B = escape (non-standard GCC extension)
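
The hex equivalences in this list can be checked by the compiler. This is only a sketch, assuming an ASCII-compatible execution character set; the names control_escapes_match_ascii and check_control_escapes are made up for illustration, and \e is left out because it is a non-standard extension.

```cpp
#include <cassert>

// Hypothetical helper: true when each simple escape has the ASCII code
// listed above. Assumes an ASCII-compatible character encoding.
constexpr bool control_escapes_match_ascii() {
    return '\a' == '\x07' && '\b' == '\x08' && '\t' == '\x09' &&
           '\n' == '\x0A' && '\v' == '\x0B' && '\f' == '\x0C' &&
           '\r' == '\x0D';
}

// Compile-time check: the build fails on a non-ASCII platform.
static_assert(control_escapes_match_ascii(),
              "expected an ASCII-compatible encoding");

// Run-time variant; call it from main() or from a test.
inline void check_control_escapes() {
    assert(control_escapes_match_ascii());
}
```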

Punctuation characters:

\" = quotation mark (backslash not required for '"')
\' = apostrophe (backslash not required for "'")
\? = question mark (used to avoid trigraphs)
\\ = backslash

Numeric character references:

\ + up to 3 octal digits
\x + any number of hex digits
\u + 4 hex digits (Unicode BMP, new in C++11)
\U + 8 hex digits (Unicode astral planes, new in C++11)

\0 = \00 = \000 = octal escape for null character
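
One way to see how the compiler splits numeric escapes is to take sizeof of the literal, which counts every character plus the implicit terminating null. The sketch below is illustrative only; the constant names and check_numeric_escapes are invented for this example.

```cpp
#include <cassert>
#include <cstddef>

// sizeof on a string literal = number of characters + 1 for the terminator.
constexpr std::size_t octal_then_digit = sizeof("0\00");      // '0', '\00', terminator
constexpr std::size_t split_literal    = sizeof("0\0" "0");   // '0', '\0', '0', terminator
constexpr std::size_t four_digits      = sizeof("\1234");     // '\123', '4', terminator
constexpr std::size_t hex_then_digit   = sizeof("\x41" "1");  // '\x41', '1', terminator

static_assert(octal_then_digit == 3, "an octal escape absorbs the following 0");
static_assert(split_literal == 4, "splitting the literal ends the escape");
static_assert(four_digits == 3, "an octal escape stops after three digits");
static_assert(hex_then_digit == 3, "splitting the literal ends a hex escape too");

// Run-time variant of the same checks; call it from main() or from a test.
inline void check_numeric_escapes() {
    assert(octal_then_digit == 3);
    assert(split_literal == 4);
    assert(four_digits == 3);
    assert(hex_then_digit == 3);
}
```

Because a hex escape takes any number of hex digits, splitting the literal is the usual way to end one before a character that would otherwise be read as another digit.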

If you do want an actual digit character after a \0, then yes, I recommend string concatenation. Note that the whitespace between the parts of the literal is optional, so you can write "\0""0".
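
If counting the characters by hand is a concern, the length can be taken from the array type of the literal. A minimal sketch, assuming a helper named from_literal, which is hypothetical and not part of the standard library:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from a string literal, keeping
// embedded nulls and dropping only the implicit terminating null.
template <std::size_t N>
std::string from_literal(const char (&lit)[N]) {
    return std::string(lit, N - 1);
}

// The digit 0, a null character, then the digit 0 again.
inline std::string zero_null_zero() {
    return from_literal("0\0" "0");
}

// Call from main() or from a test.
inline void check_from_literal() {
    const std::string s = zero_null_zero();
    assert(s.size() == 3);
    assert(s[0] == '0' && s[1] == '\0' && s[2] == '0');
}
```

The helper only works on actual arrays such as string literals; a plain const char* carries no length information.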

#2

\0 will be interpreted as an octal escape sequence if it is followed by other digits, so \00 will be interpreted as a single character. (\0 is technically an octal escape sequence as well, at least in C).

The way you're doing it:

std::string ("0\0" "0", 3)  // String concatenation

works because this version of the constructor takes a pointer to a char array plus an explicit length; if you pass just "0\0" "0" as a const char*, the constructor treats it as a C string and copies only the characters before the first null character.
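
The difference between the two constructors can be seen from the resulting sizes. A small sketch; the function names are invented for illustration:

```cpp
#include <cassert>
#include <string>

// Pointer-only constructor: copying stops at the first null character.
inline std::string as_c_string() {
    return std::string("0\0" "0");
}

// Pointer-plus-length constructor: copies exactly 3 characters.
inline std::string with_length() {
    return std::string("0\0" "0", 3);
}

// Call from main() or from a test.
inline void check_constructors() {
    assert(as_c_string().size() == 1);   // only the leading '0'
    assert(with_length().size() == 3);   // '0', '\0', '0'
    assert(with_length()[1] == '\0');
    assert(with_length()[2] == '0');
}
```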

Here is a list of escape sequences.

#3

\a is the bell/alert character, which on some systems triggers a sound. \nnn represents an arbitrary character by its octal code. \0 is the same form with the value zero, so it always represents the null character.

To answer your original question, you could escape your '0' characters as well, as:

std::string ("\060\000\060", 3);

(since an ASCII '0' is 60 in octal)
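
As a quick check that the fully escaped form builds the same string as the split literal from the question, here is a sketch that assumes an ASCII-compatible encoding (so that '\060' is the digit 0); octal_form_matches is a made-up name:

```cpp
#include <cassert>
#include <string>

// Hypothetical check: compares the fully escaped form with the split form.
// std::string comparison looks at all 3 characters, embedded null included.
inline bool octal_form_matches() {
    const std::string escaped("\060\000\060", 3);
    const std::string split("0\0" "0", 3);
    return escaped.size() == 3 && escaped == split;
}

// Call from main() or from a test.
inline void check_octal_form() {
    assert(octal_form_matches());
}
```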

The MSDN documentation has a pretty detailed article on this, as does cppreference.

#4

I left something like this as a comment, but I feel it probably needs more visibility as none of the answers mention this method:

The method I now prefer for initializing a std::string with non-printing characters in general (and embedded null characters in particular) is to use the C++11 feature of initializer lists.

std::string const str({'\0', '6', '\a', 'H', '\t'});

I am not required to perform error-prone manual counting of the number of characters that I am using, so that if later on I want to insert a '\013' in the middle somewhere, I can and all of my code will still work. It also completely sidesteps any issues of using the wrong escape sequence by accident.

The only downside is all of those extra ' and , characters.
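
To illustrate the point about not counting by hand, the sketch below builds the string from the answer and a variant with '\013' inserted in the middle; the size follows the list automatically. The function names are hypothetical, and the comparison with '\v' assumes an ASCII-compatible encoding.

```cpp
#include <cassert>
#include <string>

// The string from the answer, built from an initializer list (C++11).
inline std::string original_list() {
    return {'\0', '6', '\a', 'H', '\t'};
}

// The same list with '\013' inserted; no length argument needs updating.
inline std::string list_with_extra() {
    return {'\0', '6', '\013', '\a', 'H', '\t'};
}

// Call from main() or from a test.
inline void check_initializer_lists() {
    assert(original_list().size() == 5);
    assert(list_with_extra().size() == 6);
    assert(list_with_extra()[2] == '\v');  // octal 013 is vertical tab in ASCII
}
```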

#5

With the magic of user-defined literals, we have yet another solution to this. C++14 added a std::string literal operator.

using namespace std::string_literals;
auto const x = "\0" "0"s;

Constructs a string of length 2, with a '\0' character (null) followed by a '0' character (the digit zero). I am not sure if it is more or less clear than the initializer_list<char> constructor approach, but it at least gets rid of the ' and , characters.
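
The length can be confirmed with a short sketch that needs C++14 or later; null_then_zero is a name invented for this example. The s suffix applies to the whole concatenated literal, so the embedded null is kept.

```cpp
#include <cassert>
#include <string>

// Builds the two-character string with the std::string literal operator.
inline std::string null_then_zero() {
    using namespace std::string_literals;
    return "\0" "0"s;
}

// Call from main() or from a test.
inline void check_string_literal() {
    const std::string x = null_then_zero();
    assert(x.size() == 2);
    assert(x[0] == '\0');
    assert(x[1] == '0');
}
```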
