How are \r and \n different? I think it has something to do with Unix vs. Windows vs. Mac, but I'm not sure exactly how they're different, and which to search for/match in regexes.


They're different characters. \r is carriage return, and \n is line feed.


On "old" printers, \r sent the print head back to the start of the line, and \n advanced the paper by one line. Both were therefore necessary to start printing on the next line.


Obviously that's somewhat irrelevant now, although depending on the console you may still be able to use \r to move to the start of the line and overwrite the existing text.
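
As a rough illustration of that overwrite trick, here is a small Python sketch. The function name `show_countdown` is made up for this example, and whether the line is really overwritten depends on the console that displays the output.

```python
import io
import sys


def show_countdown(start, stream=sys.stdout):
    """Write a countdown that keeps reusing the same console line."""
    for n in range(start, 0, -1):
        # "\r" returns to column 0 without advancing to the next line,
        # so this write lands on top of the previous number.
        # The number is padded to a fixed width so it covers the old one.
        stream.write(f"\r{n:>3}")
        stream.flush()
    # Finish with "\n" so later output starts on a fresh line.
    stream.write("\rdone\n")


# Capture the raw characters instead of relying on a real console.
buf = io.StringIO()
show_countdown(3, buf)
print(repr(buf.getvalue()))  # '\r  3\r  2\r  1\rdone\n'
```

Calling `show_countdown(3)` with no stream writes to the real console, where only the last text written to the line should remain visible.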


More importantly, Unix tends to use \n as a line separator; Windows tends to use \r\n as a line separator and Macs (up to OS 9) used to use \r as the line separator. (Mac OS X is Unix-y, so uses \n instead; there may be some compatibility situations where \r is used instead though.)
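
For the regex part of the question, one common approach is to match all three conventions in a single pattern. This is a minimal Python sketch, not the only way to do it; `NEWLINE` and `split_lines` are names invented for the example.

```python
import re

# Try "\r\n" first so a Windows line ending counts as ONE break,
# not as a "\r" break followed by a "\n" break.
NEWLINE = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Split text on Unix, Windows, or classic Mac line endings."""
    return NEWLINE.split(text)


mixed = "unix\nwindows\r\nold mac\rlast"
print(split_lines(mixed))  # ['unix', 'windows', 'old mac', 'last']
```

The order of the alternatives matters: with `\r|\n` alone, every `\r\n` would produce an extra empty line.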


For more information, see the Wikipedia newline article.


EDIT: This is language-sensitive. In C# and Java, for example, \n always means Unicode U+000A, which is defined as line feed. In C and C++ the water is somewhat muddier, as the meaning is platform-specific. See comments for details.




In C and C++, \n is a concept, \r is a character, and \r\n is (almost always) a portability bug.


Think of an old teletype. The print head is positioned on some line and in some column. When you send a printable character to the teletype, it prints the character at the current position and moves the head to the next column. (This is conceptually the same as a typewriter, except that typewriters typically moved the paper with respect to the print head.)


When you wanted to finish the current line and start on the next line, you had to do two separate steps:


  1. move the print head back to the beginning of the line, then
  2. move it down to the next line.

ASCII encodes these actions as two distinct control characters:


  • \x0D (CR) moves the print head back to the beginning of the line. (Unicode encodes this as U+000D CARRIAGE RETURN.)
  • \x0A (LF) moves the print head down to the next line. (Unicode encodes this as U+000A LINE FEED.)
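
To make the two code points concrete, here is a tiny Python check. Python is used only for illustration; the helper `code_point` is a made-up name.

```python
def code_point(ch):
    """Format a single character as a Unicode code point, e.g. U+000D."""
    return f"U+{ord(ch):04X}"


# The short escape, the hex escape, and the number all name the same character.
print(code_point("\r"), "\r" == "\x0d" == chr(13))  # U+000D True
print(code_point("\n"), "\n" == "\x0a" == chr(10))  # U+000A True

# A CR LF pair is two separate characters, not one.
print(len("\r\n"))  # 2
```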

In the days of teletypes and early technology printers, people actually took advantage of the fact that these were two separate operations. By sending a CR without following it by a LF, you could print over the line you already printed. This allowed effects like accents, bold type, and underlining. Some systems overprinted several times to prevent passwords from being visible in hardcopy. On early serial CRT terminals, CR was one of the ways to control the cursor position in order to update text already on the screen.


But most of the time, you actually just wanted to go to the next line. Rather than requiring the pair of control characters, some systems allowed just one or the other. For example:


  • Unix variants (including modern versions of Mac) use just a LF character to indicate a newline.
  • Old (pre-OSX) Macintosh files used just a CR character to indicate a newline.
  • VMS, CP/M, DOS, Windows, and many network protocols still expect both: CR LF.
  • Old IBM systems that used EBCDIC standardized on NL--a character that doesn't even exist in the ASCII character set. In Unicode, NL is U+0085 NEXT LINE, but the actual EBCDIC value is 0x15.
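
If you need to find out which of the ASCII-based conventions above a file uses, counting the sequences in the raw bytes is usually enough. The sketch below assumes an ASCII-compatible encoding (so not UTF-16 and not EBCDIC), and `count_line_endings` is a name invented for the example.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CR LF pairs, bare CRs, and bare LFs in raw bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        # Every CR LF pair contains one CR and one LF, so subtract the
        # pairs to get the characters that stand alone.
        "CR": data.count(b"\r") - crlf,
        "LF": data.count(b"\n") - crlf,
    }


sample = b"one\r\ntwo\r\nthree\n"
print(count_line_endings(sample))  # {'CRLF': 2, 'CR': 0, 'LF': 1}
```

A file with more than one non-zero count has mixed line endings, which is often the real cause of "my regex matches on one machine but not the other".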

Why did different systems choose different methods? Simply because there was no universal standard. Where your keyboard probably says "Enter", older keyboards used to say "Return", which was short for Carriage Return. In fact, on a serial terminal, pressing Return actually sends the CR character. If you were writing a text editor, it would be tempting to just use that character as it came in from the terminal. Perhaps that's why the older Macs used just CR.


Now that we have standards, there are more ways to represent line breaks. Although extremely rare in the wild, Unicode has new characters like:


  • U+2028 LINE SEPARATOR
  • U+2029 PARAGRAPH SEPARATOR
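
Whether these characters count as line breaks depends on the tool. As an illustration in Python (other languages and regex engines may behave differently), `str.splitlines()` recognizes them, while the `re` module's `^` and `$` only look at `\n`:

```python
import re

text = "first\u2028second\u2029third\x85fourth\nfifth"

# str.splitlines() treats U+2028, U+2029, and U+0085 as line boundaries.
print(text.splitlines())
# ['first', 'second', 'third', 'fourth', 'fifth']

# With re.MULTILINE, "^" and "$" only recognize "\n", so everything
# before the "\n" is one long "line" that \w+ cannot span.
print(re.findall(r"^\w+$", text, flags=re.MULTILINE))  # ['fifth']
```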

Even before Unicode came along, programmers wanted simple ways to represent some of the most useful control codes without worrying about the underlying character set. C has several escape sequences for representing control codes:


  • \a (for alert) which rings the teletype bell or makes the terminal beep
  • \f (for form feed) which moves to the beginning of the next page
  • \t (for tab) which moves the print head to the next horizontal tab position

(This list is intentionally incomplete.)


This mapping happens at compile-time--the compiler sees \a and puts whatever magic value is used to ring the bell.


Notice that most of these mnemonics have direct correlations to ASCII control codes. For example, \a would map to 0x07 BEL. A compiler could be written for a system that used something other than ASCII for the host character set (e.g., EBCDIC). Most of the control codes that had specific mnemonics could be mapped to control codes in other character sets.
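
To see the same mnemonics land on different values in different character sets, here is a rough Python sketch. Python is standing in for a C compiler here: its escapes are fixed to Unicode code points, and encoding them plays the role of the compiler's mapping. `cp037` is one EBCDIC code page that ships with Python; `ESCAPES` and `control_codes` are names made up for the example.

```python
# Escape spellings (as you would type them) mapped to the characters they denote.
ESCAPES = {"\\a": "\a", "\\f": "\f", "\\t": "\t", "\\r": "\r"}


def control_codes(encoding):
    """Return {escape spelling: byte value} in the given character set."""
    return {name: ch.encode(encoding)[0] for name, ch in ESCAPES.items()}


for enc in ("ascii", "cp037"):
    codes = control_codes(enc)
    print(enc, {name: f"0x{value:02X}" for name, value in codes.items()})
```

In the ASCII row, `\a` comes out as 0x07 and `\t` as 0x09. In the EBCDIC row some of the values differ, which is exactly why the source code uses the mnemonic instead of the number.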


Huzzah! Portability!


Well, almost. In C, I could write printf("\aHello, World!"); which rings the bell (or beeps) and outputs a message. But if I wanted to then print something on the next line, I'd still need to know what the host platform requires to move to the next line of output. CR LF? CR? LF? NL? Something else? So much for portability.


C has two modes for I/O: binary and text. In binary mode, whatever data is sent gets transmitted as-is. But in text mode, there's a run-time translation that converts a special character to whatever the host platform needs for a new line (and vice versa).


Great, so what's the special character?


Well, that's implementation dependent, too, but there's an implementation-independent way to specify it: \n. It's typically called the "newline character".


This is a subtle but important point: \n is mapped at compile time to an implementation-defined character value which (in text mode) is then mapped again at run time to the actual character (or sequence of characters) required by the underlying platform to move to the next line.
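
The run-time half of that mapping can be observed from Python, whose file objects make the same text/binary distinction. This is only a sketch of the idea, not C: the `newline` argument of `open` is used here to stand in for "the host platform's convention" so the result is the same on every machine, and `round_trip` is a name invented for the example.

```python
import os
import tempfile


def round_trip(text, platform_newline):
    """Write text in text mode, then read it back in binary and text mode."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # Text mode: each "\n" is translated on the way out.
        with open(path, "w", encoding="ascii", newline=platform_newline) as f:
            f.write(text)
        # Binary mode: the bytes come back exactly as stored.
        with open(path, "rb") as f:
            stored = f.read()
        # Text mode again: the stored sequence is translated back to "\n".
        with open(path, "r", encoding="ascii") as f:
            seen = f.read()
    finally:
        os.remove(path)
    return stored, seen


print(round_trip("one\ntwo\n", "\r\n"))  # (b'one\r\ntwo\r\n', 'one\ntwo\n')
print(round_trip("one\ntwo\n", "\n"))    # (b'one\ntwo\n', 'one\ntwo\n')
```

The program only ever sees `\n`; what ends up on disk is decided by the text layer.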


\n is different than all the other backslash literals because there are two mappings involved. This two-step mapping makes \n significantly different than even \r, which is simply a compile-time mapping to CR (or the most similar control code in whatever the underlying character set is).


This trips up many C and C++ programmers. If you were to poll 100 of them, at least 99 would tell you that \n means line feed. This is not entirely true. Most (perhaps all) C and C++ implementations use LF as the magic intermediate value for \n, but that's an implementation detail. It's feasible for a compiler to use a different value. In fact, if the host character set is not a superset of ASCII (e.g., if it's EBCDIC), then \n will almost certainly not be LF.


So, in C and C++:


  • \r is literally a carriage return.
  • \n is a magic value that gets translated (in text mode) at run-time to/from the host platform's newline semantics.
  • \r\n is almost always a portability bug. In text mode, this gets translated to CR followed by the platform's newline sequence--probably not what's intended. In binary mode, this gets translated to CR followed by some magic value that might not be LF--possibly not what's intended.
  • \x0A is the most portable way to indicate an ASCII LF, but you only want to do that in binary mode. Most text-mode implementations will treat that like \n.
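
The \r\n bug from the list above is easy to reproduce. The sketch below uses Python's text layer as a stand-in for a C text-mode stream on a CR LF platform; the `newline` argument is what simulates the platform, and `text_mode_bytes` is a hypothetical helper name.

```python
import io


def text_mode_bytes(text, platform_newline):
    """Return the bytes a text-mode stream emits for the given platform newline."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="ascii", newline=platform_newline)
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # leave the BytesIO alone when the wrapper goes away
    return data


# Writing "\r\n" by hand on a CR LF platform: the "\n" is expanded again.
print(text_mode_bytes("line\r\n", "\r\n"))  # b'line\r\r\n'

# Writing just "\n" lets the text layer produce the right sequence.
print(text_mode_bytes("line\n", "\r\n"))    # b'line\r\n'
print(text_mode_bytes("line\n", "\n"))      # b'line\n'
```

The stray extra CR in the first result is the kind of thing that shows up later as `^M` in editors or as blank lines in diffs.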



  • "\r" => Return
  • "\n" => Newline or Linefeed (semantics)
  • Unix-based systems use just a "\n" to end a line of text.
  • DOS uses "\r\n" to end a line of text.
  • Some other machines used just a "\r" (Commodore, Apple II, Mac OS prior to OS X, etc.).



In short, \r has ASCII value 13 (CR) and \n has ASCII value 10 (LF). Mac uses CR as its line delimiter (at least, it did before; I am not sure about modern Macs), *nix uses LF, and Windows uses both (CRLF).




\r moves the cursor back to the start of the current line, so whatever is written next replaces the text from there.

\n is for a new line.
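
The replacing behaviour of \r can be modelled without a real terminal. This is a toy Python model of a single display line, written only to show the idea; real terminals handle many more control codes, and `render_line` is a name made up for the example.

```python
def render_line(text):
    """Return what a simple one-line display shows after writing text."""
    cells = []   # characters currently visible on the line
    column = 0   # where the next character will be written
    for ch in text:
        if ch == "\r":
            column = 0          # back to the start; nothing is erased
        elif column < len(cells):
            cells[column] = ch  # overwrite what was there
            column += 1
        else:
            cells.append(ch)    # extend the line
            column += 1
    return "".join(cells)


print(render_line("hello\rj"))       # jello
print(render_line("loading\rdone"))  # doneing
```

The second call shows the usual catch: \r does not clear the line, so a shorter replacement leaves the tail of the old text behind.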




In addition to @Jon Skeet's answer:


Traditionally, Windows has used \r\n, Unix \n, and Mac \r; however, newer Macs use \n as they're Unix-based.




In C#, I found that \r\n is used as the line break in strings.




\r is Carriage Return; \n is New Line (Line Feed). What each one means depends on the OS. Read this article for more on the difference between '\n' and '\r\n' in C.




\r is used for carriage return (ASCII value 13). \n is used for new line (ASCII value 10).




