How do I split a multi-line string into multiple lines?

Date: 2022-06-12 21:38:27

I have a multi-line string literal, and I want to perform an operation on each line, like so:

inputString = """Line 1
Line 2
Line 3"""

I want to do something like the following:

for line in inputString:
    doStuff()

5 answers

#1


291  

inputString.splitlines()

This gives you a list with one item per line; the splitlines() method is designed to split a string into a list of its lines.
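As a minimal sketch of the loop the question asks for, written for Python 3: do_stuff below is a hypothetical stand-in for the asker's doStuff(), and here it merely upper-cases each line.

```python
input_string = """Line 1
Line 2
Line 3"""


def do_stuff(line):
    # Hypothetical per-line operation, standing in for doStuff().
    return line.upper()


# splitlines() returns a list of the lines, without their line endings.
results = [do_stuff(line) for line in input_string.splitlines()]
print(results)
```

Note that iterating over the string directly, as in the question, yields individual characters rather than lines, which is why the splitlines() call is needed.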

#2


191  

Like the others said:

inputString.split('\n')  # --> ['Line 1', 'Line 2', 'Line 3']

The following is equivalent to the above, but the string module's functions are deprecated (and were removed in Python 3), so it should be avoided:

import string
string.split(inputString, '\n')  # --> ['Line 1', 'Line 2', 'Line 3']

Alternatively, if you want each line to keep its line-break sequence (CR, LF, or CRLF), call the splitlines method with a True argument:

inputString.splitlines(True)  # --> ['Line 1\n', 'Line 2\n', 'Line 3']
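A short sketch of the difference, assuming Python 3 (where the argument can also be passed by name as keepends). The sample string is made up for illustration and deliberately mixes LF and CRLF endings.

```python
text = "Line 1\nLine 2\r\nLine 3"

without_ends = text.splitlines()
with_ends = text.splitlines(keepends=True)

print(without_ends)
print(with_ends)

# Every line ending is preserved, so joining the pieces restores the input.
roundtrip = "".join(with_ends)
print(roundtrip == text)
```

Keeping the line endings is handy when the text has to be written back out unchanged after processing.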

#3


40  

The best way to do this is to simply use str.splitlines.

splitlines() handles newlines properly, unlike split("\n").

It also has the advantage mentioned by @efotinis of optionally including the newline character in the split result when called with a True argument.

Detailed explanation of why you shouldn't use split("\n"):

In Python, \n represents a Unix line break (ASCII decimal code 10), regardless of the platform you run it on. However, the line-break representation is platform-dependent. On Windows, a line break is two characters, CR and LF (ASCII decimal codes 13 and 10, AKA \r and \n), while on any modern Unix (including OS X), it's the single character LF.

print, for example, works correctly even if you have a string with line endings that don't match your platform:

>>> print(" a \n b \r\n c ")
 a
 b
 c

However, explicitly splitting on "\n" gives results that depend on which platform's line endings the text uses:

>>> " a \n b \r\n c ".split("\n")
[' a ', ' b \r', ' c ']

Even if you use os.linesep, it only splits on your own platform's newline separator, and fails if you're processing text created on other platforms or text containing a bare \n:

>>> import os
>>> " a \n b \r\n c ".split(os.linesep)  # on Windows, where os.linesep is '\r\n'
[' a \n b ', ' c ']

splitlines solves all these problems:

>>> " a \n b \r\n c ".splitlines()
[' a ', ' b ', ' c ']
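To make the comparison concrete, here is a small Python 3 sketch on a made-up string that mixes LF, CRLF, and bare CR endings and also ends with a newline. The variable names are only illustrative.

```python
mixed = "a\nb\r\nc\rd\n"

by_split = mixed.split("\n")
by_splitlines = mixed.splitlines()

# split("\n") leaves the stray "\r" in place, does not break on a bare "\r",
# and produces a trailing empty string because the text ends with "\n".
print(by_split)       # ['a', 'b\r', 'c\rd', '']

# splitlines() recognises all three endings and adds no trailing empty item.
print(by_splitlines)  # ['a', 'b', 'c', 'd']
```

The trailing empty string is a second practical difference: text read from a file usually ends with a newline, so split("\n") tends to produce one extra, empty "line".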

Reading files in text mode partially mitigates the newline representation problem, as it translates between Python's \n and the platform's newline representation. However, in Python 2, text mode only makes a difference on Windows. On Unix systems, all files are effectively opened in binary mode, so using split('\n') on a Unix system with a Windows file leads to undesired behavior (a stray \r at the end of each line). Also, it's not unusual to process strings with potentially different newlines from other sources, such as a socket.
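The paragraph above describes Python 2. As a hedged sketch of how this looks in Python 3, where the built-in open() translates line endings in text mode on every platform unless newline="" is passed: the file contents and variable names below are made up for illustration.

```python
import os
import tempfile

# Write a file with Windows-style (CRLF) line endings, byte for byte.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"first\r\nsecond\r\n")
    path = tmp.name

try:
    # Default text mode (newline=None): line endings are translated to "\n".
    with open(path, "r", encoding="ascii") as f:
        translated = f.read()

    # newline="" switches translation off, so the "\r\n" pairs survive.
    with open(path, "r", encoding="ascii", newline="") as f:
        untranslated = f.read()
finally:
    os.remove(path)

print(repr(translated))
print(repr(untranslated))

# splitlines() gives the same lines either way.
lines = untranslated.splitlines()
print(lines)
```

Text that arrives from a socket or any other non-file source gets no such translation, so splitlines() remains the safer choice there.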

#4


18  

This might be overkill in this particular case, but another option is to use StringIO to create a file-like object:

import StringIO  # Python 2 module; in Python 3, use io.StringIO instead

for line in StringIO.StringIO(inputString):
    doStuff()
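For Python 3, the same idea can be sketched with io.StringIO, which replaces the Python 2 StringIO module. The sample string and the lines list are assumptions for illustration, not part of the original answer.

```python
import io

input_string = "Line 1\nLine 2\nLine 3"

lines = []
for line in io.StringIO(input_string):
    # Each yielded line keeps its trailing newline, as with a real file,
    # so it is stripped here before use.
    lines.append(line.rstrip("\n"))

print(lines)
```

This approach yields lines one at a time and lets the string be passed to any code that expects a file object.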

#5


1  

I wish comments had proper code formatting, because I think @1_CR's answer needs more upvotes, and I would like to augment it. Anyway, he led me to the following technique; it uses cStringIO if available (BUT NOTE: cStringIO and StringIO are not the same, because you cannot subclass cStringIO... it is a built-in... but for basic operations the syntax is identical, so you can do this):

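# Python 2 only: prefer the faster C implementation when it is available.
try:
    import cStringIO
    StringIO = cStringIO
except ImportError:
    import StringIO

for line in StringIO.StringIO(variable_with_multiline_string):
    # Print each line without its surrounding whitespace.
    print(line.strip())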
