Splitting a string in Python with an unknown number of spaces as the separator

Time: 2021-03-14 21:43:37

I need a function similar to string.split(' '), but there may be more than one space, and a varying number of them, between the meaningful characters. Something like this:

s = ' 1234    Q-24 2010-11-29         563   abc  a6G47er15               '
ss = s.magicSplit()
print(ss)
# desired output: ['1234', 'Q-24', '2010-11-29', '563', 'abc', 'a6G47er15']

Can I somehow use regular expressions to catch those spaces in between?

Could someone help, please?

3 answers

#1


68  

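Try split() with no arguments. It splits on any run of whitespace and ignores leading and trailing whitespace: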

>>> ' 1234    Q-24 2010-11-29         563   abc  a6G47er15'.split()
['1234', 'Q-24', '2010-11-29', '563', 'abc', 'a6G47er15']

Or, if you want the magicSplit method from the question:

>>> class MagicString(str):
...     magicSplit = str.split
... 
>>> s = MagicString(' 1234    Q-24 2010-11-29         563   abc  a6G47er15')
>>> s.magicSplit()
['1234', 'Q-24', '2010-11-29', '563', 'abc', 'a6G47er15']
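
To illustrate why this works, and to address the regular-expression part of the question, here is a minimal sketch (not part of the original answer; the variable names are just for illustration). It assumes only the standard `re` module. split(' ') treats every single space as a separator and so produces empty strings for runs of spaces, while split() with no arguments handles runs of whitespace directly. A regex split on `\s+` should give the same result, provided the string is stripped first.

```python
import re

s = ' 1234    Q-24 2010-11-29         563   abc  a6G47er15               '

# split(' ') splits on every single space, so runs of spaces yield empty strings.
naive = s.split(' ')

# split() with no argument splits on runs of whitespace and
# ignores leading and trailing whitespace.
fields = s.split()

# Regex equivalent: strip first, otherwise re.split leaves
# empty strings at the start and end of the result.
regex_fields = re.split(r'\s+', s.strip())

print(fields)                  # ['1234', 'Q-24', '2010-11-29', '563', 'abc', 'a6G47er15']
print(fields == regex_fields)  # True
print('' in naive)             # True
```

For this particular input the regex buys nothing over split(), so the plain method is probably the simpler choice.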

#2


16  

s = ' 1234    Q-24 2010-11-29         563   abc  a6G47er15               '
ss = s.split()
print(ss)
# ['1234', 'Q-24', '2010-11-29', '563', 'abc', 'a6G47er15']

#3


3  

If you have single spaces amid your data (like an address in one field), here's a solution for when the delimiter has two or more spaces:

with open("textfile.txt") as f:
    content = f.readlines()

    for line in content:
        # Get all variable-length spaces down to two. Then use two spaces as the delimiter.
        while line.replace("   ", "  ") != line:
            line = line.replace("   ", "  ")

        # strip() drops the trailing newline and any leading or trailing spaces.
        data = line.strip().split("  ")
        print(data)
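
An alternative sketch of the same idea, using a regular expression instead of the replace loop: split on any run of two or more spaces, so single spaces inside a field survive. The helper name `split_on_wide_gaps` and the sample line are made up for illustration; only the standard `re` module is assumed.

```python
import re

def split_on_wide_gaps(line):
    """Split on runs of two or more spaces, keeping single spaces inside fields."""
    # strip() removes the trailing newline and any leading or trailing spaces first.
    return re.split(r' {2,}', line.strip())

# Hypothetical sample line with an address field that contains single spaces.
line = '12 Main Street    Springfield  IL      62704\n'
print(split_on_wide_gaps(line))
# ['12 Main Street', 'Springfield', 'IL', '62704']
```

This avoids rewriting the line repeatedly, at the cost of importing `re`. Either approach relies on the assumption that fields never contain two consecutive spaces themselves.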
