Split a string by number of characters

Time: 2023-01-01 19:40:24

I can't figure out how to do this with string methods:

In my file I have something like 1.012345e0070.123414e-004-0.1234567891.21423... which means there is no delimiter between the numbers.

Now if I read a line from this file, I get a string like the one above, which I want to split every 12 characters (for example). As far as I can see, there is no way to do this with str.split() or any other string method, but maybe I'm overlooking something?

Thx

8 answers

#1


22  

Since you want to iterate in an unusual way, a generator is a good way to abstract that:

def chunks(s, n):
    """Produce `n`-character chunks from `s`."""
    for start in range(0, len(s), n):
        yield s[start:start+n]

nums = "1.012345e0070.123414e-004-0.1234567891.21423"
for chunk in chunks(nums, 12):
    print(chunk)

produces:

1.012345e007
0.123414e-00
4-0.12345678
91.21423

(which doesn't look right, but those are the 12-char chunks)
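The question's sample string does not break into valid numbers at 12 characters, but if the fields in a file really are fixed-width, the generator above could feed a float conversion. The following is only a sketch under that assumption: `parse_fixed_width` and the `sample` line are hypothetical and do not come from the original thread.

```python
def chunks(s, n):
    """Produce `n`-character chunks from `s`."""
    for start in range(0, len(s), n):
        yield s[start:start + n]

def parse_fixed_width(line, width=12):
    """Convert a line of fixed-width numeric fields into floats (hypothetical helper)."""
    line = line.rstrip("\r\n")  # a line read from a file usually ends with a newline
    return [float(field) for field in chunks(line, width)]

# Hypothetical sample line: three fields, each exactly 12 characters wide.
# float() ignores the leading space used to pad the third field.
sample = "1.012345e0070.123414e-04 -0.12345678\n"
print(parse_fixed_width(sample))
```

Stripping the line ending first matters because otherwise the newline would end up inside the last chunk or form a chunk of its own.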

#2


11  

You're looking for string slicing.

>>> x = "1.012345e0070.123414e-004-0.1234567891.21423"
>>> x[2:10]
'012345e0'
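To go from a single slice to the whole string, one slice per starting offset is enough. This is a minimal sketch; the names `width` and `pieces` are illustrative and not part of the original answer.

```python
x = "1.012345e0070.123414e-004-0.1234567891.21423"
width = 12

# One slice per starting offset 0, 12, 24, ...
pieces = [x[i:i + width] for i in range(0, len(x), width)]
print(pieces)

# Slicing past the end of the string does not raise an error;
# it simply returns whatever characters are left.
print(x[36:48])  # prints 91.21423
```

Because out-of-range slices are truncated rather than rejected, no special case is needed for the final, shorter piece.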

#3


3  

line = "1.012345e0070.123414e-004-0.1234567891.21423"
firstNumber = line[:12]
restOfLine = line[12:]

print(firstNumber)
print(restOfLine)

Output:

1.012345e007
0.123414e-004-0.1234567891.21423

#4


2  

from itertools import izip_longest

def grouper(n, iterable, padvalue=None):
    return izip_longest(*[iter(iterable)]*n, fillvalue=padvalue)
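Note that `izip_longest` exists only in Python 2; in Python 3 the same function is named `itertools.zip_longest`. The recipe yields tuples of single characters, so they still need to be joined back into strings. A possible Python 3 usage sketch, assuming an empty string as the pad value so the last chunk is not padded visibly:

```python
from itertools import zip_longest  # Python 3 name for izip_longest

def grouper(n, iterable, padvalue=None):
    """Collect items from `iterable` into tuples of length `n`, padding the last one."""
    return zip_longest(*[iter(iterable)] * n, fillvalue=padvalue)

nums = "1.012345e0070.123414e-004-0.1234567891.21423"

# Each group is a tuple of single characters; padding with '' lets
# ''.join() rebuild the final, shorter chunk without filler characters.
chunks = [''.join(group) for group in grouper(12, nums, '')]
print(chunks)
```

The trick relies on passing the same iterator `n` times, so each tuple consumes `n` consecutive characters.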

#5


2  

Try this loop:

x = "1.012345e0070.123414e-004-0.1234567891.21423"
while x:
    v = x[:12]   # take the first 12 characters
    print(v)
    x = x[12:]   # keep the remainder for the next pass

#6


1  

You can do it like this:

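string = "1.012345e0070.123414e-004-0.1234567891.21423"
step = 12
for i in range(0, len(string), step):
    # `piece` avoids shadowing the built-in name `slice`
    piece = string[i:i + step]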

This way, each iteration yields one slice of 12 characters (the last slice may be shorter).

#7


1  

I have always thought that, since strings support addition through simple logic, division could work in a similar way: dividing a string by a number should split it into pieces of that length. So maybe this is what you are looking for.

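class MyString:
    def __init__(self, string):
        self.string = string
    def __div__(self, div):
        # Split into pieces of length `div`; the last piece may be shorter.
        l = []
        for i in range(0, len(self.string), div):
            l.append(self.string[i:i+div])
        return l
    # Python 3 dispatches `/` to __truediv__; __div__ is only used by Python 2.
    __truediv__ = __div__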

>>> s = 'abcbdbfbfbfb'
>>> m = MyString(s)
>>> m/3
['abc', 'bdb', 'fbf', 'bfb']


>>> m = MyString('abcd')
>>> m/3
['abc', 'd']

If you don't want to create an entirely new class, simply use this function, which wraps the core of the code above:

>>> def string_divide(string, div):
...     l = []
...     for i in range(0, len(string), div):
...         l.append(string[i:i+div])
...     return l
...

>>> string_divide('abcdefghijklmnopqrstuvwxyz', 15)
['abcdefghijklmno', 'pqrstuvwxyz']

#8


0  

I stumbled on this while looking for a solution to a similar problem, but in my case I wanted to split a string into chunks of differing lengths. Eventually I solved it with a regular expression:

In [13]: import re

In [14]: random_val = '07eb8010e539e2621cb100e4f33a2ff9'

In [15]: dashmap=(8, 4, 4, 4, 12)

In [16]: re.findall(''.join(r'(\S{{{}}})'.format(l) for l in dashmap), random_val)
Out[16]: [('07eb8010', 'e539', 'e262', '1cb1', '00e4f33a2ff9')]
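The same variable-length split can also be done without a regular expression, using plain slicing at running offsets. This is an alternative technique, not the one used in this answer, and `split_by_lengths` is a hypothetical helper name chosen for illustration.

```python
from itertools import accumulate

def split_by_lengths(s, lengths):
    """Split `s` into consecutive pieces whose sizes are given by `lengths`."""
    ends = list(accumulate(lengths))   # running end offsets: 8, 12, 16, 20, 32
    starts = [0] + ends[:-1]           # each piece starts where the previous one ended
    return [s[a:b] for a, b in zip(starts, ends)]

random_val = '07eb8010e539e2621cb100e4f33a2ff9'
dashmap = (8, 4, 4, 4, 12)

pieces = split_by_lengths(random_val, dashmap)
print('-'.join(pieces))  # prints 07eb8010-e539-e262-1cb1-00e4f33a2ff9
```

Unlike the regular expression, this version accepts any characters (including whitespace) and silently returns shorter pieces if the string runs out early, which may or may not be what is wanted.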

Bonus

For those who may find it interesting: I was trying to create a pseudo-random ID following specific rules, so this code is actually part of the following function:

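import random
import re
import time

def random_id_from_time_hash(dashmap=(8, 4, 4, 4, 12)):
    # Accumulate hex digits until there are enough to fill every group.
    random_val = ''
    while len(random_val) < sum(dashmap):
        random_val += '{:016x}'.format(hash(time.time() * random.randint(1, 1000)))
    # One capture group per entry in dashmap, e.g. (\S{8})(\S{4})...
    pattern = ''.join(r'(\S{{{}}})'.format(l) for l in dashmap)
    return '-'.join(re.findall(pattern, random_val)[0])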
