Split a string by string length?

Time: 2023-01-14 21:40:26

Is there a way to take a string that is 4*x characters long and cut it into 4 strings, each x characters long, without knowing the length of the string in advance?

For example:

>>> x = "qwertyui"
>>> split(x, one, two, three, four)
>>> two
'er'

13 solutions

#1


60  

>>> x = "qwertyui"
>>> chunks, chunk_size = len(x), len(x)/4
>>> [ x[i:i+chunk_size] for i in range(0, chunks, chunk_size) ]
['qw', 'er', 'ty', 'ui']

#2


11  

I tried Alexander's answer but got this error in Python 3:

TypeError: 'float' object cannot be interpreted as an integer

This is because the / division operator in Python 3 returns a float. This works for me:

>>> x = "qwertyui"
>>> chunks, chunk_size = len(x), len(x)//4
>>> [ x[i:i+chunk_size] for i in range(0, chunks, chunk_size) ]
['qw', 'er', 'ty', 'ui']

Note the // at the end of line 2: it performs floor division, so the result is an integer.
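To make the difference between the two operators concrete, here is a minimal sketch. The helper name split_into_parts is hypothetical, and it assumes a non-empty string whose length is a multiple of the number of parts, as in the question.

```python
def split_into_parts(s, parts=4):
    """Split s into `parts` equal pieces (hypothetical helper).

    Assumes s is non-empty and len(s) is a multiple of `parts`.
    """
    chunk_size = len(s) // parts  # floor division gives an int in Python 3
    return [s[i:i + chunk_size] for i in range(0, len(s), chunk_size)]


x = "qwertyui"
true_div = len(x) / 4    # a float in Python 3, so range() rejects it
floor_div = len(x) // 4  # an int, usable as a range() step
pieces = split_into_parts(x)
print(type(true_div).__name__, type(floor_div).__name__, pieces)
```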

#3


4  

  • :param s: str; source string
  • :param w: int; width to split on

Using the textwrap module:

PyDocs-textwrap

import textwrap
def wrap(s, w):
    return textwrap.fill(s, w)

  • :return str:
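One thing to keep in mind is that textwrap works on words rather than raw characters. The sketch below uses only textwrap.fill and textwrap.wrap from the standard library; the sample strings are made up for illustration.

```python
import textwrap

s = "qwertyui"
filled = textwrap.fill(s, 2)   # a single string, lines joined with "\n"
wrapped = textwrap.wrap(s, 2)  # the same lines as a list

# With whitespace in the input, textwrap breaks at word boundaries and
# drops the whitespace at the break, so pieces are not fixed-width.
spaced = textwrap.wrap("ab cd", 2)
sliced = ["ab cd"[i:i + 2] for i in range(0, 5, 2)]
print(repr(filled), wrapped, spaced, sliced)
```

So for strict fixed-width chunks that must preserve every character, plain slicing is probably the safer choice.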

Inspired by Alexander's answer

PyDocs-data structures

def wrap(s, w):
    return [s[i:i + w] for i in range(0, len(s), w)]
  • :return list:

Inspired by Eric's answer

PyDocs-regex

import re
def wrap(s, w):    
    sre = re.compile(rf'(.{{{w}}})')
    return [x for x in re.split(sre, s) if x]
  • :return list:

Complete Code Examples/Alternative Methods

#4


3  

Here is a one-liner that doesn't need to know the length of the string beforehand:

from functools import partial
from io import StringIO  # Python 3; on Python 2: from StringIO import StringIO

[l for l in iter(partial(StringIO(data).read, 4), '')]

If you have a file or socket, then you don't need the StringIO wrapper:

[l for l in iter(partial(file_like_object.read, 4), '')]
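As a rough Python 3 illustration of the same idea: the two-argument form iter(callable, sentinel) keeps calling read until it returns the sentinel. The sample data and the chunk sizes below are assumptions for demonstration; for a binary stream the sentinel would need to be b'' rather than ''.

```python
from functools import partial
from io import BytesIO, StringIO

data = "qwertyui"  # sample input, assumed for illustration

# Text stream: read(4) returns '' at end of input, which stops the iterator.
text_chunks = list(iter(partial(StringIO(data).read, 4), ""))

# Binary stream: read(3) returns b'' at end of input, so the sentinel is b''.
byte_chunks = list(iter(partial(BytesIO(data.encode()).read, 3), b""))

print(text_chunks, byte_chunks)
```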

#5


3  

def split2len(s, n):
    def _f(s, n):
        while s:
            yield s[:n]
            s = s[n:]
    return list(_f(s, n))

#6


1  

Here are two generic approaches, probably worth adding to your own library of reusable helpers. The first requires the item to be sliceable; the second works with any iterable (but requires its constructor to accept an iterable).

def split_bylen(item, maxlen):
    '''
    Requires item to be sliceable (with __getitem__ defined)
    '''
    return [item[ind:ind+maxlen] for ind in range(0, len(item), maxlen)]
    #You could also replace outer [ ] brackets with ( ) to use as generator.

def split_bylen_any(item, maxlen, constructor=None):
    '''
    Works with any iterables.
    Requires item's constructor to accept iterable or alternatively 
    constructor argument could be provided (otherwise use item's class)
    '''
    if constructor is None: constructor = item.__class__
    return [constructor(part) for part in zip(* ([iter(item)] * maxlen))]
    #OR: return map(constructor, zip(* ([iter(item)] * maxlen)))
    #    which would be faster if you need an iterable, not list

So, in the original poster's case, the usage is:

string = 'Baboons love bananas'
parts = 5
splitlen = -(-len(string) // parts) # is alternative to math.ceil(len/parts)

first_method = split_bylen(string, splitlen)
#Result :['Babo', 'ons ', 'love', ' ban', 'anas']

second_method = split_bylen_any(string, splitlen, constructor=''.join)
#Result :['Babo', 'ons ', 'love', ' ban', 'anas']
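One caveat with the zip-based variant: zip stops at the shortest input, so a trailing chunk shorter than maxlen is silently dropped. Below is a sketch of one possible alternative built on itertools.islice that keeps the remainder; the name split_bylen_iter is hypothetical and not part of the answer above.

```python
from itertools import islice


def split_bylen_iter(iterable, maxlen, constructor=tuple):
    """Yield chunks of up to maxlen items from any iterable (hypothetical helper)."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, maxlen))  # take the next maxlen items, if any
        if not chunk:
            return
        yield constructor(chunk)


s = "abcdefgh"
zipped = ["".join(p) for p in zip(*[iter(s)] * 3)]      # trailing 'gh' is lost
kept = list(split_bylen_iter(s, 3, constructor="".join))  # trailing 'gh' is kept
print(zipped, kept)
```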

#7


1  

Here's a trick using re:

In [28]: import re

In [29]: x = "qwertyui"

In [30]: [x for x in re.split(r'(\w{2})', x) if x]
Out[30]: ['qw', 'er', 'ty', 'ui']

Wrapped in a function, it might look like this:

import re

def split(string, split_len):
    # Unlike `\w`, `.` matches any character except a newline
    regex = r'(.{%s})' % split_len
    return [x for x in re.split(regex, string) if x]
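A possible variation on the regex approach, sketched here as an assumption rather than as part of the original answer, is re.findall with a {1,n} quantifier. It returns the chunks directly without filtering empty strings, and re.DOTALL lets "." match newlines as well. The function name split_findall is hypothetical.

```python
import re


def split_findall(string, split_len):
    """Chunk a string with re.findall (hypothetical helper, split_len >= 1)."""
    # {1,n} is greedy: take up to n characters each time, fewer at the end.
    pattern = r".{1,%d}" % split_len
    return re.findall(pattern, string, flags=re.DOTALL)


odd_length = split_findall("qwertyuio", 2)
with_newline = split_findall("ab\ncd", 2)
print(odd_length, with_newline)
```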

#8


1  

length = 4
string = "abcdefgh"
chars = list(string)
# Python 3: range and // replace xrange and /; a trailing partial chunk is dropped
parts = [''.join(chars[j * length:(j + 1) * length]) for j in range(len(string) // length)]

#9


0  

And for those who prefer something a bit more readable:

def itersplit_into_x_chunks(string,x=10): # we assume here that x is an int and > 0
    size = len(string)
    chunksize = size//x
    for pos in range(0, size, chunksize):
        yield string[pos:pos+chunksize]

output:

>>> list(itersplit_into_x_chunks('qwertyui',x=4))
['qw', 'er', 'ty', 'ui']
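The generator above relies on the length being a multiple of x: with 'qwertyuiop' and x=4 the chunk size is 2, so it yields five chunks, and when the string is shorter than x the chunk size becomes 0 and range raises ValueError. A sketch of one way to always get exactly n pieces, using divmod to spread the remainder, might look like this (split_into_n is a hypothetical name):

```python
def split_into_n(string, n):
    """Yield exactly n pieces whose lengths differ by at most one.

    Hypothetical helper; assumes n is a positive int.
    """
    base, extra = divmod(len(string), n)
    pos = 0
    for k in range(n):
        # The first `extra` pieces get one additional character each.
        size = base + (1 if k < extra else 0)
        yield string[pos:pos + size]
        pos += size


even = list(split_into_n("qwertyui", 4))
uneven = list(split_into_n("qwertyuiop", 4))
print(even, uneven)
```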

#10


0  

String splitting is needed in many cases, such as sorting the characters of a given string or replacing one character with another. All of these operations can be performed with the string splitting methods described below.

String splitting can be done in two ways:

  1. Slicing the given string based on the split length.

  2. Converting the given string to a list with list(s), which breaks the string into a list of its characters. Then perform the required operation and join the elements with separator.join(list), where separator is the string to place between the elements, to get a new processed string.
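The two ways described above could be sketched as follows. The sample string, the chunk length, and the choice of sorting as the "required operation" are assumptions for illustration.

```python
s = "qwertyui"

# Way 1: slice the string into pieces of a fixed length.
n = 2
sliced = [s[i:i + n] for i in range(0, len(s), n)]

# Way 2: convert to a list of characters, process it, then join.
chars = list(s)
chars.sort()                    # the "required operation", here a sort
sorted_string = "".join(chars)  # empty separator: characters are glued back
dashed = "-".join(list(s))      # a separator placed between the characters

print(sliced, sorted_string, dashed)
```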

#11


0  

some_string="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
x=3 
res=[some_string[y-x:y] for y in range(x, len(some_string)+x,x)]
print(res)

will produce

['ABC', 'DEF', 'GHI', 'JKL', 'MNO', 'PQR', 'STU', 'VWX', 'YZ']

#12


-1  

l = 'abcdefghijklmn'

def group(l, n):
    tmp = len(l) % n
    # In Python 3 zip returns an iterator, so build a list before concatenating
    zipped = list(zip(*[iter(l)] * n))
    return zipped if tmp == 0 else zipped + [tuple(l[-tmp:])]

print(group(l, 3))

#13


-2  

My solution

st = ' abs de fdgh  1234 556 shg shshh'
print(st)

def splitStringMax(si, limit):
    # Split on whitespace, then greedily pack words into lines shorter than limit
    ls = si.split()
    lo = []
    st = ''
    ln = len(ls)
    if ln == 1:
        return [si]
    i = 0
    for l in ls:
        st += l
        i += 1
        if i < ln:
            lk = len(ls[i])
            # Keep the line open while the next word still fits
            if len(st) + 1 + lk < limit:
                st += ' '
                continue
        lo.append(st)
        st = ''
    return lo

print(splitStringMax(st, 7))
# ['abs de', 'fdgh', '1234', '556', 'shg', 'shshh']
print(splitStringMax(st, 12))
# ['abs de fdgh', '1234 556', 'shg shshh']
