使用Numpy将Paritition阵列分成N个块

时间:2021-10-11 02:22:32

There is this How do you split a list into evenly sized chunks? for splitting an array into chunks. Is there anyway to do this more efficiently for giant arrays using Numpy?

有这个如何将列表拆分成均匀大小的块?用于将数组拆分为块。无论如何使用Numpy为巨型阵列更有效地做到这一点?

4 个解决方案

#1


46  

Try numpy.array_split.

From the documentation:

从文档:

>>> x = np.arange(8.0)
>>> np.array_split(x, 3)
    [array([ 0.,  1.,  2.]), array([ 3.,  4.,  5.]), array([ 6.,  7.])]

Identical to numpy.split, but won't raise an exception if the groups aren't equal length.

与numpy.split相同,但如果组长度不相等则不会引发异常。

If number of chunks > len(array) you get blank arrays nested inside, to address that - if your split array is saved in a, then you can remove empty arrays by:

如果块数> len(数组)你得到嵌套在里面的空白数组,为了解决这个问题 - 如果你的拆分数组保存在a中,那么你可以通过以下方式删除空数组:

[x for x in a if x.size > 0]

Just save that back in a if you wish.

如果你愿意的话,只需将其保存回来。

#2


17  

Just some examples on usage of array_split, split, hsplit and vsplit:

关于array_split,split,hsplit和vsplit的一些使用示例:

n [9]: a = np.random.randint(0,10,[4,4])

In [10]: a
Out[10]: 
array([[2, 2, 7, 1],
       [5, 0, 3, 1],
       [2, 9, 8, 8],
       [5, 7, 7, 6]])

Some examples on using array_split:
If you give an array or list as second argument you basically give the indices (before) which to 'cut'

关于使用array_split的一些例子:如果你给一个数组或列表作为第二个参数,你基本上给出了'cut'的索引(之前)

# split rows into 0|1 2|3
In [4]: np.array_split(a, [1,3])
Out[4]:                                                                                                                       
[array([[2, 2, 7, 1]]),                                                                                                       
 array([[5, 0, 3, 1],                                                                                                         
       [2, 9, 8, 8]]),                                                                                                        
 array([[5, 7, 7, 6]])]

# split columns into 0| 1 2 3
In [5]: np.array_split(a, [1], axis=1)                                                                                           
Out[5]:                                                                                                                       
[array([[2],                                                                                                                  
       [5],                                                                                                                   
       [2],                                                                                                                   
       [5]]),                                                                                                                 
 array([[2, 7, 1],                                                                                                            
       [0, 3, 1],
       [9, 8, 8],
       [7, 7, 6]])]

An integer as second arg. specifies the number of equal chunks:

一个整数作为第二个arg。指定相等块的数量:

In [6]: np.array_split(a, 2, axis=1)
Out[6]: 
[array([[2, 2],
       [5, 0],
       [2, 9],
       [5, 7]]),
 array([[7, 1],
       [3, 1],
       [8, 8],
       [7, 6]])]

split works the same but raises an exception if an equal split is not possible

split工作方式相同但如果无法进行相等的拆分则会引发异常

In addition to array_split you can use shortcuts vsplit and hsplit.
vsplit and hsplit are pretty much self-explanatry:

除了array_split之外,您还可以使用快捷方式vsplit和hsplit。 vsplit和hsplit几乎是自我解释:

In [11]: np.vsplit(a, 2)
Out[11]: 
[array([[2, 2, 7, 1],
       [5, 0, 3, 1]]),
 array([[2, 9, 8, 8],
       [5, 7, 7, 6]])]

In [12]: np.hsplit(a, 2)
Out[12]: 
[array([[2, 2],
       [5, 0],
       [2, 9],
       [5, 7]]),
 array([[7, 1],
       [3, 1],
       [8, 8],
       [7, 6]])]

#3


7  

I believe that you're looking for numpy.split or possibly numpy.array_split if the number of sections doesn't need to divide the size of the array properly.

我相信你正在寻找numpy.split或者numpy.array_split,如果部分的数量不需要正确地划分数组的大小。

#4


5  

Not quite an answer, but a long comment with nice formatting of code to the other (correct) answers. If you try the following, you will see that what you are getting are views of the original array, not copies, and that was not the case for the accepted answer in the question you link. Be aware of the possible side effects!

不是一个答案,而是一个很长的评论,很好的格式化代码到其他(正确的)答案。如果您尝试以下操作,您将看到所获得的是原始数组的视图,而不是副本,而在您链接的问题中,接受的答案不是这种情况。注意可能的副作用!

>>> x = np.arange(9.0)
>>> a,b,c = np.split(x, 3)
>>> a
array([ 0.,  1.,  2.])
>>> a[1] = 8
>>> a
array([ 0.,  8.,  2.])
>>> x
array([ 0.,  8.,  2.,  3.,  4.,  5.,  6.,  7.,  8.])
>>> def chunks(l, n):
...     """ Yield successive n-sized chunks from l.
...     """
...     for i in xrange(0, len(l), n):
...         yield l[i:i+n]
... 
>>> l = range(9)
>>> a,b,c = chunks(l, 3)
>>> a
[0, 1, 2]
>>> a[1] = 8
>>> a
[0, 8, 2]
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8]

#1


46  

Try numpy.array_split.

From the documentation:

从文档:

>>> x = np.arange(8.0)
>>> np.array_split(x, 3)
    [array([ 0.,  1.,  2.]), array([ 3.,  4.,  5.]), array([ 6.,  7.])]

Identical to numpy.split, but won't raise an exception if the groups aren't equal length.

与numpy.split相同,但如果组长度不相等则不会引发异常。

If number of chunks > len(array) you get blank arrays nested inside, to address that - if your split array is saved in a, then you can remove empty arrays by:

如果块数> len(数组)你得到嵌套在里面的空白数组,为了解决这个问题 - 如果你的拆分数组保存在a中,那么你可以通过以下方式删除空数组:

[x for x in a if x.size > 0]

Just save that back in a if you wish.

如果你愿意的话,只需将其保存回来。

#2


17  

Just some examples on usage of array_split, split, hsplit and vsplit:

关于array_split,split,hsplit和vsplit的一些使用示例:

n [9]: a = np.random.randint(0,10,[4,4])

In [10]: a
Out[10]: 
array([[2, 2, 7, 1],
       [5, 0, 3, 1],
       [2, 9, 8, 8],
       [5, 7, 7, 6]])

Some examples on using array_split:
If you give an array or list as second argument you basically give the indices (before) which to 'cut'

关于使用array_split的一些例子:如果你给一个数组或列表作为第二个参数,你基本上给出了'cut'的索引(之前)

# split rows into 0|1 2|3
In [4]: np.array_split(a, [1,3])
Out[4]:                                                                                                                       
[array([[2, 2, 7, 1]]),                                                                                                       
 array([[5, 0, 3, 1],                                                                                                         
       [2, 9, 8, 8]]),                                                                                                        
 array([[5, 7, 7, 6]])]

# split columns into 0| 1 2 3
In [5]: np.array_split(a, [1], axis=1)                                                                                           
Out[5]:                                                                                                                       
[array([[2],                                                                                                                  
       [5],                                                                                                                   
       [2],                                                                                                                   
       [5]]),                                                                                                                 
 array([[2, 7, 1],                                                                                                            
       [0, 3, 1],
       [9, 8, 8],
       [7, 7, 6]])]

An integer as second arg. specifies the number of equal chunks:

一个整数作为第二个arg。指定相等块的数量:

In [6]: np.array_split(a, 2, axis=1)
Out[6]: 
[array([[2, 2],
       [5, 0],
       [2, 9],
       [5, 7]]),
 array([[7, 1],
       [3, 1],
       [8, 8],
       [7, 6]])]

split works the same but raises an exception if an equal split is not possible

split工作方式相同但如果无法进行相等的拆分则会引发异常

In addition to array_split you can use shortcuts vsplit and hsplit.
vsplit and hsplit are pretty much self-explanatry:

除了array_split之外,您还可以使用快捷方式vsplit和hsplit。 vsplit和hsplit几乎是自我解释:

In [11]: np.vsplit(a, 2)
Out[11]: 
[array([[2, 2, 7, 1],
       [5, 0, 3, 1]]),
 array([[2, 9, 8, 8],
       [5, 7, 7, 6]])]

In [12]: np.hsplit(a, 2)
Out[12]: 
[array([[2, 2],
       [5, 0],
       [2, 9],
       [5, 7]]),
 array([[7, 1],
       [3, 1],
       [8, 8],
       [7, 6]])]

#3


7  

I believe that you're looking for numpy.split or possibly numpy.array_split if the number of sections doesn't need to divide the size of the array properly.

我相信你正在寻找numpy.split或者numpy.array_split,如果部分的数量不需要正确地划分数组的大小。

#4


5  

Not quite an answer, but a long comment with nice formatting of code to the other (correct) answers. If you try the following, you will see that what you are getting are views of the original array, not copies, and that was not the case for the accepted answer in the question you link. Be aware of the possible side effects!

不是一个答案,而是一个很长的评论,很好的格式化代码到其他(正确的)答案。如果您尝试以下操作,您将看到所获得的是原始数组的视图,而不是副本,而在您链接的问题中,接受的答案不是这种情况。注意可能的副作用!

>>> x = np.arange(9.0)
>>> a,b,c = np.split(x, 3)
>>> a
array([ 0.,  1.,  2.])
>>> a[1] = 8
>>> a
array([ 0.,  8.,  2.])
>>> x
array([ 0.,  8.,  2.,  3.,  4.,  5.,  6.,  7.,  8.])
>>> def chunks(l, n):
...     """ Yield successive n-sized chunks from l.
...     """
...     for i in xrange(0, len(l), n):
...         yield l[i:i+n]
... 
>>> l = range(9)
>>> a,b,c = chunks(l, 3)
>>> a
[0, 1, 2]
>>> a[1] = 8
>>> a
[0, 8, 2]
>>> l
[0, 1, 2, 3, 4, 5, 6, 7, 8]