Python: fastest way to find the per-column minimum and maximum across a list of arrays

Time: 2022-08-22 13:08:44

My problem

Imagine I have

import numpy as np

array1 = np.array([[1, 2], [3, 4], [4, 5]])
array2 = np.array([[2, 5], [1, 4], [8, 1]])
# ... and so on until arrayn

We call the first column "x" and the second column "y". I then group the arrays in a container:

myList = [array1, array2,..., arrayn]

Now, what is the fastest way of finding the minimum and maximum x values and the minimum and maximum y values over the whole of myList (i.e. among all the arrays)?

My really slow attempt

newarray = np.array([[np.array([i[:,j].min() for i in myList]).min(), np.array([i[:,j].max() for i in myList]).max()] for j in range(2)])

Is there something better?

1 solution

#1


With arrayList = np.array(myList) as the 3D array formed by stacking the list of 2D arrays (this requires all arrays to have the same shape), we can simply apply min/max ufunc reductions to it and stack the results as columns. The loop-based code finds the min/max among all elements for each index along the last axis, so the equivalent reductions run over the first two axes of the stacked 3D array.

Thus, the implementation would look something like this:

np.column_stack(( arrayList.min(axis=(0,1)), arrayList.max(axis=(0,1)) ))
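
As a quick check, here is a minimal sketch that applies the line above to the two sample arrays from the question. It assumes every array in myList has the same shape, since np.array(myList) only yields a 3D array in that case; the two-element myList is just for illustration.

```python
import numpy as np

# Sample data from the question; a real myList would hold n arrays.
array1 = np.array([[1, 2], [3, 4], [4, 5]])
array2 = np.array([[2, 5], [1, 4], [8, 1]])
myList = [array1, array2]

# Stack into shape (n_arrays, n_rows, 2); requires equal shapes.
arrayList = np.array(myList)

# Reduce over the array axis and the row axis, leaving one value per column.
out = np.column_stack((arrayList.min(axis=(0, 1)), arrayList.max(axis=(0, 1))))
print(out)
# [[1 8]
#  [1 5]]
```

Each row of the result corresponds to one column of the inputs (x, then y), holding its minimum and maximum in that order, which matches the layout produced by the loop in the question.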

The bottleneck with the above method could be the list-to-array conversion. We can avoid it with a list comprehension that performs one level of min/max reduction on each array, followed by one more reduction across all input arrays 1, 2, ..., n. An alternative solution would therefore be:

minn = np.min([i.min(0) for i in myList],axis=0)
maxn = np.max([i.max(0) for i in myList],axis=0)
out = np.column_stack(( minn, maxn ))
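
A minimal sketch of this second approach on made-up data, where the arrays have different numbers of rows and so cannot be stacked into a single 3D array. It also compares the result against the loop from the question. The third array and the name slow are assumptions added for illustration only.

```python
import numpy as np

array1 = np.array([[1, 2], [3, 4], [4, 5]])
array2 = np.array([[2, 5], [1, 4], [8, 1]])
array3 = np.array([[0, 9], [7, 3]])  # hypothetical extra array with fewer rows
myList = [array1, array2, array3]

# First reduction: per-array column minima/maxima, each of shape (2,).
# Second reduction: across the list of per-array results.
minn = np.min([a.min(0) for a in myList], axis=0)
maxn = np.max([a.max(0) for a in myList], axis=0)
out = np.column_stack((minn, maxn))

# The loop from the question, kept only for comparison.
slow = np.array([[np.array([a[:, j].min() for a in myList]).min(),
                  np.array([a[:, j].max() for a in myList]).max()]
                 for j in range(2)])

print(out)
# [[0 8]
#  [1 9]]
print(np.array_equal(out, slow))  # True
```

Which variant is faster is likely to depend on the number and size of the arrays, so timing both on representative data (for example with timeit) is the safest way to decide.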
