Efficient way to compute a distance matrix in NumPy

Time: 2022-08-10 21:35:01

I have this sample array:

In [38]: arr
Out[38]: array([  0,  44, 121, 154, 191])

The above is just a sample; my actual array is much larger. What is an efficient way to compute a distance matrix?

The result should be:

In [41]: res
Out[41]: 
array([[   0,   44,  121,  154,  191],
       [ -44,    0,   77,  110,  147],
       [-121,  -77,    0,   33,   70],
       [-154, -110,  -33,    0,   37],
       [-191, -147,  -70,  -37,    0]])

I wrote a for-loop-based implementation, but it is too slow. Can this be vectorized?

2 Answers

#1


There's np.subtract.outer, which effectively performs broadcasted subtraction between two arrays. From the ufunc.outer documentation:

Apply the ufunc op to all pairs (a, b) with a in A and b in B.

Let M = A.ndim, N = B.ndim. Then the result, C, of op.outer(A, B) is an array of dimension M + N such that:

C[i_0, ..., i_{M-1}, j_0, ..., j_{N-1}] =
     op(A[i_0, ..., i_{M-1}], B[j_0, ..., j_{N-1}])

For the array in the question, transpose the result so that res[i, j] = arr[j] - arr[i]:

np.subtract.outer(arr, arr).T

Or,

arr - arr[:, None] # essentially the same thing as above

array([[   0,   44,  121,  154,  191],
       [ -44,    0,   77,  110,  147],
       [-121,  -77,    0,   33,   70],
       [-154, -110,  -33,    0,   37],
       [-191, -147,  -70,  -37,    0]])
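
As a quick sanity check, here is a minimal sketch that compares the two expressions above and also illustrates the M + N dimension rule from the quoted definition. The names A, B, and C are only illustrative inputs chosen for this example, not part of the original question.

```python
import numpy as np

arr = np.array([0, 44, 121, 154, 191])

# By the ufunc.outer definition: outer[i, j] = arr[i] - arr[j]
outer = np.subtract.outer(arr, arr)

# Transposing gives res[i, j] = arr[j] - arr[i], the layout asked for
res_outer = outer.T

# Broadcasting: (n,) - (n, 1) -> (n, n), with the same orientation
res_bcast = arr - arr[:, None]

assert np.array_equal(res_outer, res_bcast)

# Dimension rule: A.ndim + B.ndim, so (2, 3) and (4,) give (2, 3, 4)
A = np.arange(6).reshape(2, 3)   # illustrative input
B = np.arange(4)                 # illustrative input
C = np.subtract.outer(A, B)
assert C.shape == (2, 3, 4)
assert C[1, 2, 3] == A[1, 2] - B[3]
```

Note that for 1-D input of length n, either form allocates a full n-by-n result, so memory grows quadratically with the array size.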

#2


You can use broadcasting:

import numpy as np

arr = np.array([0, 44, 121, 154, 191])
arrM = arr.reshape(1, len(arr))  # row vector, shape (1, n)
res = arrM - arrM.T              # (1, n) - (n, 1) broadcasts to (n, n)
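
To check the broadcasting version against a loop, something like the following sketch could be used. The question does not show its loop, so distance_matrix_loop is a hypothetical stand-in for it, and distance_matrix_broadcast is just a wrapper name chosen here.

```python
import numpy as np

def distance_matrix_loop(arr):
    """Hypothetical loop baseline: res[i, j] = arr[j] - arr[i]."""
    n = len(arr)
    res = np.empty((n, n), dtype=arr.dtype)
    for i in range(n):
        for j in range(n):
            res[i, j] = arr[j] - arr[i]
    return res

def distance_matrix_broadcast(arr):
    """Broadcasting version: a (1, n) row minus its (n, 1) transpose."""
    row = arr.reshape(1, -1)
    return row - row.T

arr = np.array([0, 44, 121, 154, 191])
assert np.array_equal(distance_matrix_loop(arr), distance_matrix_broadcast(arr))
```

The broadcasting version moves the double loop into NumPy's compiled code, which is where the speedup over a Python-level loop would come from.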
