I have this sample array:
In [38]: arr
Out[38]: array([ 0, 44, 121, 154, 191])
The above is just a sample; my actual array is very large. So, what is an efficient way to compute a distance matrix?
The result should be:
In [41]: res
Out[41]:
array([[   0,   44,  121,  154,  191],
       [ -44,    0,   77,  110,  147],
       [-121,  -77,    0,   33,   70],
       [-154, -110,  -33,    0,   37],
       [-191, -147,  -70,  -37,    0]])
I wrote a for-loop-based implementation, which is too slow. Can this be vectorized for efficiency?
2 Answers
#1
There's np.subtract.outer, which effectively performs broadcasted subtraction between two arrays.
Apply the ufunc op to all pairs (a, b) with a in A and b in B.
Let M = A.ndim, N = B.ndim. Then the result, C, of op.outer(A, B) is an array of dimension M + N such that:
C[i_0, ..., i_{M-1}, j_0, ..., j_{N-1}] = op(A[i_0, ..., i_{M-1}], B[j_0, ..., j_{N-1}])
np.subtract.outer(arr, arr).T
Or,
arr - arr[:, None] # essentially the same thing as above
array([[   0,   44,  121,  154,  191],
       [ -44,    0,   77,  110,  147],
       [-121,  -77,    0,   33,   70],
       [-154, -110,  -33,    0,   37],
       [-191, -147,  -70,  -37,    0]])
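As a quick check of the two expressions above, here is a minimal sketch using the sample array from the question. It assumes NumPy is imported as np; the names via_outer and via_broadcast are only illustrative and do not appear in the original answer.

```python
import numpy as np

arr = np.array([0, 44, 121, 154, 191])

# np.subtract.outer(arr, arr)[i, j] is arr[i] - arr[j];
# transposing gives arr[j] - arr[i], which is the layout the question asks for.
via_outer = np.subtract.outer(arr, arr).T

# arr[:, None] has shape (5, 1); broadcasting it against arr (shape (5,))
# produces a (5, 5) array whose [i, j] entry is arr[j] - arr[i].
via_broadcast = arr - arr[:, None]

# Both approaches should give the same square matrix.
assert via_outer.shape == (arr.size, arr.size)
assert np.array_equal(via_outer, via_broadcast)
```

The transpose matters only for the sign convention: without it, the matrix would hold arr[i] - arr[j], which is the negation of the expected result.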
#2
You can use broadcasting:
from numpy import array
arr = array([ 0, 44, 121, 154, 191])
arrM = arr.reshape(1, len(arr))
res = arrM - arrM.T
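To see that the broadcast result matches a loop-based computation like the one mentioned in the question, here is a small sketch. The helper pairwise_diff_loop is hypothetical (the asker's loop code was not shown), so treat it as one plausible baseline rather than the original implementation.

```python
import numpy as np

def pairwise_diff_loop(a):
    """Hypothetical loop baseline: out[i, j] = a[j] - a[i]."""
    n = len(a)
    out = np.empty((n, n), dtype=a.dtype)
    for i in range(n):
        for j in range(n):
            out[i, j] = a[j] - a[i]
    return out

arr = np.array([0, 44, 121, 154, 191])
arrM = arr.reshape(1, len(arr))   # row vector, shape (1, 5)
res = arrM - arrM.T               # (1, 5) - (5, 1) broadcasts to (5, 5)

# The vectorized result should agree with the explicit double loop.
assert np.array_equal(res, pairwise_diff_loop(arr))
```

Note that the result holds n * n elements, so for very large inputs memory may become the limiting factor rather than the subtraction itself.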