Calculating the Euclidean distance between some entries of a numpy array

Time: 2020-11-30 21:20:35

I am fairly new to numpy and I am trying to calculate the pairwise distances between some of the elements of a numpy array.

I have an n x 3 numpy array holding n 3D Cartesian coordinates (x, y, z) that represent particles in a grid. Some of these particles move as the program runs, and I need to keep track of the distances of the ones that move. I keep a list of integers with the indices of the particles that have moved.

I am aware of pdist, but it calculates the distance between every pair of particles, which would be inefficient since only some of them have moved. Ideally, for example, if only particles 1 and 2 have moved, I would only calculate the distances of 1 with 2...N and of 2 with 3...N.

What would be the most efficient way of doing this? Right now I have a double loop which doesn't seem ideal...

for i in np.nditer(particles_moved):
    particles = particles[particles!=i]
    for j in np.nditer(particles):
        distance(xyz,i, j)

Thanks

2 solutions

#1



I believe this is what you want (create new axis and use broadcasting for full vectorization):

import numpy as np

particles = np.arange(12).reshape((-1,3))
moved = np.array([0,2])
np.linalg.norm(particles[moved][:,None,:]-particles[None,:,:], axis=-1)

array([[  0.        ,   5.19615242,  10.39230485,  15.58845727],
       [ 10.39230485,   5.19615242,   0.        ,   5.19615242]])
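The shapes involved in the broadcasting step can be checked directly, and SciPy's cdist should give the same moved-versus-all distances without building the intermediate difference array by hand. A minimal sketch, assuming SciPy is available; the names diff, dists and dists_cdist are introduced here for illustration and are not part of the answer above:

```python
import numpy as np
from scipy.spatial.distance import cdist

particles = np.arange(12, dtype=float).reshape((-1, 3))  # 4 particles, shape (4, 3)
moved = np.array([0, 2])                                  # indices of the moved particles

# (2, 1, 3) - (1, 4, 3) broadcasts to (2, 4, 3): one difference vector per pair
diff = particles[moved][:, None, :] - particles[None, :, :]
dists = np.linalg.norm(diff, axis=-1)                     # shape (2, 4)

# cdist computes Euclidean distances between two sets of points
dists_cdist = cdist(particles[moved], particles)

print(diff.shape, dists.shape)
print(np.allclose(dists, dists_cdist))
```

Row i of the result holds the distances from particle moved[i] to every particle, including itself, so the entry at column moved[i] is zero.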

#2



If you want to use a JIT compiler, here is an example using Numba:

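import numba as nb
import numpy as np

# With parallel=True, Numba can parallelize the array expression inside the loop
@nb.njit(fastmath=True,parallel=True)
def distance(Paticle_coords,indices_moved):
  # One row per moved particle, one column per particle in the array
  dist_res=np.empty((indices_moved.shape[0],Paticle_coords.shape[0]),dtype=Paticle_coords.dtype)

  for i in range(indices_moved.shape[0]):
    # Coordinates of the i-th moved particle
    Koord=Paticle_coords[indices_moved[i],:]
    # Euclidean distance from that particle to every particle
    dist_res[i,:]=np.sqrt((Koord[0]-Paticle_coords[:,0])**2+(Koord[1]-Paticle_coords[:,1])**2+(Koord[2]-Paticle_coords[:,2])**2)

  return dist_res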

Performance compared to Julien's solution (the broadcasting approach in #1):

#Create Data
Paticle_coords=np.random.rand(10000000,3)
indices_moved=np.array([0,5,6,3,7],dtype=np.int64)

On my Core i7-4771, this takes 0.15 s for the Numba solution and 2.4 s for Julien's solution.
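The timing code itself is not shown, so here is one way such a comparison could be set up. This is only a sketch under stated assumptions: dist_rowwise is a pure NumPy stand-in for the row-wise loop so the example runs without Numba, time_call is a hypothetical helper, and the array is much smaller than the 10,000,000 rows used above. It will not reproduce the figures quoted here.

```python
import time
import numpy as np

def dist_broadcast(coords, moved):
    # Broadcasting approach from answer #1
    return np.linalg.norm(coords[moved][:, None, :] - coords[None, :, :], axis=-1)

def dist_rowwise(coords, moved):
    # Pure NumPy stand-in for the row-wise loop; NOT the Numba-compiled version
    out = np.empty((moved.shape[0], coords.shape[0]), dtype=coords.dtype)
    for i in range(moved.shape[0]):
        k = coords[moved[i]]
        out[i] = np.sqrt((k[0] - coords[:, 0])**2
                         + (k[1] - coords[:, 1])**2
                         + (k[2] - coords[:, 2])**2)
    return out

def time_call(func, *args, repeats=3):
    # Hypothetical helper: best wall-clock time over a few repeats
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

coords = np.random.rand(100_000, 3)  # assumption: reduced size for a quick run
moved = np.array([0, 5, 6, 3, 7], dtype=np.int64)

print("broadcast:", time_call(dist_broadcast, coords, moved))
print("row-wise: ", time_call(dist_rowwise, coords, moved))
```

When timing a Numba function in the same way, call it once before measuring, because the first call includes compilation time.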
