从三个一维数组中创建一个数字数组的三维坐标

时间:2021-03-10 21:23:01

Suppose I have three arbitrary 1D arrays, for example:

假设我有三个任意的1D数组,例如:

x_p = np.array((1.0, 2.0, 3.0, 4.0, 5.0))
y_p = np.array((2.0, 3.0, 4.0))
z_p = np.array((8.0, 9.0))

These three arrays represent sampling intervals in a 3D grid, and I want to construct a 1D array of three-dimensional vectors for all intersections, something like

这三个数组表示三维网格中的采样间隔,我想为所有的交叉点构造一个一维的三维向量数组,类似这样

points = np.array([[1.0, 2.0, 8.0],
                   [1.0, 2.0, 9.0],
                   [1.0, 3.0, 8.0],
                   ...
                   [5.0, 4.0, 9.0]])

Order doesn't actually matter for this. The obvious way to generate them:

顺序并不重要。生成它们的明显方法是:

npoints = len(x_p) * len(y_p) * len(z_p)
points = np.zeros((npoints, 3))
i = 0
for x in x_p:
    for y in y_p:
        for z in z_p:
            points[i, :] = (x, y, z)
            i += 1

So the question is... is there a faster way? I have looked but not found (possibly just failed to find the right Google keywords).

问题是……有更快的方法吗?我找过但没找到(可能只是找不到正确的谷歌关键字)。

I am currently using this:

我现在使用的是:

npoints = len(x_p) * len(y_p) * len(z_p)
points = np.zeros((npoints, 3))
i = 0
nz = len(z_p)
for x in x_p:
    for y in y_p:
        points[i:i+nz, 0] = x
        points[i:i+nz, 1] = y
        points[i:i+nz, 2] = z_p
        i += nz

but I feel like I am missing some clever fancy Numpy way?

但我觉得我好像错过了一些聪明的花心麻木的方式?

4 个解决方案

#1


13  

To use numpy mesh grid on the above example the following will work:

要在上面的示例中使用numpy网格,可以使用以下方法:

np.vstack(np.meshgrid(x_p,y_p,z_p)).reshape(3,-1).T

Numpy meshgrid for grids of more then two dimensions require numpy 1.7. To circumvent this and pulling the relevant data from the source code.

对于大于二维的网格,Numpy网格需要1.7。要绕过这个问题并从源代码中提取相关数据。

def ndmesh(*xi,**kwargs):
    if len(xi) < 2:
        msg = 'meshgrid() takes 2 or more arguments (%d given)' % int(len(xi) > 0)
        raise ValueError(msg)

    args = np.atleast_1d(*xi)
    ndim = len(args)
    copy_ = kwargs.get('copy', True)

    s0 = (1,) * ndim
    output = [x.reshape(s0[:i] + (-1,) + s0[i + 1::]) for i, x in enumerate(args)]

    shape = [x.size for x in output]

    # Return the full N-D matrix (not only the 1-D vector)
    if copy_:
        mult_fact = np.ones(shape, dtype=int)
        return [x * mult_fact for x in output]
    else:
        return np.broadcast_arrays(*output)

Checking the result:

检查结果:

print np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T

[[ 1.  2.  8.]
 [ 1.  2.  9.]
 [ 1.  3.  8.]
 ....
 [ 5.  3.  9.]
 [ 5.  4.  8.]
 [ 5.  4.  9.]]

For the above example:

上面的例子:

%timeit sol2()
10000 loops, best of 3: 56.1 us per loop

%timeit np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T
10000 loops, best of 3: 55.1 us per loop

For when each dimension is 100:

当每个维度为100时:

%timeit sol2()
1 loops, best of 3: 655 ms per loop
In [10]:

%timeit points = np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T
10 loops, best of 3: 21.8 ms per loop

Depending on what you want to do with the data, you can return a view:

根据你想对数据做什么,你可以返回一个视图:

%timeit np.vstack((ndmesh(x_p,y_p,z_p,copy=False))).reshape(3,-1).T
100 loops, best of 3: 8.16 ms per loop

#2


7  

For your specific example, mgrid is quite useful.:

对于您的特定示例,mgrid非常有用。

In [1]: import numpy as np
In [2]: points = np.mgrid[1:6, 2:5, 8:10]
In [3]: points.reshape(3, -1).T
Out[3]:
array([[1, 2, 8],
       [1, 2, 9],
       [1, 3, 8],
       [1, 3, 9],
       [1, 4, 8],
       [1, 4, 9],
       [2, 2, 8],
       [2, 2, 9],
       [2, 3, 8],
       [2, 3, 9],
       [2, 4, 8],
       [2, 4, 9],
       [3, 2, 8],
       [3, 2, 9],
       [3, 3, 8],
       [3, 3, 9],
       [3, 4, 8],
       [3, 4, 9],
       [4, 2, 8],
       [4, 2, 9],
       [4, 3, 8],
       [4, 3, 9],
       [4, 4, 8],
       [4, 4, 9],
       [5, 2, 8],
       [5, 2, 9],
       [5, 3, 8],
       [5, 3, 9],
       [5, 4, 8],
       [5, 4, 9]])

More generally, if you have specific arrays that you want to do this with, you'd use meshgrid instead of mgrid. However, you'll need numpy 1.7 or later for it to work in more than two dimensions.

更一般地说,如果您想要使用特定的数组,您应该使用网格而不是mgrid。但是,您需要numpy 1.7或更高版本才能在两个以上的维度上工作。

#3


3  

You can use itertools.product:

您可以使用itertools.product:

def sol1():
    points = np.array( list(product(x_p, y_p, z_p)) )

Another solution using iterators and np.take showed to be about 3X faster for your case:

另一个使用迭代器和np的解决方案。对于你的情况,take show大约快3倍:

from itertools import izip, product

def sol2():
    points = np.empty((len(x_p)*len(y_p)*len(z_p),3))

    xi,yi,zi = izip(*product( xrange(len(x_p)),
                              xrange(len(y_p)),
                              xrange(len(z_p))  ))

    points[:,0] = np.take(x_p,xi)
    points[:,1] = np.take(y_p,yi)
    points[:,2] = np.take(z_p,zi)
    return points

Timing results:

结果:时间

In [3]: timeit sol1()
10000 loops, best of 3: 126 µs per loop

In [4]: timeit sol2()
10000 loops, best of 3: 42.9 µs per loop

In [6]: timeit ops()
10000 loops, best of 3: 59 µs per loop

In [11]: timeit joekingtons() # with your permission Joe...
10000 loops, best of 3: 56.2 µs per loop

So, only my second solution and Joe Kington's solution would give you some performance gain...

所以,只有我的第二种解决方案和乔·金顿的解决方案才能给你带来性能提升……

#4


1  

For those who had to stay with numpy <1.7.x, here is a simple two-liner solution:

对于那些必须保持numpy <1.7的人。x,这里有一个简单的双线性解决方案:

g_p=np.zeros((x_p.size, y_p.size, z_p.size))
array_you_want=array(zip(*[item.flatten() for item in \
                                 [g_p+x_p[...,np.newaxis,np.newaxis],\
                                  g_p+y_p[...,np.newaxis],\
                                  g_p+z_p]]))

Very easy to expand to even higher dimenision as well. Cheers!

很容易扩展到更高的维度。干杯!

#1


13  

To use numpy mesh grid on the above example the following will work:

要在上面的示例中使用numpy网格,可以使用以下方法:

np.vstack(np.meshgrid(x_p,y_p,z_p)).reshape(3,-1).T

Numpy meshgrid for grids of more then two dimensions require numpy 1.7. To circumvent this and pulling the relevant data from the source code.

对于大于二维的网格,Numpy网格需要1.7。要绕过这个问题并从源代码中提取相关数据。

def ndmesh(*xi,**kwargs):
    if len(xi) < 2:
        msg = 'meshgrid() takes 2 or more arguments (%d given)' % int(len(xi) > 0)
        raise ValueError(msg)

    args = np.atleast_1d(*xi)
    ndim = len(args)
    copy_ = kwargs.get('copy', True)

    s0 = (1,) * ndim
    output = [x.reshape(s0[:i] + (-1,) + s0[i + 1::]) for i, x in enumerate(args)]

    shape = [x.size for x in output]

    # Return the full N-D matrix (not only the 1-D vector)
    if copy_:
        mult_fact = np.ones(shape, dtype=int)
        return [x * mult_fact for x in output]
    else:
        return np.broadcast_arrays(*output)

Checking the result:

检查结果:

print np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T

[[ 1.  2.  8.]
 [ 1.  2.  9.]
 [ 1.  3.  8.]
 ....
 [ 5.  3.  9.]
 [ 5.  4.  8.]
 [ 5.  4.  9.]]

For the above example:

上面的例子:

%timeit sol2()
10000 loops, best of 3: 56.1 us per loop

%timeit np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T
10000 loops, best of 3: 55.1 us per loop

For when each dimension is 100:

当每个维度为100时:

%timeit sol2()
1 loops, best of 3: 655 ms per loop
In [10]:

%timeit points = np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T
10 loops, best of 3: 21.8 ms per loop

Depending on what you want to do with the data, you can return a view:

根据你想对数据做什么,你可以返回一个视图:

%timeit np.vstack((ndmesh(x_p,y_p,z_p,copy=False))).reshape(3,-1).T
100 loops, best of 3: 8.16 ms per loop

#2


7  

For your specific example, mgrid is quite useful.:

对于您的特定示例,mgrid非常有用。

In [1]: import numpy as np
In [2]: points = np.mgrid[1:6, 2:5, 8:10]
In [3]: points.reshape(3, -1).T
Out[3]:
array([[1, 2, 8],
       [1, 2, 9],
       [1, 3, 8],
       [1, 3, 9],
       [1, 4, 8],
       [1, 4, 9],
       [2, 2, 8],
       [2, 2, 9],
       [2, 3, 8],
       [2, 3, 9],
       [2, 4, 8],
       [2, 4, 9],
       [3, 2, 8],
       [3, 2, 9],
       [3, 3, 8],
       [3, 3, 9],
       [3, 4, 8],
       [3, 4, 9],
       [4, 2, 8],
       [4, 2, 9],
       [4, 3, 8],
       [4, 3, 9],
       [4, 4, 8],
       [4, 4, 9],
       [5, 2, 8],
       [5, 2, 9],
       [5, 3, 8],
       [5, 3, 9],
       [5, 4, 8],
       [5, 4, 9]])

More generally, if you have specific arrays that you want to do this with, you'd use meshgrid instead of mgrid. However, you'll need numpy 1.7 or later for it to work in more than two dimensions.

更一般地说,如果您想要使用特定的数组,您应该使用网格而不是mgrid。但是,您需要numpy 1.7或更高版本才能在两个以上的维度上工作。

#3


3  

You can use itertools.product:

您可以使用itertools.product:

def sol1():
    points = np.array( list(product(x_p, y_p, z_p)) )

Another solution using iterators and np.take showed to be about 3X faster for your case:

另一个使用迭代器和np的解决方案。对于你的情况,take show大约快3倍:

from itertools import izip, product

def sol2():
    points = np.empty((len(x_p)*len(y_p)*len(z_p),3))

    xi,yi,zi = izip(*product( xrange(len(x_p)),
                              xrange(len(y_p)),
                              xrange(len(z_p))  ))

    points[:,0] = np.take(x_p,xi)
    points[:,1] = np.take(y_p,yi)
    points[:,2] = np.take(z_p,zi)
    return points

Timing results:

结果:时间

In [3]: timeit sol1()
10000 loops, best of 3: 126 µs per loop

In [4]: timeit sol2()
10000 loops, best of 3: 42.9 µs per loop

In [6]: timeit ops()
10000 loops, best of 3: 59 µs per loop

In [11]: timeit joekingtons() # with your permission Joe...
10000 loops, best of 3: 56.2 µs per loop

So, only my second solution and Joe Kington's solution would give you some performance gain...

所以,只有我的第二种解决方案和乔·金顿的解决方案才能给你带来性能提升……

#4


1  

For those who had to stay with numpy <1.7.x, here is a simple two-liner solution:

对于那些必须保持numpy <1.7的人。x,这里有一个简单的双线性解决方案:

g_p=np.zeros((x_p.size, y_p.size, z_p.size))
array_you_want=array(zip(*[item.flatten() for item in \
                                 [g_p+x_p[...,np.newaxis,np.newaxis],\
                                  g_p+y_p[...,np.newaxis],\
                                  g_p+z_p]]))

Very easy to expand to even higher dimenision as well. Cheers!

很容易扩展到更高的维度。干杯!