Suppose I have three arbitrary 1D arrays, for example:
x_p = np.array((1.0, 2.0, 3.0, 4.0, 5.0))
y_p = np.array((2.0, 3.0, 4.0))
z_p = np.array((8.0, 9.0))
These three arrays represent sampling intervals in a 3D grid, and I want to construct a 1D array of three-dimensional vectors for all intersections, something like:
points = np.array([[1.0, 2.0, 8.0],
                   [1.0, 2.0, 9.0],
                   [1.0, 3.0, 8.0],
                   ...
                   [5.0, 4.0, 9.0]])
Order doesn't actually matter for this. The obvious way to generate them:
npoints = len(x_p) * len(y_p) * len(z_p)
points = np.zeros((npoints, 3))
i = 0
for x in x_p:
    for y in y_p:
        for z in z_p:
            points[i, :] = (x, y, z)
            i += 1
So the question is... is there a faster way? I have looked but not found one (possibly I just failed to find the right Google keywords).
I am currently using this:
npoints = len(x_p) * len(y_p) * len(z_p)
points = np.zeros((npoints, 3))
i = 0
nz = len(z_p)
for x in x_p:
    for y in y_p:
        points[i:i+nz, 0] = x
        points[i:i+nz, 1] = y
        points[i:i+nz, 2] = z_p
        i += nz
but I feel like I am missing some clever, fancy NumPy way.
4 Solutions
#1
13
To use NumPy's meshgrid on the above example, the following will work:
np.vstack(np.meshgrid(x_p,y_p,z_p)).reshape(3,-1).T
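One detail worth knowing: np.meshgrid defaults to indexing='xy', which swaps the first two axes, so the rows come out in a different order than the nested loops in the question. That is fine here, since order does not matter, but if you do want loop order, passing indexing='ij' should give it. A minimal sketch, assuming NumPy 1.7 or later (the names points_xy and points_ij are only for illustration):

```python
import numpy as np

x_p = np.array((1.0, 2.0, 3.0, 4.0, 5.0))
y_p = np.array((2.0, 3.0, 4.0))
z_p = np.array((8.0, 9.0))

# Default indexing='xy': the same set of points, but y varies slowest.
points_xy = np.vstack(np.meshgrid(x_p, y_p, z_p)).reshape(3, -1).T

# indexing='ij': x varies slowest and z fastest, like the nested loops.
points_ij = np.vstack(
    np.meshgrid(x_p, y_p, z_p, indexing='ij')).reshape(3, -1).T

print(points_xy[:3])
print(points_ij[:3])
```

Both arrays contain the same 30 points; only the row order differs.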
NumPy's meshgrid requires NumPy 1.7 or later for grids of more than two dimensions. To work around this on older versions, you can pull the relevant code from the NumPy source:
def ndmesh(*xi, **kwargs):
    if len(xi) < 2:
        msg = 'meshgrid() takes 2 or more arguments (%d given)' % int(len(xi) > 0)
        raise ValueError(msg)
    args = np.atleast_1d(*xi)
    ndim = len(args)
    copy_ = kwargs.get('copy', True)
    s0 = (1,) * ndim
    output = [x.reshape(s0[:i] + (-1,) + s0[i + 1::]) for i, x in enumerate(args)]
    shape = [x.size for x in output]
    # Return the full N-D matrix (not only the 1-D vector)
    if copy_:
        mult_fact = np.ones(shape, dtype=int)
        return [x * mult_fact for x in output]
    else:
        return np.broadcast_arrays(*output)
Checking the result:
print(np.vstack(ndmesh(x_p, y_p, z_p)).reshape(3, -1).T)
[[ 1. 2. 8.]
[ 1. 2. 9.]
[ 1. 3. 8.]
....
[ 5. 3. 9.]
[ 5. 4. 8.]
[ 5. 4. 9.]]
For the above example:
%timeit sol2()
10000 loops, best of 3: 56.1 us per loop
%timeit np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T
10000 loops, best of 3: 55.1 us per loop
When each dimension has 100 elements:
%timeit sol2()
1 loops, best of 3: 655 ms per loop
%timeit points = np.vstack((ndmesh(x_p,y_p,z_p))).reshape(3,-1).T
10 loops, best of 3: 21.8 ms per loop
Depending on what you want to do with the data, you can return a view:
%timeit np.vstack((ndmesh(x_p,y_p,z_p,copy=False))).reshape(3,-1).T
100 loops, best of 3: 8.16 ms per loop
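A note on what "view" means here: np.broadcast_arrays returns arrays that share memory with the inputs, but np.vstack always allocates a new array, so the final points array is still a copy. The saving presumably comes from skipping the multiplication by mult_fact. A small sketch to check this, using plain NumPy calls rather than ndmesh (variable names are illustrative, and np.shares_memory needs a reasonably recent NumPy):

```python
import numpy as np

x_p = np.array((1.0, 2.0, 3.0, 4.0, 5.0))
y_p = np.array((2.0, 3.0, 4.0))
z_p = np.array((8.0, 9.0))

# Reshape each axis so the three arrays broadcast against each other.
gx, gy, gz = np.broadcast_arrays(x_p[:, None, None],
                                 y_p[None, :, None],
                                 z_p[None, None, :])

# The broadcast grids are views: no per-point data has been copied yet.
print(gx.shape, np.shares_memory(gx, x_p))

# Stacking materialises the data, so the result no longer aliases x_p.
points = np.vstack((gx, gy, gz)).reshape(3, -1).T
print(points.shape, np.shares_memory(points, x_p))
```

If you only need to read the coordinates, working with gx, gy and gz directly avoids building the (npoints, 3) array at all.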
#2
7
For your specific example, mgrid is quite useful:
In [1]: import numpy as np
In [2]: points = np.mgrid[1:6, 2:5, 8:10]
In [3]: points.reshape(3, -1).T
Out[3]:
array([[1, 2, 8],
[1, 2, 9],
[1, 3, 8],
[1, 3, 9],
[1, 4, 8],
[1, 4, 9],
[2, 2, 8],
[2, 2, 9],
[2, 3, 8],
[2, 3, 9],
[2, 4, 8],
[2, 4, 9],
[3, 2, 8],
[3, 2, 9],
[3, 3, 8],
[3, 3, 9],
[3, 4, 8],
[3, 4, 9],
[4, 2, 8],
[4, 2, 9],
[4, 3, 8],
[4, 3, 9],
[4, 4, 8],
[4, 4, 9],
[5, 2, 8],
[5, 2, 9],
[5, 3, 8],
[5, 3, 9],
[5, 4, 8],
[5, 4, 9]])
More generally, if you have specific arrays that you want to do this with, you'd use meshgrid instead of mgrid. However, you'll need numpy 1.7 or later for it to work in more than two dimensions.
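If meshgrid is not available for three or more dimensions, one possible workaround is to let mgrid generate integer indices and use them to index the arbitrary arrays. This is only a sketch of that idea, and the names ix, iy and iz are illustrative:

```python
import numpy as np

x_p = np.array((1.0, 2.0, 3.0, 4.0, 5.0))
y_p = np.array((2.0, 3.0, 4.0))
z_p = np.array((8.0, 9.0))

# Integer index grids, one per axis, each of shape (5, 3, 2).
ix, iy, iz = np.mgrid[0:len(x_p), 0:len(y_p), 0:len(z_p)]

# Look up the actual sample values and put one coordinate in each column.
points = np.column_stack((x_p[ix.ravel()], y_p[iy.ravel()], z_p[iz.ravel()]))
print(points.shape)
```

Because the lookup goes through indices, the input arrays do not need to be evenly spaced.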
#3
3
You can use itertools.product:
from itertools import product
def sol1():
    points = np.array(list(product(x_p, y_p, z_p)))
    return points
Another solution, using iterators and np.take, turned out to be about 3x faster for your case:
from itertools import product
def sol2():
    points = np.empty((len(x_p)*len(y_p)*len(z_p), 3))
    # Build the index combinations, then split them into one tuple per axis
    xi, yi, zi = zip(*product(range(len(x_p)),
                              range(len(y_p)),
                              range(len(z_p))))
    points[:, 0] = np.take(x_p, xi)
    points[:, 1] = np.take(y_p, yi)
    points[:, 2] = np.take(z_p, zi)
    return points
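The same index bookkeeping can also be done without any Python-level iteration, using np.repeat and np.tile. This is an alternative sketch rather than the method above, and the function name sol_repeat_tile is made up for illustration:

```python
import numpy as np

x_p = np.array((1.0, 2.0, 3.0, 4.0, 5.0))
y_p = np.array((2.0, 3.0, 4.0))
z_p = np.array((8.0, 9.0))

def sol_repeat_tile(x_p, y_p, z_p):
    nx, ny, nz = len(x_p), len(y_p), len(z_p)
    points = np.empty((nx * ny * nz, 3))
    # x changes slowest: each value is held for ny * nz consecutive rows.
    points[:, 0] = np.repeat(x_p, ny * nz)
    # y: each value is held for nz rows, and that pattern repeats nx times.
    points[:, 1] = np.tile(np.repeat(y_p, nz), nx)
    # z changes fastest: the whole array repeats nx * ny times.
    points[:, 2] = np.tile(z_p, nx * ny)
    return points

points = sol_repeat_tile(x_p, y_p, z_p)
print(points[:3])
```

The row order matches the nested loops in the question. I have not timed this against the other solutions.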
Timing results:
In [3]: timeit sol1()
10000 loops, best of 3: 126 µs per loop
In [4]: timeit sol2()
10000 loops, best of 3: 42.9 µs per loop
In [6]: timeit ops()
10000 loops, best of 3: 59 µs per loop
In [11]: timeit joekingtons() # with your permission Joe...
10000 loops, best of 3: 56.2 µs per loop
So, only my second solution and Joe Kington's solution would give you some performance gain...
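If you want to repeat this kind of comparison outside IPython, the standard timeit module can be used. The sketch below is only an illustration: the function names with_product and with_meshgrid are made up, and the numbers will depend on your machine and NumPy version, so do not expect them to match the timings above.

```python
import timeit
from itertools import product

import numpy as np

x_p = np.array((1.0, 2.0, 3.0, 4.0, 5.0))
y_p = np.array((2.0, 3.0, 4.0))
z_p = np.array((8.0, 9.0))

def with_product():
    return np.array(list(product(x_p, y_p, z_p)))

def with_meshgrid():
    grids = np.meshgrid(x_p, y_p, z_p, indexing='ij')
    return np.vstack(grids).reshape(3, -1).T

n = 1000
for func in (with_product, with_meshgrid):
    # Take the best of 3 runs, each calling the function n times.
    best = min(timeit.repeat(func, number=n, repeat=3)) / n
    print('%s: %.1f us per loop' % (func.__name__, best * 1e6))
```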
#4
1
For those who have to stay with NumPy older than 1.7, here is a simple two-liner solution:
g_p = np.zeros((x_p.size, y_p.size, z_p.size))
array_you_want = np.array(list(zip(*[item.flatten() for item in
    [g_p + x_p[..., np.newaxis, np.newaxis],
     g_p + y_p[..., np.newaxis],
     g_p + z_p]])))
It is very easy to extend this to even higher dimensions as well. Cheers!
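To illustrate the extension to higher dimensions, here is a hedged sketch that builds the same kind of broadcast sums for any number of 1D arrays. The helper name grid_points is hypothetical, and np.column_stack is used in place of zip to assemble the rows:

```python
import numpy as np

def grid_points(*arrays):
    """Return all combinations of the 1D input arrays, one per row."""
    arrays = [np.asarray(a, dtype=float) for a in arrays]
    ndim = len(arrays)
    shape = tuple(a.size for a in arrays)
    columns = []
    for i, a in enumerate(arrays):
        # Give the i-th array length 1 on every axis except its own.
        axis_shape = [1] * ndim
        axis_shape[i] = a.size
        # Adding to a zero array broadcasts it to the full grid shape.
        full = np.zeros(shape) + a.reshape(axis_shape)
        columns.append(full.ravel())
    return np.column_stack(columns)

points = grid_points([1.0, 2.0], [3.0, 4.0, 5.0], [6.0], [7.0, 8.0])
print(points.shape)  # 2 * 3 * 1 * 2 = 12 rows, 4 columns
```

Note that this allocates one full-size grid per input array, so memory use grows with the number of dimensions.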