How to efficiently apply a function to a 3d array with numpy?

Time: 2021-11-05 23:29:43

I want to apply an arbitrary function to a 3d ndarray, where the function takes each 1d array along the third dimension as its argument and returns a scalar. The result should be a 2d matrix.

For example, in pseudocode:

A = [[[1,2,3],[4,5,6]],
     [[7,8,9],[10,11,12]]]
A.apply_3d_array(sum) ## or apply_3d_array(A,sum) is okay.
>> [[6,15],[24,33]]

I understand this is possible with a loop over ndarray.shape, but direct index access is inefficient, as the official documentation says. Is there a more efficient way than using a loop? The function I actually want to apply looks like this:

import math

def chromaticity(pixel):
    # cube root of the channel sum, as written in the question
    geo_mean = math.pow(sum(pixel),1/3)
    return map(lambda x: math.log(x/geo_mean),pixel )

2 Answers

#1


3  

Given the function implementation, we can vectorize it with NumPy ufuncs that operate on the entire input array A in one go, avoiding the math library functions, which don't support vectorized operation on arrays. Along the way we also bring in a very efficient vectorization tool: NumPy broadcasting. The implementation looks like this:

np.log(A/np.power(np.sum(A,2,keepdims=True),1/3))
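
To see why keepdims=True matters in that one-liner, here is a minimal sketch that checks the intermediate shapes on the 2x2x3 example array from the question. The names total and out are only illustrative and are not part of the original answer.

```python
import numpy as np

A = np.arange(1, 13).reshape(2, 2, 3)  # the 2x2x3 example array from the question

# Summing over the last axis with keepdims=True keeps a length-1 axis,
# so the result has shape (2, 2, 1) instead of (2, 2).
total = np.sum(A, axis=2, keepdims=True)
print(total.shape)   # (2, 2, 1)

# (2, 2, 3) / (2, 2, 1) broadcasts: each pixel is divided by its own scalar.
out = np.log(A / np.power(total, 1/3))
print(out.shape)     # (2, 2, 3)
```

Without keepdims=True the sum would have shape (2, 2), which does not broadcast against (2, 2, 3) in the intended per-pixel way.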

Sample run and verification

Written without the lambda construct and with NumPy functions in place of the math library functions, the function looks something like this:

def chromaticity(pixel): 
    geo_mean = np.power(np.sum(pixel),1/3) 
    return np.log(pixel/geo_mean)

Sample run with the iterative implementation:

In [67]: chromaticity(A[0,0,:])
Out[67]: array([-0.59725316,  0.09589402,  0.50135913])

In [68]: chromaticity(A[0,1,:])
Out[68]: array([ 0.48361096,  0.70675451,  0.88907607])

In [69]: chromaticity(A[1,0,:])
Out[69]: array([ 0.88655887,  1.02009026,  1.1378733 ])

In [70]: chromaticity(A[1,1,:])
Out[70]: array([ 1.13708257,  1.23239275,  1.31940413])    

Sample run with the proposed vectorized implementation:

In [72]: np.log(A/np.power(np.sum(A,2,keepdims=True),1/3))
Out[72]: 
array([[[-0.59725316,  0.09589402,  0.50135913],
        [ 0.48361096,  0.70675451,  0.88907607]],

       [[ 0.88655887,  1.02009026,  1.1378733 ],
        [ 1.13708257,  1.23239275,  1.31940413]]])

Runtime test

In [131]: A = np.random.randint(0,255,(512,512,3)) # 512x512 colored image

In [132]: def org_app(A):
     ...:     out = np.zeros(A.shape)     
     ...:     for i in range(A.shape[0]):
     ...:         for j in range(A.shape[1]):
     ...:             out[i,j] = chromaticity(A[i,j])
     ...:     return out
     ...: 

In [133]: %timeit org_app(A)
1 loop, best of 3: 5.99 s per loop

In [134]: %timeit np.apply_along_axis(chromaticity, 2, A) #@hpaulj's soln
1 loop, best of 3: 9.68 s per loop

In [135]: %timeit np.log(A/np.power(np.sum(A,2,keepdims=True),1/3))
10 loops, best of 3: 90.8 ms per loop

That's why, when vectorizing array code, you should always try to push the work into NumPy functions and operate on as many elements in one go as possible!
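
Before trusting a speedup like this, it is worth confirming that both versions compute the same thing. The sketch below does that on a small random array. The function names loop_version and vectorized_version are hypothetical, and the values are drawn starting at 1 (an assumption, unlike the timing run above) so that log(0) never occurs.

```python
import numpy as np

def chromaticity(pixel):
    # Per-pixel NumPy version from the answer above
    geo_mean = np.power(np.sum(pixel), 1/3)
    return np.log(pixel / geo_mean)

def loop_version(A):
    # Explicit double loop over the first two axes
    out = np.zeros(A.shape)
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            out[i, j] = chromaticity(A[i, j])
    return out

def vectorized_version(A):
    # Whole-array expression using broadcasting
    return np.log(A / np.power(np.sum(A, 2, keepdims=True), 1/3))

# Small random "image"; values start at 1 to avoid log(0).
rng = np.random.default_rng(0)
A = rng.integers(1, 255, size=(32, 32, 3))

same = np.allclose(loop_version(A), vectorized_version(A))
print(same)
```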

#2


1  

apply_along_axis is designed to make this task easy:

In [683]: A=np.arange(1,13).reshape(2,2,3)
In [684]: A
Out[684]: 
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])
In [685]: np.apply_along_axis(np.sum, 2, A)
Out[685]: 
array([[ 6, 15],
       [24, 33]])

It, in effect, does

for all i,j:
    out[i,j] = func( A[i,j,:])

taking care of the details. It's not faster than doing that iteration yourself, but it makes the code easier to write.
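
The "for all i,j" loop above can be written out as runnable Python. This is only a sketch of the idea, not NumPy's actual implementation; apply_along_last_axis is a hypothetical helper, and it assumes func maps a 1d array to a scalar.

```python
import numpy as np

def apply_along_last_axis(func, A):
    # Hypothetical helper: a plain-Python sketch of the loop, not NumPy's own code.
    # Assumes func takes a 1d array and returns a scalar.
    out = np.empty(A.shape[:-1])
    for idx in np.ndindex(*A.shape[:-1]):   # idx runs over every (i, j)
        out[idx] = func(A[idx])             # A[idx] is the 1d slice A[i, j, :]
    return out

A = np.arange(1, 13).reshape(2, 2, 3)
print(apply_along_last_axis(np.sum, A))
```

Because func is still called once per (i, j) from Python, this has the same per-call overhead as the explicit loop.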

Another trick is to reshape your input to 2d, perform the simpler 1d iteration, and then reshape the result:

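 A1 = A.reshape(-1, A.shape[-1])      # flatten the leading axes into one
 out = np.empty(A1.shape[0])          # one scalar result per row
 for i in range(A1.shape[0]):
     out[i] = func(A1[i,:])
 out = out.reshape(A.shape[:2])       # restore the leading 2d shape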

To do things faster, you need to dig into the guts of the function and figure out how to use compiled numpy operations on more than one dimension. In the simple case of sum, that function can already work on selected axes.
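
For the sum case, a minimal sketch of that axis-based call, compared against apply_along_axis on the example array (the names direct and via_apply are illustrative):

```python
import numpy as np

A = np.arange(1, 13).reshape(2, 2, 3)

# Reduce over the last axis directly; no Python-level loop is involved.
direct = A.sum(axis=2)
via_apply = np.apply_along_axis(np.sum, 2, A)

print(direct)
print(np.array_equal(direct, via_apply))
```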
