Why is cffi so much faster than numpy?

Time: 2021-11-02 01:32:55

I have been playing around with writing cffi modules in Python, and their speed is making me wonder whether I'm using standard Python correctly. It's making me want to switch to C completely! Truthfully, there are some great Python libraries I could never reimplement myself in C, so this is more hypothetical than anything.

This example shows the sum method being used on a numpy array, and how slow it is compared with a C function. Is there a quicker, Pythonic way of computing the sum of a numpy array?

from cffi import FFI
import numpy as np

def cast_matrix(matrix, ffi):
    # Build a C array of row pointers (double*[rows]) into the numpy buffer.
    # Assumes a C-contiguous float64 array that outlives the returned pointers.
    ap = ffi.new("double* [%d]" % (matrix.shape[0]))
    ptr = ffi.cast("double *", matrix.ctypes.data)
    for i in range(matrix.shape[0]):
        ap[i] = ptr + i*matrix.shape[1]
    return ap

ffi = FFI()
ffi.cdef("""
double sum(double**, int, int);
""")
# Compile the C function; x is the number of rows, y the number of columns.
C = ffi.verify("""
double sum(double** matrix, int x, int y){
    int i, j;
    double sum = 0.0;
    for (i=0; i<x; i++){
        for (j=0; j<y; j++){
            sum = sum + matrix[i][j];
        }
    }
    return(sum);
}
""")
m = np.ones(shape=(10,10))
print('numpy says', m.sum())

m_p = cast_matrix(m, ffi)

sm = C.sum(m_p, m.shape[0], m.shape[1])
print('cffi says', sm)

Just to show that the function works:

numpy says 100.0
cffi says 100.0

Now, if I time this simple function, I find that numpy is really slow! Am I using numpy in the correct way? Is there a faster way to calculate the sum in Python?

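import time
n = 1000000  # number of calls to time

# Time n calls of the cffi function.
t0 = time.time()
for i in range(n): C.sum(m_p, m.shape[0], m.shape[1])
t1 = time.time()

print('cffi', t1-t0)

# Time n calls of numpy's sum on the same array.
t0 = time.time()
for i in range(n): m.sum()
t1 = time.time()

print('numpy', t1-t0)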

Times (in seconds):

cffi 0.818415880203
numpy 5.61657714844
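
To put these totals in per-call terms, they can be divided by the loop count. The short sketch below uses only the two figures printed above and n = 1000000 from the timing loop; the variable names are just for illustration.

```python
# Totals reported above, in seconds, for n = 1000000 calls each.
n = 1000000
cffi_total = 0.818415880203
numpy_total = 5.61657714844

# Convert to microseconds per call.
cffi_us = cffi_total / n * 1e6
numpy_us = numpy_total / n * 1e6

diff_us = numpy_us - cffi_us    # extra time numpy spends on each call
ratio = numpy_total / cffi_total

print("cffi:  %.3f us per call" % cffi_us)
print("numpy: %.3f us per call" % numpy_us)
print("difference: %.3f us per call, ratio: %.2fx" % (diff_us, ratio))
```

So the gap is a little under 5 microseconds per call on a 10x10 array.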

1 个解决方案

#1


13  

Numpy is slower than C here for two reasons: the Python call overhead (probably similar to cffi's) and generality. Numpy is designed to handle arrays of arbitrary dimensions in many different data types, whereas your cffi example was written specifically for a 2D array of doubles. The cost was writing several lines of code instead of .sum(), six characters, to save less than 5 microseconds per call. (But of course, you already knew this.) I just want to emphasize that CPU time is cheap, much cheaper than developer time.
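
One rough way to see this fixed per-call overhead is to compare the cost per element for a tiny array and a large one. The sketch below assumes only numpy and the standard timeit module; per_call_seconds is a hypothetical helper, and the array sizes and repeat counts are arbitrary choices. Absolute numbers will vary by machine.

```python
import timeit

import numpy as np


def per_call_seconds(shape, repeats):
    """Average wall-clock time of one m.sum() call on an array of ones."""
    m = np.ones(shape)
    return timeit.timeit(m.sum, number=repeats) / repeats


small = per_call_seconds((10, 10), repeats=2000)      # 100 elements, as in the question
large = per_call_seconds((1000, 1000), repeats=200)   # 1,000,000 elements

# If a fixed per-call cost dominates the small case, its time per element
# will be much higher than for the large array.
print("10x10:     %.2e s per call, %.2e s per element" % (small, small / 100))
print("1000x1000: %.2e s per call, %.2e s per element" % (large, large / 1e6))
```

If the per-element time for the 10x10 array comes out far larger, most of that call is overhead rather than arithmetic, which is the part a hand-written C function for one specific case avoids.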

Now, if you want to stick with Numpy and get better performance, your best option is Bottleneck. It provides a few functions optimised for 1D and 2D arrays of floats and doubles, and they are blazing fast. In your case it is 16 times faster, which would bring the execution time down to about 0.35 seconds, or roughly twice as fast as cffi.
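
A minimal sketch of what that might look like follows. The answer does not say which Bottleneck function was used; bn.nansum is an assumption here (it sums while treating NaN as zero, so it matches .sum() only when the array has no NaNs). The fast_sum name and the fallback to np.nansum are illustrative, added so the snippet still runs without Bottleneck installed.

```python
import numpy as np

try:
    import bottleneck as bn   # third-party package, installed separately
    fast_sum = bn.nansum      # assumed to be the function the answer means
except ImportError:
    fast_sum = np.nansum      # fallback so the sketch still runs

m = np.ones(shape=(10, 10))

total = fast_sum(m)           # same call shape as m.sum(), NaNs ignored
print("sum:", total)
```

Any speed-up should be measured on your own machine with the same timing loop as in the question.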

For other functions that Bottleneck does not provide, you can use Cython. It lets you write C code with a more Pythonic syntax or, if you prefer, progressively convert Python into C until you are happy with the speed.
