Why does an ordinary loop run slower in Python than in C++, and how can it be optimized? [duplicate]

Time: 2022-07-04 21:25:14

Running a nearly empty for loop in Python and in C++ (shown below) gives very different speeds: the Python version is more than a hundred times slower.

Python:

a = 0
for i in xrange(large_const):
    a += 1

C++:

int a = 0;
for (int i = 0; i < large_const; i++)
    a += 1;

Also, what can I do to speed up the Python version?

(Addition: the first version of this question used a bad example. I did not mean a = 1, which a C/C++ compiler could optimize away; my point is that the loop itself consumes a lot of time, so a += 1 is the better example. By "how to optimize" I mean: if the loop body is as simple as a += 1, how can it run at a speed similar to C/C++? In practice I use NumPy, so I can't use PyPy (for now). Are there general methods for making loops much faster, in the way that a generator helps when building a list?)

4 Answers

#1


11  

A smart C compiler can probably optimize your loop away by recognizing that, at the end, a will always have the same value (1 in the original version of the question). Python can't do that, because when iterating over xrange it has to call __next__ (spelled next in Python 2) on the xrange iterator until it raises StopIteration. Python can't know whether __next__ has side effects until it calls it, so there is no way to optimize the loop away. The take-away from this paragraph is that it is MUCH HARDER to write an optimizing Python "compiler" than an optimizing C compiler: Python is such a dynamic language that the compiler would need to know how each object will behave at run time. In C that is much easier, because the type of every object is known ahead of time.
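
To make the iterator protocol described above concrete, here is a minimal sketch of what a for loop does behind the scenes. It assumes Python 3 naming (range and __next__ rather than xrange and next), and the two function names are made up for illustration.

```python
# A sketch of what `for _ in range(n): a += 1` does under the iterator protocol.
# Assumes Python 3 (range/__next__); Python 2 would use xrange/next.

def count_with_for(n):
    a = 0
    for _ in range(n):
        a += 1
    return a


def count_with_protocol(n):
    a = 0
    it = iter(range(n))      # the for statement calls iter() once
    while True:
        try:
            next(it)         # then next(), i.e. __next__, on every pass
        except StopIteration:
            break            # the loop ends when the iterator is exhausted
        a += 1
    return a


print(count_with_for(1000), count_with_protocol(1000))
```

Both functions return the same count; the second one just spells out the calls that the interpreter has to make on every iteration and cannot skip.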

Of course, compiler aside, Python needs to do a lot more work. In C, you're working with base types using operations that map directly to hardware instructions. In Python, the interpreter executes the bytecode one instruction at a time in software, which clearly takes longer than running machine instructions directly. The data model (e.g. calling __next__ over and over) can also lead to a lot of function calls that the C version doesn't need. Python does all this in exchange for much more flexibility than a compiled language can offer.
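
One way to see how much work the interpreter does per iteration is to look at the bytecode with the standard dis module. This is only a sketch, assuming Python 3 and a hypothetical function named loop; the exact instructions differ between interpreter versions.

```python
import dis


def loop(n):
    a = 0
    for _ in range(n):
        a += 1
    return a


# Print the bytecode the interpreter generated for the function.
dis.dis(loop)

# Collect the instruction names; the exact set varies between versions.
opnames = [ins.opname for ins in dis.get_instructions(loop)]
print(len(opnames), "bytecode instructions; FOR_ITER present:", "FOR_ITER" in opnames)
```

Each pass through the loop executes several of these instructions in the interpreter's dispatch loop, whereas the compiled C loop body can come down to a handful of machine instructions.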

The typical way to speed up Python code is to use libraries or built-in functions that provide a high-level interface to low-level compiled code. SciPy and NumPy are excellent examples of this kind of library. Other options are PyPy, which includes a JIT compiler (you probably won't reach native speeds, but it will probably beat CPython, the most common implementation), or writing extensions in C/Fortran using the CPython C API, Cython or f2py for performance-critical sections of code.
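
As a small illustration of pushing a loop into compiled code, the sketch below compares an explicit loop with the built-in sum, which iterates in C in CPython. The function names and the value of N are hypothetical, and the timings will vary by machine and interpreter, so no particular ratio is claimed.

```python
import timeit

N = 200_000  # hypothetical loop size, chosen only for illustration


def python_loop(n):
    # Every iteration runs through the bytecode interpreter.
    total = 0
    for i in range(n):
        total += i
    return total


def builtin_sum(n):
    # The iteration happens inside the built-in, not in Python bytecode.
    return sum(range(n))


assert python_loop(N) == builtin_sum(N)

t_loop = timeit.timeit(lambda: python_loop(N), number=5)
t_sum = timeit.timeit(lambda: builtin_sum(N), number=5)
print(f"explicit loop: {t_loop:.4f} s, built-in sum: {t_sum:.4f} s")
```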

#2


5  

Simply because Python is a higher-level language and has to do more work on every iteration (acquiring locks, resolving variables, etc.).

“How to optimise” is a very vague question. There is no “general” way to optimise an arbitrary Python program (the developers of Python have already done what can be done in general). Your particular example can be optimised this way:

a = 1

That's what any C compiler will do, by the way.

If your program works with numeric data, then using NumPy and its vectorised routines often gives you a great performance boost, because the loops run in compiled C rather than in the Python interpreter, avoiding the interpreter lock and the rest of the per-iteration overhead.
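
A minimal sketch of the difference, assuming NumPy is installed; the array size and the function names are hypothetical. The first function pays the interpreter overhead once per element, while the second makes a single call and lets NumPy loop in compiled code.

```python
import numpy as np

N = 100_000  # hypothetical array size
values = np.arange(N, dtype=np.int64)


def add_one_loop(arr):
    # Python-level loop: one interpreter iteration per element.
    out = np.empty_like(arr)
    for i in range(arr.shape[0]):
        out[i] = arr[i] + 1
    return out


def add_one_vectorised(arr):
    # Vectorised: one call, the element-wise loop runs inside NumPy.
    return arr + 1


small = values[:1000]
print(np.array_equal(add_one_loop(small), add_one_vectorised(small)))
```

The two functions produce the same array; timing them (for example with timeit) on your own data is the reliable way to see how large the gap is.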

#3


0  

As code gets more abstract, its speed tends to go down. The fastest code is usually hand-written assembly.

See this question: Why are Python Programs often slower than the Equivalent Program Written in C or C++?

#4


-1  

Python is (usually) an interpreted language, meaning that the script has to be read at run time, compiled into bytecode at that point, and then executed by the interpreter one bytecode instruction at a time.

C is (usually) a compiled language, so by the time you're running it you're working with pure machine code.

Python will never be as fast as C, for that reason.

Edit: In fact, Python compiles the source into bytecode (not C code) at run time; that's why you get those .pyc files.
