有问题或者指导欢迎致信ms08.shiroh@gmail.com来交流。
问题
平时经常使用到len函数来求一个List或者其他的长度,但是没怎么思考过,len函数的时间开销是多少。
虽然感觉上不会用从头到尾数一遍这种很弱的方法,而是会保存一个size变量,但是一切还是要用事实来说话
编程验证
首先很简单粗暴地使用编程来看看时间花销,分别验证不同长度的List,调用len函数的时间开销
A = [0 for i in range(1000)] B = [0 for i in range(100000)] C = [0 for i in range(10000000)] import time start = time.time() length = len(A) end = time.time() print length, 'time cost', end-start start = time.time() length = len(B) end = time.time() print length, 'time cost', end-start start = time.time() length = len(C) end = time.time() print length, 'time cost', end-start
结果如下:
1000 time cost 9.53674316406e-07 100000 time cost 9.53674316406e-07 10000000 time cost 0.0
可以看出的确符合我们的期望,在长度相差很大的情况下时间开销相似,第三种情况下长度最长却时间短到测不出来
源码分析
在Python目录下的include文件夹中找到了如下的文件listobject.h,其中看到一段代码
PyAPI_FUNC(PyObject *) PyList_New(Py_ssize_t size); PyAPI_FUNC(Py_ssize_t) PyList_Size(PyObject *); PyAPI_FUNC(PyObject *) PyList_GetItem(PyObject *, Py_ssize_t); PyAPI_FUNC(int) PyList_SetItem(PyObject *, Py_ssize_t, PyObject *); PyAPI_FUNC(int) PyList_Insert(PyObject *, Py_ssize_t, PyObject *); PyAPI_FUNC(int) PyList_Append(PyObject *, PyObject *);
看出PyList_Size对应的就是len函数。而在object.h中看到一段代码
/* PyObject_VAR_HEAD defines the initial segment of all variable-size * container objects. These end with a declaration of an array with 1 * element, but enough space is malloc'ed so that the array actually * has room for ob_size elements. Note that ob_size is an element count, * not necessarily a byte count. */ #define PyObject_VAR_HEAD \ PyObject_HEAD \ Py_ssize_t ob_size; /* Number of items in variable part */
所以的确是保存了一个变量记录size,当我们调用len函数时候直接找到这个size就行了,所以无论多长的List,调用len函数花费的时间基本相同,可以认为len函数时间开销为O(1)