What is the difference between len() and sys.getsizeof() in Python?

Date: 2021-11-03 18:13:46

When I ran the code below, I got 3 and 36 as the results, respectively.

import sys

x = "abd"
print len(x)
print sys.getsizeof(x)

Can someone explain the difference between them?

1 answer

#1


44  

They are not the same thing at all.

len() queries for the number of items contained in a container. For a string that's the number of characters:

Return the length (the number of items) of an object. The argument may be a sequence (string, tuple or list) or a mapping (dictionary).
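
As a small illustration of that definition, here is a sketch in Python 3 syntax (an assumption; the question itself uses Python 2). The sample values are made up for the example. What counts as an "item" depends on the container: characters for a string, elements for a tuple or list, keys for a dictionary.

```python
# len() counts items; the meaning of "item" depends on the container type.
samples = {
    "str": "abd",                 # three characters
    "tuple": (1, 2, 3, 4),        # four elements
    "list": [],                   # no elements
    "dict": {"a": 1, "b": 2},     # two keys
}

lengths = {name: len(value) for name, value in samples.items()}
print(lengths)
```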

sys.getsizeof() on the other hand returns the memory size of the object:

Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
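
To see that the two functions answer different questions, the sketch below (Python 3 on CPython, which is an assumption) takes four objects that all have a len() of 3 and prints their sizes in bytes. The exact byte counts are implementation specific, so none are shown here; the point is only that they differ from the item count and from each other.

```python
import sys

# Four objects that all hold exactly three items.
values = ["abd", b"abd", ("a", "b", "d"), ["a", "b", "d"]]

# Pair each object's type name with its item count and its size in bytes.
report = [(type(v).__name__, len(v), sys.getsizeof(v)) for v in values]
for type_name, item_count, byte_size in report:
    print(f"{type_name:5} len={item_count} getsizeof={byte_size}")
```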

Python string objects are not simple sequences of characters, 1 byte per character.

Specifically, the sys.getsizeof() function includes the garbage collector overhead if any:

getsizeof() calls the object’s __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
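
The quoted behaviour can be observed by subtracting __sizeof__() from getsizeof(). This is a rough sketch for Python 3 on CPython (an assumption); gc_overhead is a hypothetical helper name, not part of any library. A string is not managed by the garbage collector, so the difference should be zero, while a list normally carries some extra bytes.

```python
import gc
import sys

def gc_overhead(obj):
    """Bytes that sys.getsizeof() reports beyond obj.__sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()

text = "abd"
items = ["a", "b", "d"]

# gc.is_tracked() tells whether the collector currently tracks the object.
print("str :", gc.is_tracked(text), gc_overhead(text))
print("list:", gc.is_tracked(items), gc_overhead(items))
```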

String objects do not need to be tracked (they cannot create circular references), but they do need more memory than just the bytes per character. In Python 2, the string type's __sizeof__ method returns (in C code):

Py_ssize_t res;
res = PyStringObject_SIZE + PyString_GET_SIZE(v) * Py_TYPE(v)->tp_itemsize;
return PyInt_FromSsize_t(res);

where PyStringObject_SIZE is the C struct header size for the type, PyString_GET_SIZE is basically the same as len(), and Py_TYPE(v)->tp_itemsize is the per-character size. In Python 2.7, for byte strings, the size per character is 1, but it's PyStringObject_SIZE that is confusing you; on my Mac that size is 37 bytes:

>>> sys.getsizeof('')
37
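
The same header-plus-payload arithmetic can be checked from Python. The sketch below assumes Python 3 on CPython and uses bytes objects, which play the role that Python 2 byte strings play in the C snippet. predicted_size is a hypothetical helper, and the header size is measured on the running interpreter rather than hard-coded, since it varies between builds.

```python
import sys

def predicted_size(data, header_size):
    """Header plus one byte per item, mirroring the C expression above."""
    item_size = 1  # assumption: a bytes object stores one byte per item
    return header_size + len(data) * item_size

# Fixed overhead of an empty bytes object on this particular build.
header_size = sys.getsizeof(b"")

for data in (b"", b"abd", b"hello world"):
    print(len(data), sys.getsizeof(data), predicted_size(data, header_size))
```

If the model holds on your interpreter, the last two numbers on each printed line match.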

For Python 2 unicode strings, the per-character size goes up to 2 or 4 bytes (depending on compilation options). On Python 3.3 and newer, Unicode strings take up between 1 and 4 bytes per character, depending on the contents of the string.
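
One way to observe the content-dependent width on Python 3.3 or newer is to measure how much a string grows when one more copy of a character is appended. This is a sketch assuming CPython; bytes_per_char is a hypothetical helper and the four sample characters are arbitrary picks from different code point ranges.

```python
import sys

def bytes_per_char(ch, n=16):
    """Estimate per-character cost as the growth from n to n + 1 copies."""
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

# ASCII letter, Latin-1 letter, euro sign, and an emoji outside the BMP.
sample_chars = ("a", "\u00e9", "\u20ac", "\U0001F600")
widths = {ch: bytes_per_char(ch) for ch in sample_chars}

for ch, width in widths.items():
    print(f"U+{ord(ch):04X}: {width} byte(s) per character")
```

On a CPython build I would expect the storage width to be 1, 2, or 4 bytes, chosen by the widest character in the string, so a single emoji makes every character in that string cost 4 bytes.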

