When I ran the code below, I got 3 and 36 as the answers, respectively.
import sys
x = "abd"
print len(x)
print sys.getsizeof(x)
Can someone explain to me what the difference between them is?
1 Answer
They are not the same thing at all.
len() queries for the number of items contained in a container. For a string that's the number of characters:
Return the length (the number of items) of an object. The argument may be a sequence (string, tuple or list) or a mapping (dictionary).
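As a quick illustration of "number of items", here is a minimal sketch in Python 3 syntax (the question itself uses Python 2 print statements). The variable names are made up for the example:

```python
# len() counts items, whatever the container type.
text = "abd"
items = [10, 20, 30, 40]
mapping = {"a": 1, "b": 2}

print(len(text))     # 3 characters
print(len(items))    # 4 elements
print(len(mapping))  # 2 keys
```

None of these numbers says anything about how much memory the objects occupy.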
sys.getsizeof(), on the other hand, returns the memory size of the object:
Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
Python string objects are not simple sequences of characters, 1 byte per character.
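One way to see the fixed overhead is to subtract the size of an empty object. This is only a sketch and it makes two assumptions: it runs on CPython 3, and it uses bytes, which is the closest Python 3 relative of the Python 2 byte string in the question. The absolute numbers vary by version and platform, so none are shown here:

```python
import sys

empty = b""
data = b"abd"

# Fixed per-object overhead: the size of an empty bytes object.
overhead = sys.getsizeof(empty)

# What remains after subtracting the overhead is the payload.
payload = sys.getsizeof(data) - overhead

print(len(data), sys.getsizeof(data), overhead, payload)
```

On CPython the payload should come out equal to len(data), while the overhead is what makes the two functions disagree.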
Specifically, the sys.getsizeof() function includes the garbage collector overhead, if any:
getsizeof() calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
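The difference between the two calls can be measured directly. A rough sketch, assuming CPython 3; gc_overhead is a hypothetical helper name, not part of any library, and the exact overhead for tracked objects depends on the interpreter build:

```python
import gc
import sys

text = "abd"        # str: not tracked by the garbage collector
items = [1, 2, 3]   # list: tracked by the garbage collector

def gc_overhead(obj):
    """Bytes that sys.getsizeof() adds on top of obj.__sizeof__()."""
    return sys.getsizeof(obj) - obj.__sizeof__()

print(gc.is_tracked(text), gc_overhead(text))
print(gc.is_tracked(items), gc_overhead(items))
```

For the string the overhead should be zero, so the extra bytes in the question do not come from the garbage collector.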
String objects do not need to be tracked (they cannot create circular references), but string objects do need more memory than just the bytes per character. In Python 2, the __sizeof__ method returns (in C code):
Py_ssize_t res;
res = PyStringObject_SIZE + PyString_GET_SIZE(v) * Py_TYPE(v)->tp_itemsize;
return PyInt_FromSsize_t(res);
where PyStringObject_SIZE is the C struct header size for the type, PyString_GET_SIZE is basically the same as len(), and Py_TYPE(v)->tp_itemsize is the per-character size. In Python 2.7, for byte strings, the size per character is 1, but it's PyStringObject_SIZE that is confusing you; on my Mac that size is 37 bytes:
>>> sys.getsizeof('')
37
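The same header-plus-items arithmetic can be reproduced from Python. This is a sketch under assumptions: it uses CPython 3 bytes objects in place of Python 2 strings, and it treats the type attributes __basicsize__ and __itemsize__ as stand-ins for PyStringObject_SIZE and tp_itemsize. predicted_size is a hypothetical helper written for this example:

```python
import sys

def predicted_size(value):
    """Header size plus one item per byte, mirroring the C formula."""
    header = type(value).__basicsize__   # plays the role of PyStringObject_SIZE
    per_item = type(value).__itemsize__  # plays the role of tp_itemsize
    return header + len(value) * per_item

for sample in (b"", b"abd", b"x" * 100):
    print(len(sample), predicted_size(sample), sys.getsizeof(sample))
```

On CPython the last two columns should match for every row; the header value itself differs between Python versions and between 32-bit and 64-bit builds.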
For unicode strings the per-character size goes up to 2 or 4 (depending on compilation options). On Python 3.3 and newer, Unicode strings take up between 1 and 4 bytes per character, depending on the contents of the string.
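The content-dependent width on Python 3.3 and newer can be estimated by measuring how much a string grows when more copies of one character are added. A minimal sketch, assuming CPython 3.3 or later; bytes_per_char and the sample labels are invented for this illustration:

```python
import sys

def bytes_per_char(ch, n=100):
    """Estimate storage per character by growing a string by n copies of ch."""
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

samples = {
    "ASCII 'a'": "a",
    "Latin-1 U+00E9": "\xe9",
    "BMP U+20AC": "\u20ac",
    "astral U+1F600": "\U0001F600",
}
widths = {label: bytes_per_char(ch) for label, ch in samples.items()}
print(widths)
```

Comparing two lengths cancels out the fixed header, so only the per-character cost remains: 1 byte for the ASCII and Latin-1 samples, 2 for the BMP sample, and 4 for the astral sample.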