Why size_t when int would suffice for the size of an array?

Time: 2021-08-13 17:07:27

The C standard guarantees that an int is able to store every possible array size. At least, that's what I understand from reading §6.5.2.1, subsection 1 (Array subscripting constraints):

One of the expressions shall have type "pointer to object type", the other expression shall have integer type, and the result has type "type".

Since we shall use ints as array subscripts, why are we supposed to use size_t to determine the size of an array?

Why does strlen() return size_t when int would suffice?

5 answers

#1


25  

The term "integer type" doesn't mean int. For example, char and short are integer types too.

Just because you can use an int to subscript an array doesn't necessarily mean that it can reach all possible array elements.

More specifically about size_t vs. int: one example would be platforms where int is a 16-bit type and size_t is a 32-bit type (or the more common case on today's 64-bit platforms, where int is 32 bits and size_t is 64 bits).
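
To make the width difference concrete, here is a minimal sketch of a checked conversion from size_t to int. The helper names int_covers_size_t and size_to_int_checked are made up for illustration; they are not standard functions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if int can represent every size_t value,
   0 otherwise. Where size_t is wider than int this returns 0. */
static int int_covers_size_t(void)
{
    return (uintmax_t)INT_MAX >= (uintmax_t)SIZE_MAX;
}

/* Hypothetical helper: stores n in *out and returns 1 only when the
   conversion to int is lossless; returns 0 and leaves *out alone otherwise. */
static int size_to_int_checked(size_t n, int *out)
{
    assert(out != NULL);
    if ((uintmax_t)n > (uintmax_t)INT_MAX)
        return 0;
    *out = (int)n;
    return 1;
}
```

A caller holding a size_t (from sizeof or strlen, say) can use the second helper before handing the value to an API that takes int, instead of casting blindly.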

#2


6  

An integer type is not necessarily "int". "long long" is an integer type too, as is "size_t".

Arrays can be larger than 2GB. This property is quite handy for those who write memory-hungry programs, e.g. a DBMS with big buffer pools, application servers with big memory caches etc. Arrays bigger than 2GB/4GB are the whole point of 64-bit computing :)

Returning size_t from strlen() is at least consistent with how the C standard handles arrays. Whether it makes practical sense, or whether anybody has ever seen strings that large, is another question.

#3


2  

Firstly, what you quoted from the standard does not refer to type int specifically. And no, int is not guaranteed to be sufficient to store the size of any object (including arrays) in C.

Secondly, the C language does not really have array subscripting as a separate mechanism. Subscripting is defined in terms of pointer arithmetic: E1[E2] is identical to (*((E1)+(E2))). The type the language associates with pointer arithmetic is ptrdiff_t, the type of the difference between two pointers. Not size_t, not int, but ptrdiff_t. It is a signed type, BTW, meaning that the value can be negative.
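
A small sketch of subscripting as pointer arithmetic with a signed ptrdiff_t offset. The function names element_at and distance are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Reads the element `offset` positions away from p. Because p[offset]
   means *(p + offset), a negative offset is fine as long as the result
   still points into the same array. */
static int element_at(const int *p, ptrdiff_t offset)
{
    assert(p != NULL);
    return p[offset];
}

/* Distance in elements between two pointers into the same array.
   Subtracting pointers yields a ptrdiff_t, which may be negative. */
static ptrdiff_t distance(const int *from, const int *to)
{
    return to - from;
}
```

Given int a[5] = {10, 20, 30, 40, 50} and a pointer mid = &a[2], element_at(mid, -2) reads a[0] and distance(mid, a) is -2, which an unsigned type could not express.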

Thirdly, the purpose of size_t is to store the size of any object in the program (i.e. to hold the result of sizeof). It is not primarily intended to be used as an array index. It just happens to work as one, since it is guaranteed to be large enough to index any array. However, from an abstract point of view, an "array" is one specific kind of "container", and there are other kinds of containers out there (list-based ones, tree-based ones and so on). In the general case size_t is not sufficient to store the size of any container, which in the general case makes it a questionable choice for array indexing as well. (strlen, on the other hand, is a function that works with arrays specifically, which makes size_t appropriate there.)
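
As a rough illustration of size_t holding a sizeof result and then doubling as an index, consider the sketch below. ARRAY_LEN and sum are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macro: element count of a true array (not a pointer).
   Both sizeof results have type size_t, and so does the quotient. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Sums `count` ints. The index i is a size_t, so it can reach every
   element of any array the implementation is able to create. */
static long sum(const int *data, size_t count)
{
    long total = 0;
    assert(data != NULL || count == 0);
    for (size_t i = 0; i < count; ++i)
        total += data[i];
    return total;
}
```

For int values[] = {1, 2, 3, 4}, ARRAY_LEN(values) is 4 and sum(values, ARRAY_LEN(values)) is 10.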

#4


0  

When the C Standard was written, it was common for machines to have a 16-bit "int" type, and be incapable of handling any single object larger than 65535 bytes, but nonetheless be capable of handling objects larger than 32767 bytes. Since arithmetic on an unsigned int would be large enough to handle the largest size of such objects, but arithmetic on signed int would not, size_t was defined to be unsigned so as to accommodate such objects without having to use "long" computations.

On machines where the maximum allowable object size is between INT_MAX and UINT_MAX, the difference between pointers to the start and end of such an object may be too large to fit in "int". While the Standard doesn't impose any requirements for how implementations should handle that, a common approach is to define integer and pointer wrap-around behavior such that if S and E are pointers to the start and end of a char[49152], then even though E-S would exceed INT_MAX, it will yield a value which, when added to S, will yield E.
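
The wrap-around described above can be modelled on any machine by treating uint16_t values as 16-bit addresses. This is only a simulation under that assumption, not real pointer arithmetic, and all three function names are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterprets a 16-bit pattern as a two's-complement signed value. */
static int32_t as_signed16(uint16_t bits)
{
    return bits >= 0x8000u ? (int32_t)bits - 0x10000 : (int32_t)bits;
}

/* "E - S" on the modelled machine: the true distance is reduced
   modulo 65536 and then read as a signed 16-bit number. */
static int32_t wrapped_difference(uint16_t start, uint16_t end)
{
    return as_signed16((uint16_t)(end - start));
}

/* "S + offset" on the modelled machine: addition modulo 65536. */
static uint16_t add_offset(uint16_t base, int32_t offset)
{
    assert(offset >= -32768 && offset <= 32767);
    return (uint16_t)((uint32_t)base + (uint32_t)offset);
}
```

With S = 0x1000 and E = 0xD000 (49152 bytes apart), wrapped_difference gives -16384 rather than 49152, yet add_offset(S, -16384) still lands on 0xD000, which is the behavior the paragraph describes.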

Nowadays, there's seldom any real advantage to the fact that size_t is an unsigned type (since code which needs objects larger than 2GB would often need to use 64-bit pointers for other reasons) and it causes many kinds of comparisons involving object sizes to behave counter-intuitively, but the fact that sizeof expressions yield an unsigned type is sufficiently well entrenched that it's unlikely ever to change.
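
One example of the counter-intuitive behavior, sketched with two hypothetical helpers (naive_diff and is_shorter): subtracting sizes never produces a negative number, so a test like "difference < 0" can never be true.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX */
#include <string.h>

/* Difference of two string lengths as a size_t. When a is shorter
   than b the result wraps around to a huge value near SIZE_MAX
   instead of going negative. */
static size_t naive_diff(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strlen(a) - strlen(b);
}

/* Safer: compare the two sizes directly and never subtract. */
static int is_shorter(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strlen(a) < strlen(b);
}
```

naive_diff("ab", "abcd") is SIZE_MAX - 1, not -2, while is_shorter("ab", "abcd") answers the intended question correctly.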

#5


-3  

size_t is a typedef for an unsigned integer type (such as unsigned int or unsigned long).

On some 64-bit platforms, int can be 32 bits wide while size_t is 64 bits wide.

It is the standard type for expressing sizes.
