In several C++ examples I see the type size_t used where I would have used a plain int. What's the difference, and why would size_t be better?
5 answers
#1
120
From the friendly Wikipedia:
The stdlib.h and stddef.h header files define a datatype called size_t which is used to represent the size of an object. Library functions that take sizes expect them to be of type size_t, and the sizeof operator evaluates to size_t.
The actual type of size_t is platform-dependent; a common mistake is to assume size_t is the same as unsigned int, which can lead to programming errors, particularly as 64-bit architectures become more prevalent.
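To make the quoted passage concrete, here is a minimal C++ sketch, not taken from the original answer. It checks at compile time that sizeof yields a size_t and that size_t is unsigned. The helper name size_t_is_wider_than_unsigned_int is made up for this illustration; it reports whether the "size_t is just unsigned int" assumption would lose range on the platform being compiled for.

```cpp
#include <cstddef>      // std::size_t
#include <type_traits>  // std::is_same, std::is_unsigned

// The sizeof operator always yields a std::size_t, whatever type that is here.
static_assert(std::is_same<decltype(sizeof(int)), std::size_t>::value,
              "sizeof yields std::size_t");

// size_t is required to be an unsigned integer type.
static_assert(std::is_unsigned<std::size_t>::value,
              "std::size_t is unsigned");

// Hypothetical helper (name invented for this example): true on platforms,
// typically 64-bit ones, where size_t can hold values unsigned int cannot.
constexpr bool size_t_is_wider_than_unsigned_int() {
    return sizeof(std::size_t) > sizeof(unsigned int);
}
```

On a typical 64-bit desktop target the helper would be expected to return true, but that is a property of the platform, not a guarantee of the language.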
Also, check Why size_t matters
#2
23
size_t is the type used to represent sizes (as its name implies). It is platform-dependent (and potentially even implementation-dependent), and should be used only for this purpose. Because it represents a size, size_t is unsigned. Many standard library functions, including malloc and various string functions, use size_t as a datatype, and the sizeof operator yields a size_t.
An int is signed by default, and even though its size is also platform-dependent, it is 32 bits on most modern machines (size_t is 64 bits on 64-bit architectures, but int typically remains 32 bits there).
To summarize: use size_t to represent the size of an object, and int (or long) in other cases.
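A short sketch of that rule of thumb, assuming nothing beyond the standard library. The names element_count, sum and sample are invented for this example: sizes and indices are size_t, while the values being added stay in ordinary signed types.

```cpp
#include <cstddef>      // std::size_t
#include <cstring>      // std::strlen
#include <type_traits>  // std::is_same

// Standard string functions report lengths as std::size_t.
static_assert(std::is_same<decltype(std::strlen("")), std::size_t>::value,
              "std::strlen returns std::size_t");

// Hypothetical helper: number of elements in a built-in array, as a size_t.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N]) {
    return N;
}

// Hypothetical helper: the index is a size_t because it counts elements,
// while the running total is a plain signed integer because it is a value.
inline long sum(const int* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

// Sample data used only for illustration.
constexpr int sample[] = {10, 20, 30};
constexpr std::size_t sample_bytes = sizeof(sample);  // size in bytes, a size_t
```

Calling sum(sample, element_count(sample)) would add the three sample values without any signed/unsigned mixing in the loop condition.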
#3
7
It's because size_t need not be the same type as an int: it can be whatever unsigned integer type the implementation chooses (though not a struct, since the standard requires an unsigned integer type). The idea is that it decouples its job from the underlying type.
#4
3
The size_t type is defined as the unsigned integer type of the result of the sizeof operator. In the real world, you will often see int defined as 32 bits (for backward compatibility) but size_t defined as 64 bits (so you can declare arrays and structures more than 4 GiB in size) on 64-bit platforms. If long int is also 64 bits, this is called the LP64 convention; if long int is 32 bits but long long int and pointers are 64 bits, that’s LLP64. You also might get the reverse: a program that uses 64-bit instructions for speed, but 32-bit pointers to save memory. Also, int is signed and size_t is unsigned.
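The conventions named above can be told apart by looking at type widths. Here is a rough sketch, assuming 8-bit bytes. The DataModel enum and the classify helper are invented for this example, and ILP32 (everything 32 bits) is added only for contrast; it is not mentioned in the answer.

```cpp
#include <cstddef>  // std::size_t

// Hypothetical labels for the data models discussed above.
enum class DataModel { ILP32, LP64, LLP64, Other };

// Hypothetical helper: classify a data model from the widths, in bytes,
// of int, long and pointers. Assumes 8-bit bytes.
constexpr DataModel classify(std::size_t int_bytes, std::size_t long_bytes,
                             std::size_t pointer_bytes) {
    return (int_bytes == 4 && long_bytes == 4 && pointer_bytes == 4) ? DataModel::ILP32
         : (int_bytes == 4 && long_bytes == 8 && pointer_bytes == 8) ? DataModel::LP64
         : (int_bytes == 4 && long_bytes == 4 && pointer_bytes == 8) ? DataModel::LLP64
         : DataModel::Other;
}

// The model of whatever platform this translation unit is compiled for.
constexpr DataModel this_platform =
    classify(sizeof(int), sizeof(long), sizeof(void*));
```

Because only type widths are inspected, the "reverse" case (64-bit instructions with 32-bit pointers) would be reported as ILP32 by this sketch.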
There were historically a number of other platforms where addresses were wider or narrower than the native size of int. In fact, in the ’70s and early ’80s, this was more common than not: all the popular 8-bit microcomputers had 8-bit registers and 16-bit addresses, and the transition between 16 and 32 bits also produced many machines that had addresses wider than their registers. I occasionally still see questions here about Borland Turbo C for MS-DOS, whose Huge memory mode had 20-bit addresses stored in 32 bits on a 16-bit CPU (but which could support the 32-bit instruction set of the 80386); the Motorola 68000 had a 16-bit ALU with 32-bit registers and addresses; there were IBM mainframes with 15-bit, 24-bit or 31-bit addresses. You also still see different ALU and address-bus sizes in embedded systems.
Any time int is smaller than size_t and you try to store the size or offset of a very large file or object in an unsigned int, there is the possibility that it could overflow and cause a bug. With an int, there is also the possibility of getting a negative number. If an int or unsigned int is wider, the program will run correctly but waste memory.
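One way to guard against that kind of silent truncation is to check the value before narrowing it. A minimal sketch follows; the helper names fits_in_unsigned_int and fits_in_int are invented for this example, and it assumes size_t is at least as wide as int, as it is on mainstream platforms.

```cpp
#include <climits>  // INT_MAX, UINT_MAX
#include <cstddef>  // std::size_t

// Hypothetical helper: true if n can be stored in an unsigned int
// without wrapping around.
constexpr bool fits_in_unsigned_int(std::size_t n) {
    return n <= UINT_MAX;
}

// Hypothetical helper: true if n can be stored in an int without turning
// negative or otherwise changing value.
// Assumes size_t is at least as wide as int.
constexpr bool fits_in_int(std::size_t n) {
    return n <= static_cast<std::size_t>(INT_MAX);
}
```

A caller would test the size with one of these before a static_cast to the narrower type, and handle the failure case explicitly instead of letting the value wrap.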
You should generally use the correct type for the purpose if you want portability. A lot of people will recommend that you use signed math instead of unsigned (to avoid nasty, subtle bugs like 1U < -3 evaluating to true). For that purpose, the standard library defines ptrdiff_t in <stddef.h> as the signed type of the result of subtracting one pointer from another.
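To see why 1U < -3 is a trap, the sketch below spells out the conversion the compiler applies to the signed operand, and shows ptrdiff_t holding a negative pointer difference. The names mixed_comparison, signed_comparison, distance_between and sample are invented for this example.

```cpp
#include <cstddef>  // std::ptrdiff_t

// In `1U < -3` the int operand is converted to unsigned int first, so -3
// wraps around to a very large value. The cast makes that step explicit.
constexpr bool mixed_comparison = 1U < static_cast<unsigned int>(-3);

// With both operands signed, the comparison means what it says.
constexpr bool signed_comparison = 1 < -3;

// Sample data used only for illustration.
constexpr int sample[] = {1, 2, 3, 4};

// Subtracting two pointers into the same array yields a std::ptrdiff_t,
// which is signed, so "backwards" distances come out negative.
constexpr std::ptrdiff_t distance_between(const int* first, const int* last) {
    return last - first;
}
```

Here mixed_comparison is true and signed_comparison is false, which is exactly the kind of surprise the recommendation to prefer signed arithmetic is meant to avoid.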
That said, a workaround might be to bounds-check all addresses and offsets against INT_MAX and either 0 or INT_MIN as appropriate, and turn on the compiler warnings about comparing signed and unsigned quantities in case you miss any. You should always, always, always be checking your array accesses for overflow in C anyway.
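That workaround might look something like the following sketch. The helpers offset_fits_int and index_in_bounds are invented for this example, and it assumes long long is wider than int, which holds on mainstream platforms but is not guaranteed by the standard.

```cpp
#include <climits>  // INT_MAX, INT_MIN
#include <cstddef>  // std::size_t

// Hypothetical helper: true if an offset can be stored in an int.
// Assumes long long is wider than int.
constexpr bool offset_fits_int(long long offset) {
    return offset >= INT_MIN && offset <= INT_MAX;
}

// Hypothetical helper: true if a signed index is valid for an array of
// `length` elements. The sign is checked before the cast, so the
// comparison itself never mixes signed and unsigned operands.
constexpr bool index_in_bounds(long long index, std::size_t length) {
    return index >= 0 && static_cast<unsigned long long>(index) < length;
}
```

On GCC and Clang, -Wsign-compare is the warning that flags comparisons between signed and unsigned operands that slip past checks like these.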
#5
0
The definition of SIZE_T is found at: https://msdn.microsoft.com/en-us/library/cc441980.aspx and https://msdn.microsoft.com/en-us/library/cc230394.aspx
Pasting the relevant information here:
SIZE_T is a ULONG_PTR representing the maximum number of bytes to which a pointer can point.
This type is declared as follows:
typedef ULONG_PTR SIZE_T;
A ULONG_PTR is an unsigned long type used for pointer precision. It is used when casting a pointer to a long type to perform pointer arithmetic.
This type is declared as follows:
typedef unsigned __int3264 ULONG_PTR;