When to use std::size_t?

Date: 2021-03-28 17:06:26

I'm just wondering whether I should use std::size_t for loops and such instead of int. For instance:

#include <cstddef>

int main()
{
    for (std::size_t i = 0; i < 10; ++i) {
        // std::size_t OK here? Or should I use, say, unsigned int instead?
    }
}

In general, what is the best practice regarding when to use std::size_t?

13 answers

#1


147  

A good rule of thumb is to use std::size_t for anything that you need to compare in the loop condition against something that is naturally a std::size_t itself.

std::size_t is the type of any sizeof expression and is guaranteed to be able to express the maximum size of any object (including any array) in C++. By extension, it is also guaranteed to be big enough for any array index, so it is a natural type for a loop by index over an array.
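
As a minimal sketch of that point (the helper name `sum_array` is made up for illustration), a loop by index over a C-style array might look like this, with the element count derived from sizeof:

```cpp
#include <cstddef>

// Hypothetical helper: sums a C-style array, indexing with std::size_t.
template <std::size_t N>
int sum_array(const int (&values)[N]) {
    // sizeof yields a std::size_t, so the count and the loop index share
    // one type and no signed/unsigned conversion takes place.
    const std::size_t count = sizeof(values) / sizeof(values[0]);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

Calling `sum_array` on an array such as `int a[] = {1, 2, 3, 4};` adds up all four elements.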

If you are just counting up to a number, then it may be more natural to use either the type of the variable that holds that number, or an int or unsigned int (if large enough), as these should be a natural size for the machine.

#2


62  

size_t is the result type of the sizeof operator.

Use size_t for variables that model size or index in an array. size_t conveys semantics: you immediately know it represents a size in bytes or an index, rather than just another integer.

Also, using size_t to represent a size in bytes helps make the code portable.
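
A small sketch of a size in bytes held in a size_t, assuming a made-up helper `copy_ints`; std::memcpy takes its byte count as a std::size_t, so nothing needs converting:

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `count` ints from `src` into a new vector.
std::vector<int> copy_ints(const int* src, std::size_t count) {
    std::vector<int> dst(count);
    // The size in bytes is a std::size_t, matching std::memcpy's parameter.
    const std::size_t bytes = count * sizeof(int);
    if (bytes > 0) {
        std::memcpy(dst.data(), src, bytes);
    }
    return dst;
}
```

The same code compiles unchanged whether size_t is 32 or 64 bits wide on the target platform.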

#3


25  

The size_t type is meant to specify the size of something, so it's natural to use it when, for example, getting the length of a string and then processing each character:

for (size_t i = 0, max = strlen (str); i < max; i++)
    doSomethingWith (str[i]);

You do have to watch out for boundary conditions of course, since it's an unsigned type. The boundary at the top end is not usually that important since the maximum is usually large (though it is possible to get there). Most people just use an int for that sort of thing because they rarely have structures or arrays that get big enough to exceed the capacity of that int.

But watch out for things like:

for (size_t i = strlen (str) - 1; i >= 0; i--)

which will cause an infinite loop due to the wrapping behaviour of unsigned values (although I've seen compilers warn against this). This can be alleviated by the following form (slightly harder to understand, but at least immune to wrapping problems):

for (size_t i = strlen (str); i-- > 0; )

By shifting the decrement into a post-check side effect of the continuation condition, this does the check for continuation on the value before the decrement, but still uses the decremented value inside the loop (which is why the loop variable is checked over len .. 1 while the body sees len-1 .. 0).
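
To make the idiom concrete, here is a short sketch that walks a string backwards with it. The function name `reversed` is an assumption for illustration only:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: returns the characters of `str` in reverse order.
std::string reversed(const char* str) {
    std::string out;
    // i starts at len. The condition tests the value before the decrement
    // (len .. 1); the body then sees the decremented value (len-1 .. 0).
    for (std::size_t i = std::strlen(str); i-- > 0; ) {
        out.push_back(str[i]);
    }
    return out;
}
```

For an empty string the condition is false on the first test, so the body never runs even though i wraps afterwards.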

#4


12  

By definition, size_t is the result of the sizeof operator. size_t was created to refer to sizes.

The number of times you do something (10, in your example) is not about sizes, so why use size_t? int, or unsigned int, should be ok.

Of course it is also relevant what you do with i inside the loop. If you pass it to a function which takes an unsigned int, for example, pick unsigned int.

In any case, I recommend avoiding implicit type conversions. Make all type conversions explicit.

#5


8  

size_t is a very readable way to specify the size dimension of an item: the length of a string, the number of bytes a pointer takes, etc. It's also portable across platforms: you'll find that 64-bit and 32-bit both behave nicely with system functions and size_t, something that unsigned int might not do (e.g. when should you use unsigned long

#6


6  

Use std::size_t for indexing/counting C-style arrays.

For STL containers, you'll have (for example) vector<int>::size_type, which should be used for indexing and counting vector elements.

In practice, they are usually both the same unsigned integer type, but it isn't guaranteed, especially when using custom allocators.
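
A brief sketch of indexing and counting with the container's own size_type; `count_even` is a hypothetical helper, not something from the standard library:

```cpp
#include <vector>

// Hypothetical helper: counts the even elements of a vector, using the
// container's own size_type for both the index and the running count.
std::vector<int>::size_type count_even(const std::vector<int>& v) {
    std::vector<int>::size_type count = 0;
    for (std::vector<int>::size_type i = 0; i < v.size(); ++i) {
        if (v[i] % 2 == 0) {
            ++count;
        }
    }
    return count;
}
```

Because the index has the same type as the result of v.size(), the loop condition compares like with like.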

#7


6  

Soon most computers will be 64-bit architectures with 64-bit OSes running programs operating on containers of billions of elements. Then you must use size_t instead of int as the loop index, otherwise your index will overflow before reaching the 2^32nd element (a 32-bit int already overflows at 2^31), on both 32- and 64-bit systems.

Prepare for the future!

#8


4  

short answer:

almost never

long answer:

Whenever you need to have a vector of char bigger than 2 GB on a 32-bit system. In every other use case, using a signed type is much safer than using an unsigned type.

example:

std::vector<A> data;
[...]
// calculate the index that should be used;
size_t i = calc_index(param1, param2);
// doing calculations close to the underflow of an integer is already dangerous

// do some bounds checking
if( i - 1 < 0 ) {
    // always false, because 0-1 on unsigned creates an underflow
    return LEFT_BORDER;
} else if( i >= data.size() - 1 ) {
    // if i already had an underflow, this becomes true
    return RIGHT_BORDER;
}

// now you have a bug that is very hard to track, because you never 
// get an exception or anything anymore, to detect that you actually 
// return the false border case.

return calc_something(data[i-1], data[i], data[i+1]);
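
For comparison, one possible way to keep the index unsigned and still get a correct border check is to avoid computing i - 1 or size() - 1 at all. This is only a sketch: the element type is assumed to be int, and `Position` and `classify` are made-up names standing in for the border constants above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result codes, standing in for LEFT_BORDER / RIGHT_BORDER.
enum class Position { LeftBorder, RightBorder, Interior };

// Classifies index i without any subtraction that could wrap below zero.
Position classify(const std::vector<int>& data, std::size_t i) {
    if (i == 0) {
        return Position::LeftBorder;   // no left neighbour
    }
    // data.size() - i is only evaluated when i < data.size(), so it is >= 1.
    if (i >= data.size() || data.size() - i == 1) {
        return Position::RightBorder;  // no right neighbour, or out of range
    }
    return Position::Interior;         // i-1, i and i+1 are all valid indices
}
```

Only the Interior case may go on to read data[i-1], data[i] and data[i+1].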

The signed equivalent of size_t is ptrdiff_t, not int. But using int is still much better in most cases than size_t. ptrdiff_t is typically 32 bits wide on 32-bit systems and 64 bits wide on 64-bit systems.

This means that you always have to convert to and from size_t whenever you interact with the std:: containers, which is not very beautiful. But at a GoingNative conference, the authors of C++ mentioned that designing std::vector with an unsigned size_t was a mistake.

If your compiler gives you warnings on implicit conversions from ptrdiff_t to size_t, you can make it explicit with constructor syntax:

calc_something(data[size_t(i-1)], data[size_t(i)], data[size_t(i+1)]);
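
A self-contained sketch of the signed-index approach with explicit conversions might look like the following. The helper `sum_neighbours` and the int element type are assumptions, and the index is assumed to be far from the limits of ptrdiff_t so that i + 1 cannot overflow.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: sums data[i-1] + data[i] + data[i+1] for a signed
// index, returning 0 when any neighbour would fall outside the vector.
int sum_neighbours(const std::vector<int>& data, std::ptrdiff_t i) {
    // Convert the unsigned size once, explicitly, then compare signed values.
    const std::ptrdiff_t size = static_cast<std::ptrdiff_t>(data.size());
    if (i - 1 < 0 || i + 1 >= size) {
        return 0;  // i - 1 < 0 is a meaningful test because i is signed
    }
    return data[static_cast<std::size_t>(i - 1)]
         + data[static_cast<std::size_t>(i)]
         + data[static_cast<std::size_t>(i + 1)];
}
```

The casts back to std::size_t happen only after the range check, so they never see a negative value.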

If you just want to iterate over a collection, without bounds checking, use a range-based for:

for(const auto& d : data) {
    [...]
}

Here are some words from Bjarne Stroustrup (the author of C++) at GoingNative.

For some people, this signed/unsigned design error in the STL is reason enough not to use std::vector, and to use their own implementation instead.

#9


2  

When using size_t, be careful with the following expression:

size_t i = containner.find("mytoken");
size_t x = 99;
if (i-x>-1 && i+x < containner.size()) {
    cout << containner[i-x] << " " << containner[i+x] << endl;
}

You will get false in the if expression regardless of what value you have for x. It took me several days to realize this (the code is so simple that I did not write a unit test), although it only takes a few minutes to figure out the source of the problem. I'm not sure whether it is better to do a cast or to compare against zero.

if ((int)(i-x) > -1 or (i-x) >= 0)

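The cast works as long as the difference fits in an int; note that (i-x) >= 0 is always true for unsigned operands, so on its own it cannot detect the wrap. Here is my test run: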

size_t i = 5;
cerr << "i-7=" << i-7 << " (int)(i-7)=" << (int)(i-7) << endl;

The output: i-7=18446744073709551614 (int)(i-7)=-2

I would welcome others' comments.
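
One alternative to casting, offered only as a sketch, is to compare the operands before subtracting so that nothing can wrap. The helper name `window_fits` and its parameters are hypothetical:

```cpp
#include <cstddef>

// Hypothetical helper: true when both i - x and i + x are valid indices
// into a container of `size` elements.
bool window_fits(std::size_t i, std::size_t x, std::size_t size) {
    if (i >= size) {
        return false;     // also rejects huge "not found" values
    }
    if (x > i) {
        return false;     // i - x would wrap around
    }
    return x < size - i;  // equivalent to i + x < size, without overflow
}
```

With i = 5 and x = 7 this returns false, rather than evaluating the wrapped difference 18446744073709551614 seen in the test run above.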

#10


2  

size_t is returned by various libraries to indicate that the size of that container is non-negative. You use it when you get one back :0

However, in your example above, looping on a size_t is a potential bug. Consider the following:

for (size_t i = thing.size(); i >= 0; --i) {
  // this will never terminate because size_t is an unsigned type,
  // which can not be negative by definition,
  // therefore i will always be >= 0
  printf("the never ending story. la la la la");
}

The use of unsigned integers has the potential to create these types of subtle issues. Therefore, IMHO, I prefer to use size_t only when I interact with containers/types that require it.

#11


-1  

size_t is an unsigned type large enough to hold the size of any object on your architecture, so it avoids the overflows you get from a signed type (incrementing a signed int holding 0x7FFFFFFF is undefined behavior and typically yields the most negative value, not -1) or from a type that is too short (an unsigned short int holding 0xFFFF incremented by 1 will give you 0).
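
It may be worth noting that size_t itself still wraps around at its own maximum, like any unsigned type. A minimal sketch, with made-up function names, of what that looks like next to unsigned short:

```cpp
#include <cstddef>
#include <limits>

// Returns the value obtained by incrementing the largest std::size_t.
// Unsigned arithmetic is modular, so the result wraps to 0.
std::size_t increment_max_size() {
    std::size_t n = std::numeric_limits<std::size_t>::max();
    ++n;  // well-defined wrap-around
    return n;
}

// The same experiment with unsigned short, for comparison.
unsigned short increment_max_ushort() {
    unsigned short n = std::numeric_limits<unsigned short>::max();
    ++n;  // converted back to unsigned short, which also gives 0
    return n;
}
```

The difference between the two types is only where the wrap happens, not whether it happens.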

It is mainly used in array indexing, loops, address arithmetic and so on. Functions like memset() accept size_t only, because theoretically you may have a block of memory of size 2^32-1 (on a 32-bit platform).

For such simple loops, don't bother; just use int.

#12


-3  

size_t is an unsigned integral type that can represent the size of the largest object on your system. Only use it if you need very large arrays, matrices, etc.

Some functions return a size_t, and your compiler will warn you if you try to compare it with a signed value.

Avoid that by using the appropriate signed/unsigned datatype, or simply typecast for a fast hack.
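
If a cast is used, a slightly safer variant than a bare typecast is to rule out negative values first. A sketch, assuming a hypothetical helper `index_in_range`:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: checks a signed index against a container size
// without mixing signed and unsigned operands in one comparison.
bool index_in_range(int idx, const std::vector<int>& v) {
    // Reject negatives first; only then is the cast to std::size_t safe.
    return idx >= 0 && static_cast<std::size_t>(idx) < v.size();
}
```

Without the idx >= 0 test, a negative index would convert to a very large unsigned value and simply fail the size comparison by accident rather than by design.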

#13


-5  

size_t is an unsigned integer type (often, but not always, unsigned int), so you can use it whenever you want an unsigned size or count.

I use it when I want to specify the size of an array, a counter, etc.

void * operator new (size_t size); is a good use of it.
