It seems that uint32_t is much more prevalent than uint_fast32_t (I realise this is anecdotal evidence). That seems counter-intuitive to me, though.
Almost always when I see an implementation use uint32_t, all it really wants is an integer that can hold values up to 4,294,967,295 (usually a much lower bound somewhere between 65,535 and 4,294,967,295).
It seems weird to then use uint32_t, as the 'exactly 32 bits' guarantee is not needed, and the 'fastest available >= 32 bits' guarantee of uint_fast32_t seems to be exactly the right idea. Moreover, while it's usually implemented, uint32_t is not actually guaranteed to exist.
Why, then, would uint32_t be preferred? Is it simply better known, or does it have technical advantages over the other?
10 answers
#1
73
uint32_t is guaranteed to have nearly the same properties on any platform that supports it (see footnote 1).
uint_fast32_t, in comparison, comes with very few guarantees about how it behaves on different systems.
If you switch to a platform where uint_fast32_t has a different size, all code that uses uint_fast32_t has to be retested and validated. All stability assumptions are going to be out the window. The entire system is going to work differently.
When writing your code, you may not even have access to a system where uint_fast32_t isn't 32 bits in size.
uint32_t won't work differently (except as noted in the footnote).
Correctness is more important than speed. Premature correctness is thus a better plan than premature optimization.
If I were writing code for systems where uint_fast32_t was 64 or more bits, I might test my code for both cases and use it. Barring both need and opportunity, doing so is a bad plan.
Finally, uint_fast32_t, when you are storing it for any length of time or in any number of instances, can be slower than uint32_t simply due to cache size issues and memory bandwidth. Today's computers are far more often memory-bound than CPU-bound, and uint_fast32_t could be faster in isolation but not after you account for memory overhead.
1 As @chux has noted in a comment, if unsigned is larger than uint32_t, arithmetic on uint32_t goes through the usual integer promotions, and if not, it stays as uint32_t. This can cause bugs. Nothing is ever perfect.
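The promotion hazard from the footnote can be sidestepped by forcing the arithmetic into an unsigned type before multiplying. The following is only a sketch and not part of the original answer; the helper name mul_wrap32 is made up for illustration.

```c
#include <stdint.h>

/* Multiply two 32-bit values with wraparound modulo 2^32.
 * The leading 1u converts the operands to (at least) unsigned int, so the
 * product is never computed in a signed type, even on a platform where
 * int is wider than 32 bits and uint32_t would otherwise be promoted to int.
 * mul_wrap32 is a hypothetical helper name. */
uint32_t mul_wrap32(uint32_t a, uint32_t b)
{
    return (uint32_t)(1u * a * b);
}
```

The final cast truncates back to 32 bits, so the result is the same whether the intermediate type was 32 or 64 bits wide.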
#2
29
Why do many people use uint32_t rather than uint32_fast_t?
Note: the misnamed uint32_fast_t should be uint_fast32_t.
uint32_t has a tighter specification than uint_fast32_t and so makes for more consistent functionality.
uint32_t pros:
- Various algorithms specify this type. IMO, the best reason to use it.
- Exact width and range known.
- Arrays of this type incur no waste.
- Unsigned integer math, with its wraparound on overflow, is more predictable.
- Closer match in range and math to other languages' 32-bit types.
- Never padded.
uint32_t cons:
- Not always available (yet this is rare in 2018).
  E.g.: Platforms lacking 8/16/32-bit integers (9/18/36-bit, others).
  E.g.: Platforms using non-2's complement (old 2200).
uint_fast32_t pros:
- Always available.
  This allows all platforms, new and old, to use fast/minimum types.
- "Fastest" type that supports a 32-bit range.
uint_fast32_t cons:
- Range is only minimally known. For example, it could be a 64-bit type.
- Arrays of this type may be wasteful in memory.
- All answers (mine too at first), the post, and the comments used the wrong name uint32_fast_t. Looks like many just don't need or use this type. We didn't even use the right name!
- Padding possible (rare).
- In select cases, the "fastest" type may really be another type. So uint_fast32_t is only a first-order approximation.
In the end, what is best depends on the coding goal. Unless coding for very wide portability or some niche performance function, use uint32_t.
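Since uint32_t is optional, code aiming at very wide portability can test for it at compile time. Here is a minimal sketch, assuming a C99 <stdint.h>; the names u32_storage, U32_STORAGE_IS_EXACT and u32_storage_bits are invented for this example. It relies on the rule that UINT32_MAX is defined exactly when the implementation provides uint32_t.

```c
#include <limits.h>
#include <stdint.h>

#ifdef UINT32_MAX
typedef uint32_t u32_storage;        /* exact-width type is available */
#define U32_STORAGE_IS_EXACT 1
#else
typedef uint_least32_t u32_storage;  /* always available, may be wider */
#define U32_STORAGE_IS_EXACT 0
#endif

/* Bits of storage occupied by one element (any padding bits included). */
unsigned u32_storage_bits(void)
{
    return (unsigned)(sizeof(u32_storage) * CHAR_BIT);
}
```

Code built on the fallback must not assume wraparound at 2^32 and would need explicit masking wherever that matters.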
There is another issue that comes into play when using these types: their rank compared to int/unsigned.
Presumably uint_fastN_t would be at least the rank of unsigned. This is not specified, but it is a definite and testable condition.
Thus, uintN_t is more likely than uint_fastN_t to be narrower than unsigned. This means that code that uses uintN_t math is more likely to be subject to integer promotions than code that uses uint_fastN_t, when portability is a concern.
With this concern in mind, the portability advantage goes to uint_fastN_t for select math operations.
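The effect of rank on arithmetic can be observed directly. The sketch below is an illustration added here, with made-up function names; the results in the comments assume the common case of a 32-bit int.

```c
#include <stdint.h>

/* uint16_t is narrower than int on most platforms, so both operands are
 * promoted to (signed) int and 0 - 1 yields -1 rather than 65535. */
int u16_difference_is_negative(void)
{
    uint16_t a = 0, b = 1;
    long long d = a - b;
    return d < 0;
}

/* With a 32-bit int, uint32_t is not promoted: 0 - 1 wraps to 4294967295.
 * On a platform with a 64-bit int this function would return 1 instead. */
int u32_difference_is_negative(void)
{
    uint32_t a = 0, b = 1;
    long long d = a - b;
    return d < 0;
}
```

The same source line therefore means different things depending on how the type's width compares to int, which is the portability concern described above.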
Side note about int32_t rather than int_fast32_t: On rare machines, INT_FAST32_MIN may be -2,147,483,647 and not -2,147,483,648. The larger point: (u)intN_t types are tightly specified and lead to portable code.
#3
24
Why do many people use uint32_t rather than uint32_fast_t?
Silly answer:
- There is no standard type uint32_fast_t; the correct spelling is uint_fast32_t.
Practical answer:
- Many people actually use uint32_t or int32_t for their precise semantics: exactly 32 bits with unsigned wraparound arithmetic (uint32_t) or 2's complement representation (int32_t). The xxx_fast32_t types may be larger and thus inappropriate to store to binary files, use in packed arrays and structures, or send over a network. Furthermore, they may not even be faster.
Pragmatic answer:
- Many people just don't know (or simply don't care) about uint_fast32_t, as demonstrated in comments and answers, and probably assume plain unsigned int to have the same semantics, although many current architectures still have 16-bit ints and some rare museum samples have other strange int sizes less than 32.
UX answer:
- Although possibly faster than uint32_t, uint_fast32_t is slower to use: it takes longer to type, especially accounting for looking up spelling and semantics in the C documentation ;-)
Elegance matters (obviously opinion based):
- uint32_t looks bad enough that many programmers prefer to define their own u32 or uint32 type... From this perspective, uint_fast32_t looks clumsy beyond repair. No surprise it sits on the bench with its friends uint_least32_t and such.
#4
8
One reason is that unsigned int is already "fastest" without the need for any special typedefs or the need to include something. So, if you need it fast, just use the fundamental int or unsigned int type.
While the standard does not explicitly guarantee that it is fastest, it indirectly does so by stating "Plain ints have the natural size suggested by the architecture of the execution environment" in 3.9.1. In other words, int (or its unsigned counterpart) is what the processor is most comfortable with.
Now of course, you don't know what size unsigned int might be. You only know it is at least as large as short (and I seem to remember that short must be at least 16 bits, although I can't find that in the standard now!). Usually it's simply 4 bytes, but it could in theory be larger, or in extreme cases, even smaller (although I've personally never encountered an architecture where this was the case, not even on 8-bit computers in the 1980s... maybe some microcontrollers, who knows; it turns out I suffer from dementia, int was very clearly 16 bits back then).
The C++ standard doesn't bother to specify what the <cstdint> types are or what they guarantee; it merely mentions "same as in C".
uint32_t, per the C standard, guarantees that you get exactly 32 bits. Nothing different, nothing less, and no padding bits. Sometimes this is exactly what you need, and thus it is very valuable.
uint_least32_t guarantees that whatever the size is, it cannot be smaller than 32 bits (but it could very well be larger). Sometimes, but much more rarely than an exact width or "don't care", this is what you want.
Lastly, uint_fast32_t is somewhat superfluous in my opinion, except for documentation-of-intent purposes. The C standard states "designates an integer type that is usually fastest" (note the word "usually") and explicitly mentions that it need not be fastest for all purposes. In other words, uint_fast32_t is just about the same as uint_least32_t, which is usually fastest too, only no guarantee given (but no guarantee either way).
Since most of the time you either don't care about the exact size or you want exactly 32 (or 64, sometimes 16) bits, and since the "don't care" unsigned int type is fastest anyway, this explains why uint_fast32_t isn't so frequently used.
#5
6
I have not seen evidence that uint32_t is used for its range. Instead, most of the time that I've seen uint32_t used, it is to hold exactly 4 octets of data in various algorithms, with guaranteed wraparound and shift semantics!
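Two typical examples of algorithms that lean on exactly 32 bits are a bit rotation and the 32-bit FNV-1a hash. The sketch below is an illustration added here; the function names are arbitrary, while the FNV offset basis and prime are the published 32-bit constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (n is reduced modulo 32). */
uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (uint32_t)((x << n) | (x >> ((32u - n) & 31u)));
}

/* 32-bit FNV-1a hash: relies on multiplication wrapping modulo 2^32. */
uint32_t fnv1a32(const unsigned char *data, size_t len)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}
```

With a type that might be 64 bits wide, both functions would need explicit masking to produce the same results.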
There are also other reasons to use uint32_t instead of uint_fast32_t. Often it is that it provides a stable ABI. Additionally, the memory usage can be known accurately. This very much offsets whatever the speed gain would be from uint_fast32_t, whenever that type would be distinct from uint32_t.
For values < 65536, there is already a handy type: it is called unsigned int (unsigned short is required to have at least that range as well, but unsigned int is of the native word size). For values < 4294967296, there is another, called unsigned long.
And lastly, people do not use uint_fast32_t because it is annoyingly long to type and easy to mistype :D
#6
5
Several reasons.
- Many people don't know the 'fast' types exist.
- It's more verbose to type.
- It's harder to reason about your program's behaviour when you don't know the actual size of the type.
- The standard doesn't actually pin down what "fastest" means, nor can it really: which type is actually fastest can be very context-dependent.
- I have seen no evidence of platform developers putting any thought into the size of these types when defining their platforms. For example, on x86-64 Linux (glibc) the 16-bit and 32-bit "fast" types are 64-bit even though x86-64 has hardware support for fast operations on 32-bit values.
In summary the "fast" types are worthless garbage. If you really need to figure out what type is fastest for a given application you need to benchmark your code on your compiler.
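Before benchmarking, it helps to check what the "fast" types actually are on the compiler at hand. A tiny sketch for that follows; the macro and function names are made up for this example, and the output differs from platform to platform.

```c
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Storage width of a type in bits (any padding bits included). */
#define STORAGE_BITS(T) ((unsigned)(sizeof(T) * CHAR_BIT))

/* Print the widths chosen by this implementation for the "fast" types. */
void print_fast_type_widths(void)
{
    printf("uint_fast8_t : %u bits\n", STORAGE_BITS(uint_fast8_t));
    printf("uint_fast16_t: %u bits\n", STORAGE_BITS(uint_fast16_t));
    printf("uint_fast32_t: %u bits\n", STORAGE_BITS(uint_fast32_t));
    printf("uint_fast64_t: %u bits\n", STORAGE_BITS(uint_fast64_t));
}
```

Only the lower bounds are portable; whether the chosen widths are really the fastest for a given workload still has to be measured.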
#7
5
From the viewpoint of correctness and ease of coding, uint32_t has many advantages over uint_fast32_t, in particular because of the more precisely defined size and arithmetic semantics, as many users above have pointed out.
What has perhaps been missed is that the one supposed advantage of uint_fast32_t, that it can be faster, just never materialized in any meaningful way. Most of the 64-bit processors that have dominated the 64-bit era (x86-64 and AArch64 mostly) evolved from 32-bit architectures and have fast 32-bit native operations even in 64-bit mode. So uint_fast32_t is just the same as uint32_t on those platforms.
Even if some of the "also ran" platforms like POWER, MIPS64, and SPARC only offer 64-bit ALU operations, the vast majority of interesting 32-bit operations can be done just fine on 64-bit registers: the bottom 32 bits will have the desired results (and all mainstream platforms at least allow you to load/store 32 bits). Left shift is the main problematic one, but even that can be optimized in many cases by value/range tracking optimizations in the compiler.
I doubt the occasional slightly slower left shift or 32x32 -> 64 multiplication is going to outweigh double the memory use for such values, in all but the most obscure applications.
Finally, I'll note that while the tradeoff has largely been characterized as "memory use and vectorization potential" (in favor of uint32_t) versus instruction count/speed (in favor of uint_fast32_t), even that isn't clear to me. Yes, on some platforms you'll need additional instructions for some 32-bit operations, but you'll also save some instructions because:
- Using a smaller type often allows the compiler to cleverly combine adjacent operations by using one 64-bit operation to accomplish two 32-bit ones. This type of "poor man's vectorization" is not uncommon. For example, creating a constant struct two32{ uint32_t a, b; } in rax, like two32{1, 2}, can be optimized into a single mov rax, 0x200000001 while the 64-bit version needs two instructions. In principle this should also be possible for adjacent arithmetic operations (same operation, different operand), but I haven't seen it in practice.
- Lower "memory use" also often leads to fewer instructions, even if memory or cache footprint isn't a problem, because whenever a structure or array of this type is copied, you get twice the bang for your buck per register copied.
- Smaller data types often make better use of modern calling conventions like the SysV ABI, which pack data structure data efficiently into registers. For example, you can return up to a 16-byte structure in registers rdx:rax. For a function returning a structure with 4 uint32_t values (initialized from a constant), that translates into

ret_constant32():
    movabs rax, 8589934593
    movabs rdx, 17179869187
    ret

The same structure with 4 64-bit uint_fast32_t needs a register move and four stores to memory to do the same thing (and the caller will probably have to read the values back from memory after the return):

ret_constant64():
    mov rax, rdi
    mov QWORD PTR [rdi], 1
    mov QWORD PTR [rdi+8], 2
    mov QWORD PTR [rdi+16], 3
    mov QWORD PTR [rdi+24], 4
    ret

Similarly, when passing structure arguments, 32-bit values are packed about twice as densely into the registers available for parameters, so it makes it less likely that you'll run out of register arguments and have to spill to the stack (see footnote 1).
- Even if you choose to use uint_fast32_t for places where "speed matters", you'll often also have places where you need a fixed-size type. For example, when passing values for external output, from external input, as part of your ABI, as part of a structure that needs a specific layout, or because you smartly use uint32_t for large aggregations of values to save on memory footprint. In the places where your uint_fast32_t and uint32_t types need to interface, you might find (in addition to the development complexity) unnecessary sign extensions or other size-mismatch related code. Compilers do an OK job at optimizing this away in many cases, but it is still not unusual to see this in optimized output when mixing types of different sizes.
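The ret_constant32 and ret_constant64 listings in the list above are consistent with C source along the following lines. This is a reconstruction for illustration, not necessarily the code the answer was compiled from: the struct names four32 and four64 are invented, and uint64_t stands in for a 64-bit uint_fast32_t so that the layout is the same on every platform.

```c
#include <stdint.h>

struct four32 { uint32_t a, b, c, d; };  /* typically 16 bytes: fits rdx:rax */
struct four64 { uint64_t a, b, c, d; };  /* typically 32 bytes: returned via memory */

/* Small enough to come back in two registers under the SysV x86-64 ABI. */
struct four32 ret_constant32(void)
{
    struct four32 r = { 1, 2, 3, 4 };
    return r;
}

/* Too large for registers: the caller passes a hidden result pointer. */
struct four64 ret_constant64(void)
{
    struct four64 r = { 1, 2, 3, 4 };
    return r;
}
```

The packed constants in the first listing follow from the layout: 8589934593 is 2 * 2^32 + 1 and 17179869187 is 4 * 2^32 + 3.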
You can play with some of the examples above and more on godbolt.
1 To be clear, the convention of packing structures tightly into registers isn't always a clear win for smaller values. It does mean that the smaller values may have to be "extracted" before they can be used. For example, a simple function that returns the sum of the two structure members needs a mov rax, rdi; shr rax, 32; add edi, eax while for the 64-bit version each argument gets its own register and just needs a single add or lea. Still, if you accept that the "tightly pack structures while passing" design makes sense overall, then smaller values will take more advantage of this feature.
#8
4
To my understanding, int was initially supposed to be a "native" integer type with the additional guarantee that it should be at least 16 bits in size, something that was considered a "reasonable" size back then.
When 32-bit platforms became more common, we can say that "reasonable" size has changed to 32 bits:
- Modern Windows uses 32-bit int on all platforms.
- POSIX guarantees that int is at least 32 bits.
- C# and Java have a type int which is guaranteed to be exactly 32 bits.
But when 64-bit platforms became the norm, no one expanded int to be a 64-bit integer because of:
- Portability: a lot of code depends on int being 32 bits in size.
- Memory consumption: doubling memory usage for every int might be unreasonable for most cases, as in most cases the numbers in use are much smaller than 2 billion.
Now, why would you prefer uint32_t to uint_fast32_t? For the same reason that languages such as C# and Java always use fixed-size integers: programmers do not write code thinking about the possible sizes of different types; they write for one platform and test the code on that platform. Most code implicitly depends on specific sizes of data types. And this is why uint32_t is a better choice for most cases: it does not allow any ambiguity regarding its behavior.
Moreover, is uint_fast32_t really the fastest type on a platform with a size equal to or greater than 32 bits? Not really. Consider this code compiled by GCC for x86_64 on Windows:
#include <stdint.h>

extern uint64_t get(void);

uint64_t sum(uint64_t value)
{
    return value + get();
}
Generated assembly looks like this:
push %rbx
sub $0x20,%rsp
mov %rcx,%rbx
callq d <sum+0xd>
add %rbx,%rax
add $0x20,%rsp
pop %rbx
retq
Now if you change get()'s return type to uint_fast32_t (which is 4 bytes on Windows x86_64) you get this:
push %rbx
sub $0x20,%rsp
mov %rcx,%rbx
callq d <sum+0xd>
mov %eax,%eax ; <-- additional instruction
add %rbx,%rax
add $0x20,%rsp
pop %rbx
retq
Notice how the generated code is almost the same except for the additional mov %eax,%eax instruction after the function call, which is meant to expand the 32-bit value into a 64-bit value.
There is no such issue if you only use 32-bit values, but you will probably be using those with size_t variables (array sizes probably?) and those are 64 bits on x86_64. On Linux uint_fast32_t is 8 bytes, so the situation is different.
Many programmers use int when they need to return a small value (let's say in the range [-32,32]). This would work perfectly if int were the platform's native integer size, but since it is not on 64-bit platforms, another type which matches the platform's native type is a better choice (unless it is frequently used with other integers of smaller size).
Basically, regardless of what the standard says, uint_fast32_t is broken on some implementations anyway. If you care about the additional instruction generated in some places, you should define your own "native" integer type. Or you can use size_t for this purpose, as it will usually match the native size (I am not including old and obscure platforms like 8086, only platforms that can run Windows, Linux etc).
Another sign that int was supposed to be a native integer type is the "integer promotion rule". Most CPUs can only perform operations on native-size values, so a 32-bit CPU usually can only do 32-bit additions, subtractions etc (Intel CPUs are an exception here). Integer types of other sizes are supported only through load and store instructions. For example, an 8-bit value should be loaded with the appropriate "load 8-bit signed" or "load 8-bit unsigned" instruction, which expands the value to 32 bits on load. Without the integer promotion rule, C compilers would have to add a little bit more code for expressions that use types smaller than the native type. Unfortunately, this does not hold anymore with 64-bit architectures, as compilers now have to emit additional instructions in some cases (as was shown above).
#9
3
For practical purposes, uint_fast32_t is completely useless. It's defined incorrectly on the most widespread platform (x86_64), and doesn't really offer any advantages anywhere unless you have a very low-quality compiler. Conceptually, it never makes sense to use the "fast" types in data structures/arrays: any savings you get from the type being more efficient to operate on will be dwarfed by the cost (cache misses, etc.) of increasing the size of your working data set. And for individual local variables (loop counters, temps, etc.) a non-toy compiler can usually just work with a larger type in the generated code if that's more efficient, and only truncate to the nominal size when necessary for correctness (and with signed types, it's never necessary).
The one variant that is theoretically useful is uint_least32_t, for when you need to be able to store any 32-bit value, but want to be portable to machines that lack an exact-size 32-bit type. Practically speaking, however, that's not something you need to worry about.
#10
2
In many cases, when an algorithm works on an array of data, the best way to improve performance is to minimize the number of cache misses. The smaller each element, the more of them can fit into the cache. This is why a lot of code is still written to use 32-bit pointers on 64-bit machines: they don’t need anything close to 4 GiB of data, but the cost of making all pointers and offsets need eight bytes instead of four would be substantial.
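The cache argument comes down to simple arithmetic. The sketch below assumes a 64-byte cache line, which is common but not universal; the macro and function names are invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cache-line size in bytes; adjust for the target CPU. */
#define ASSUMED_CACHE_LINE_BYTES 64u

/* How many elements of the given size fit into one cache line. */
size_t elements_per_cache_line(size_t element_size)
{
    return ASSUMED_CACHE_LINE_BYTES / element_size;
}
```

Under that assumption a line holds 16 four-byte elements but only 8 eight-byte ones, so a linear scan over 64-bit elements touches twice as many lines.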
There are also some ABIs and protocols specified to need exactly 32 bits, for example, IPv4 addresses. That’s what uint32_t really means: use exactly 32 bits, regardless of whether that’s efficient on the CPU or not. These used to be declared as long or unsigned long, which caused a lot of problems during the 64-bit transition. If you just need an unsigned type that holds numbers up to at least 2³²-1, that’s been the definition of unsigned long since the first C standard came out. In practice, though, enough old code assumed that a long could hold any pointer or file offset or timestamp, and enough old code assumed that it was exactly 32 bits wide, that compilers can’t necessarily make long the same as int_fast32_t without breaking too much stuff.
In theory, it would be more future-proof for a program to use uint_least32_t, and maybe even load uint_least32_t elements into a uint_fast32_t variable for calculations. An implementation that had no uint32_t type at all could even declare itself in formal compliance with the standard! (It just wouldn’t be able to compile many existing programs.) In practice, there’s no architecture any more where int, uint32_t, and uint_least32_t are not the same, and no advantage, currently, to the performance of uint_fast32_t. So why overcomplicate things?
Yet look at the reason all the 32_t types needed to exist when we already had long, and you’ll see that those assumptions have blown up in our faces before. Your code might well end up running someday on a machine where exact-width 32-bit calculations are slower than the native word size, and you would have been better off using uint_least32_t for storage and uint_fast32_t for calculation religiously. Or if you’ll cross that bridge when you get to it and just want something simple, there’s unsigned long.
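The "uint_least32_t for storage, uint_fast32_t for calculation" style could look roughly like this. It is only a sketch with an invented function name; the explicit mask is what keeps the result identical whether the fast type is 32 or 64 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum an array modulo 2^32.
 * Storage uses uint_least32_t (always available); the running total uses
 * uint_fast32_t. Masking after every step makes the wraparound explicit
 * instead of relying on the width of either type.
 * sum_mod32 is a hypothetical name. */
uint_least32_t sum_mod32(const uint_least32_t *data, size_t n)
{
    uint_fast32_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        total = (total + data[i]) & UINT32_C(0xFFFFFFFF);
    }
    return (uint_least32_t)total;
}
```

The extra masking is the price of this portability, which is part of why most code simply uses uint32_t.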
#1
73
uint32_t
is guaranteed to have nearly the same properties on any platform that supports it.1
uint32_t在任何支持的平台上都具有几乎相同的属性。
uint_fast32_t
has very little guarantees about how it behaves on different systems in comparison.
相比之下,uint_fast32_t几乎不能保证它在不同系统上的行为。
If you switch to a platform where uint_fast32_t
has a different size, all code that uses uint_fast32_t
has to be retested and validated. All stability assumptions are going to be out the window. The entire system is going to work differently.
如果您切换到一个平台,其中uint_fast32_t具有不同的大小,则必须对所有使用uint_fast32_t的代码进行重新测试和验证。所有的稳定性假设都将被排除在外。整个系统将会以不同的方式工作。
When writing your code, you may not even have access to a uint_fast32_t
system that isn't 32 bits in size.
在编写代码时,您甚至可能无法访问非32位大小的uint_fast32_t系统。
uint32_t
won't work differently (except see footnote).
uint32_t不会有不同的工作方式(除了脚注)。
Correctness is more important than speed. Premature correctness is thus a better plan than premature optimization.
正确性比速度更重要。因此,不成熟的正确性是比不成熟的优化更好的计划。
In the event I was writing code for systems where I uint_fast32_t
was 64 or more bits, I might test my code for both cases and use it. Barring both need and opportunity, doing so is a bad plan.
在事件中,我为系统编写代码,其中uint_fast32_t是64位或更多位,我可能会测试这两种情况的代码并使用它。除了需要和机会,这样做是一个糟糕的计划。
Finally, uint_fast32_t
when you are storing it for any length of time or number of instances can be slower than uint32
simply due to cache size issues and memory bandwidth. Todays computers are far more often memory-bound than CPU bound, and uint_fast32_t
could be faster in isolation but not after you account for memory overhead.
最后,uint_fast32_t在存储任何时间或实例数时都可能比uint32慢,这仅仅是由于缓存大小问题和内存带宽的问题。今天的计算机通常是内存绑定的,而不是CPU绑定的,uint_fast32_t在隔离时可能会更快,但在您考虑到内存开销之后就不会了。
1 As @chux has noted in a comment, if unsigned
is larger than uint32_t
, arithmetic on uint32_t
goes through the usual integer promotions, and if not, it stays as uint32_t
. This can cause bugs. Nothing is ever perfect.
正如@chux在评论中所指出的,如果unsigned大于uint32_t,那么uint32_t的算术就会通过通常的整数提升,如果没有,它就会保持uint32_t。这可能会导致错误。没有什么是完美的。
#2
29
Why do many people use
uint32_t
rather thanuint32_fast_t
?为什么很多人使用uint32_t而不是uint32_fast_t?
Note: Mis-named uint32_fast_t
should be uint_fast32_t
.
注意:错误命名的uint32_fast_t应该是uint_fast32_t。
uint32_t
has a tighter specification than uint_fast32_t
and so makes for more consistent functionality.
uint32_t具有比uint_fast32_t更严格的规范,因此具有更一致的功能。
uint32_t pros:
- Various algorithms specify this type. IMO - best reason to use.
- Exact width and range known.
- Arrays of this type incur no waste.
- Unsigned integer math with its overflow is more predictable.
- Closer match in range and math of other languages' 32-bit types.
- Never padded.
uint32_t cons:
- Not always available (yet this is rare in 2018).
  E.g.: Platforms lacking 8/16/32-bit integers (9/18/36-bit, others).
  E.g.: Platforms using non-2's complement (the old 2200).
uint_fast32_t pros:
- Always available.
  This always allows all platforms, new and old, to use fast/minimum types.
- "Fastest" type that supports 32-bit range.
uint_fast32_t cons:
- Range is only minimally known. For example, it could be a 64-bit type.
- Arrays of this type may be wasteful in memory.
- All answers (mine too at first), the post and comments used the wrong name uint32_fast_t. Looks like many just don't need or use this type. We didn't even use the right name!
- Padding possible (rare).
- In select cases, the "fastest" type may really be another type. So uint_fast32_t is only a 1st order approximation.
In the end, what is best depends on the coding goal. Unless coding for very wide portability or some niche performance function, use uint32_t.
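As an illustration of the "various algorithms specify this type" point, here is a sketch of 32-bit FNV-1a written both ways. FNV-1a was picked for this example and is not mentioned in the answer; the function names are hypothetical. The uint32_t version relies on wraparound at 2^32 (assuming int is no wider than 32 bits), while the uint_fast32_t version has to mask by hand because the type may be wider.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV_OFFSET 2166136261u  /* 32-bit FNV offset basis */
#define FNV_PRIME  16777619u    /* 32-bit FNV prime */

/* uint32_t: the multiply wraps modulo 2^32 by itself
   (assuming int is no wider than 32 bits, so no promotion occurs). */
uint32_t fnv1a_u32(const unsigned char *p, size_t n)
{
    uint32_t h = FNV_OFFSET;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h *= FNV_PRIME;
    }
    return h;
}

/* uint_fast32_t: may be 64 bits wide, so the result must be masked by hand. */
uint_fast32_t fnv1a_fast(const unsigned char *p, size_t n)
{
    uint_fast32_t h = FNV_OFFSET;
    for (size_t i = 0; i < n; i++) {
        h ^= p[i];
        h = (h * FNV_PRIME) & 0xFFFFFFFFu;
    }
    return h;
}
```

Forgetting the mask in the second version gives different hashes on platforms where uint_fast32_t is 64 bits, which is the kind of bug that only shows up after porting.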
Another issue comes into play when using these types: their rank compared to int/unsigned.
Presumably uint_fastN_t would be at least the rank of unsigned. This is not specified, but it is a definite and testable condition.
Thus, uintN_t is more likely than uint_fastN_t to be narrower than unsigned. This means that code that uses uintN_t math is more likely to be subject to integer promotions than code that uses uint_fastN_t, which matters for portability.
On this one concern, uint_fastN_t has the portability advantage with select math operations.
Side note about int32_t rather than int_fast32_t: On rare machines, INT_FAST32_MIN may be -2,147,483,647 and not -2,147,483,648. The larger point: (u)intN_t types are tightly specified and lead to portable code.
#3
24
Why do many people use uint32_t rather than uint32_fast_t?
Silly answer:
- There is no standard type uint32_fast_t; the correct spelling is uint_fast32_t.
Practical answer:
- Many people actually use uint32_t or int32_t for their precise semantics: exactly 32 bits with unsigned wraparound arithmetic (uint32_t) or 2's complement representation (int32_t). The xxx_fast32_t types may be larger and thus inappropriate to store to binary files, use in packed arrays and structures, or send over a network. Furthermore, they may not even be faster.
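For the binary file and network case, a minimal sketch of why an exact 32-bit type is convenient: the value maps onto exactly four octets. The function names store_be32 and load_be32 are hypothetical, and big-endian (network) byte order is an assumption of this example.

```c
#include <stdint.h>

/* Write v as four octets, most significant first (network byte order). */
void store_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Rebuild the value from four octets in the same order. */
uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

With a uint_fast32_t parameter the caller could pass values above 2^32 - 1 on some platforms, and the upper bits would be silently dropped on the wire.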
Pragmatic answer:
- Many people just don't know (or simply don't care) about uint_fast32_t, as demonstrated in comments and answers, and probably assume plain unsigned int to have the same semantics, although many current architectures still have 16-bit ints and some rare museum samples have other strange int sizes less than 32.
UX answer:
- Although possibly faster than uint32_t, uint_fast32_t is slower to use: it takes longer to type, especially accounting for looking up spelling and semantics in the C documentation ;-)
Elegance matters (obviously opinion based):
- uint32_t looks bad enough that many programmers prefer to define their own u32 or uint32 type... From this perspective, uint_fast32_t looks clumsy beyond repair. No surprise it sits on the bench with its friends uint_least32_t and such.
#4
8
One reason is that unsigned int is already "fastest" without the need for any special typedefs or the need to include something. So, if you need it fast, just use the fundamental int or unsigned int type.
While the standard does not explicitly guarantee that it is fastest, it indirectly does so by stating "Plain ints have the natural size suggested by the architecture of the execution environment" in 3.9.1. In other words, int (or its unsigned counterpart) is what the processor is most comfortable with.
Now of course, you don't know what size unsigned int might be. You only know it is at least as large as short (and I seem to remember that short must be at least 16 bits, although I can't find that in the standard now!). Usually it's just plain 4 bytes, but it could in theory be larger or, in extreme cases, even smaller (although I've personally never encountered an architecture where this was the case, not even on 8-bit computers in the 1980s... maybe some microcontrollers, who knows... it turns out I suffer from dementia: int was very clearly 16 bits back then).
The C++ standard doesn't bother to specify what the <cstdint> types are or what they guarantee; it merely mentions "same as in C".
uint32_t, per the C standard, guarantees that you get exactly 32 bits. Nothing different, no fewer, and no padding bits. Sometimes this is exactly what you need, and thus it is very valuable.
uint_least32_t guarantees that whatever the size is, it cannot be smaller than 32 bits (but it could very well be larger). Sometimes, but much more rarely than an exact width or "don't care", this is what you want.
Lastly, uint_fast32_t is somewhat superfluous in my opinion, except for documentation-of-intent purposes. The C standard states "designates an integer type that is usually fastest" (note the word "usually") and explicitly mentions that it need not be fastest for all purposes. In other words, uint_fast32_t is just about the same as uint_least32_t, which is usually fastest too, only with no guarantee given (but there is no guarantee either way).
Since most of the time you either don't care about the exact size or you want exactly 32 (or 64, sometimes 16) bits, and since the "don't care" unsigned int type is fastest anyway, this explains why uint_fast32_t isn't so frequently used.
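The guarantees discussed in this answer can be written down as compile-time checks. This is only a sketch, assuming a C11 compiler (for _Static_assert); the 16-bit minimum for short comes from the minimum limits the C standard requires of <limits.h>, and fast32_width is a hypothetical helper name.

```c
#include <limits.h>
#include <stdint.h>

/* Guarantees that hold on every conforming implementation. */
_Static_assert(USHRT_MAX >= 65535u, "unsigned short has at least 16 bits");
_Static_assert(UINT_MAX >= USHRT_MAX, "unsigned int is at least as wide as unsigned short");
_Static_assert(UINT_LEAST32_MAX >= 0xFFFFFFFFu, "least type holds 32 bits");
_Static_assert(UINT_FAST32_MAX >= 0xFFFFFFFFu, "fast type holds 32 bits");

/* Only compiles where uint32_t exists: exactly 32 value bits, no padding. */
_Static_assert(UINT32_MAX == 0xFFFFFFFFu, "uint32_t is exactly 32 bits");
_Static_assert(sizeof(uint32_t) * CHAR_BIT == 32, "uint32_t has no padding");

/* Number of value bits in uint_fast32_t on this platform. */
unsigned fast32_width(void)
{
    unsigned bits = 0;
    for (uint_fast32_t m = UINT_FAST32_MAX; m != 0; m >>= 1)
        bits++;
    return bits;
}
```

Note what is missing: nothing here can assert that uint_fast32_t is actually the fastest choice, only that it is wide enough.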
#5
6
I have not seen evidence that uint32_t is used for its range. Instead, most of the time that I've seen uint32_t used, it is to hold exactly 4 octets of data in various algorithms, with guaranteed wraparound and shift semantics!
There are also other reasons to use uint32_t instead of uint_fast32_t: often, it provides a stable ABI. Additionally, the memory usage can be known accurately. This very much offsets whatever speed gain uint_fast32_t would bring, whenever that type is distinct from uint32_t.
For values < 65536, there is already a handy type: it is called unsigned int (unsigned short is required to have at least that range as well, but unsigned int is of the native word size). For values < 4294967296, there is another called unsigned long.
And lastly, people do not use uint_fast32_t because it is annoyingly long to type and easy to mistype :D
#6
5
Several reasons.
- Many people don't know the 'fast' types exist.
- It's more verbose to type.
- It's harder to reason about your program's behaviour when you don't know the actual size of the type.
- The standard doesn't actually pin down "fastest", nor can it really: which type is actually fastest can be very context-dependent.
- I have seen no evidence of platform developers putting any thought into the size of these types when defining their platforms. For example, on x86-64 Linux the "fast" types are all 64-bit even though x86-64 has hardware support for fast operations on 32-bit values.
In summary, the "fast" types are worthless garbage. If you really need to figure out which type is fastest for a given application, you need to benchmark your code on your compiler.
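A very rough sketch of what such a benchmark could look like, timing one summing pass over an array of each type with clock(). The array size, the function name bench_sum and the single untimed-warm-up-free pass are all assumptions of this example; a serious measurement would repeat the runs, warm up the caches and use the real workload.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N_ELEMS 4000000u  /* hypothetical working-set size; tune for your case */

/* Times one pass over each array type; returns 1 if both sums agree,
   0 on mismatch, -1 on allocation failure. */
int bench_sum(void)
{
    uint32_t      *a = malloc(N_ELEMS * sizeof *a);
    uint_fast32_t *b = malloc(N_ELEMS * sizeof *b);
    if (a == NULL || b == NULL) { free(a); free(b); return -1; }

    for (size_t i = 0; i < N_ELEMS; i++) {
        a[i] = (uint32_t)i;
        b[i] = (uint_fast32_t)i;
    }

    uint64_t sa = 0, sb = 0;
    clock_t t0 = clock();
    for (size_t i = 0; i < N_ELEMS; i++) sa += a[i];
    clock_t t1 = clock();
    for (size_t i = 0; i < N_ELEMS; i++) sb += b[i];
    clock_t t2 = clock();

    printf("uint32_t:      %.3f ms\n", 1000.0 * (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("uint_fast32_t: %.3f ms\n", 1000.0 * (double)(t2 - t1) / CLOCKS_PER_SEC);

    free(a);
    free(b);
    return sa == sb;
}
```

The timings will differ between compilers, optimization levels and machines, which is rather the point of the answer.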
#7
5
From the viewpoint of correctness and ease of coding, uint32_t has many advantages over uint_fast32_t, in particular because of its more precisely defined size and arithmetic semantics, as many users above have pointed out.
What has perhaps been missed is that the one supposed advantage of uint_fast32_t, that it can be faster, just never materialized in any meaningful way. Most of the 64-bit processors that have dominated the 64-bit era (x86-64 and AArch64 mostly) evolved from 32-bit architectures and have fast 32-bit native operations even in 64-bit mode. So uint_fast32_t is just the same as uint32_t on those platforms.
Even if some of the "also ran" platforms like POWER, MIPS64 and SPARC only offer 64-bit ALU operations, the vast majority of interesting 32-bit operations can be done just fine on 64-bit registers: the bottom 32 bits will have the desired results (and all mainstream platforms at least allow you to load/store 32 bits). Left shift is the main problematic one, but even that can be optimized in many cases by value/range tracking optimizations in the compiler.
I doubt the occasional slightly slower left shift or 32x32 -> 64 multiplication is going to outweigh double the memory use for such values, in all but the most obscure applications.
Finally, I'll note that while the tradeoff has largely been characterized as "memory use and vectorization potential" (in favor of uint32_t) versus instruction count/speed (in favor of uint_fast32_t), even that isn't clear to me. Yes, on some platforms you'll need additional instructions for some 32-bit operations, but you'll also save some instructions because:
- Using a smaller type often allows the compiler to cleverly combine adjacent operations by using one 64-bit operation to accomplish two 32-bit ones. This type of "poor man's vectorization" is not uncommon. For example, creating a constant struct two32{ uint32_t a, b; } in rax, like two32{1, 2}, can be optimized into a single mov rax, 0x200000001, while the 64-bit version needs two instructions. In principle this should also be possible for adjacent arithmetic operations (same operation, different operand), but I haven't seen it in practice.
- Lower "memory use" also often leads to fewer instructions, even if memory or cache footprint isn't a problem, because whenever a structure or array of this type is copied, you get twice the bang for your buck per register copied.
- Smaller data types often better exploit modern calling conventions like the SysV ABI, which pack structure data efficiently into registers. For example, you can return up to a 16-byte structure in registers rdx:rax. For a function returning a structure with 4 uint32_t values (initialized from a constant), that translates into

  ret_constant32():
      movabs rax, 8589934593
      movabs rdx, 17179869187
      ret

  The same structure with 4 64-bit uint_fast32_t values needs a register move and four stores to memory to do the same thing (and the caller will probably have to read the values back from memory after the return):

  ret_constant64():
      mov rax, rdi
      mov QWORD PTR [rdi], 1
      mov QWORD PTR [rdi+8], 2
      mov QWORD PTR [rdi+16], 3
      mov QWORD PTR [rdi+24], 4
      ret

  Similarly, when passing structure arguments, 32-bit values are packed about twice as densely into the registers available for parameters, so it is less likely that you'll run out of register arguments and have to spill to the stack (see footnote 1).
- Even if you choose to use uint_fast32_t for places where "speed matters", you'll often also have places where you need a fixed-size type. For example, when passing values for external output, from external input, as part of your ABI, as part of a structure that needs a specific layout, or because you smartly use uint32_t for large aggregations of values to save on memory footprint. In the places where your uint_fast32_t and uint32_t types need to interface, you might find (in addition to the development complexity) unnecessary sign extensions or other size-mismatch related code. Compilers do an OK job at optimizing this away in many cases, but it's still not unusual to see this in optimized output when mixing types of different sizes.
You can play with some of the examples above and more on godbolt.
1 To be clear, the convention of packing structures tightly into registers isn't always a clear win for smaller values. It does mean that the smaller values may have to be "extracted" before they can be used. For example, a simple function that returns the sum of the two structure members needs a mov rax, rdi; shr rax, 32; add edi, eax sequence, while for the 64-bit version each argument gets its own register and just needs a single add or lea. Still, if you accept that the "tightly pack structures while passing" design makes sense overall, then smaller values will take more advantage of this feature.
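The structure-size side of this argument can be checked directly. In this sketch the struct names and the helper are hypothetical, and the 16-byte limit is only the rule of thumb quoted in the answer for the SysV x86-64 ABI; the real classification rules have more conditions than a size check.

```c
#include <stddef.h>
#include <stdint.h>

struct four_u32  { uint32_t a, b, c, d; };
struct four_fast { uint_fast32_t a, b, c, d; };

/* Assumption: structures of up to 16 bytes can be returned in rdx:rax,
   as described for the SysV x86-64 ABI in the answer above. */
#define REG_RETURN_LIMIT 16u

/* Returns 1 if a structure of this size is within the register-return limit. */
int fits_in_return_regs(size_t struct_size)
{
    return struct_size <= REG_RETURN_LIMIT;
}
```

On a platform where uint_fast32_t is 8 bytes, struct four_fast is 32 bytes and fails the check, which matches the memory-store version of the generated code shown earlier in this answer.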
#8
4
To my understanding, int was initially supposed to be a "native" integer type, with the additional guarantee that it should be at least 16 bits in size, something that was considered a "reasonable" size back then.
When 32-bit platforms became more common, we can say that the "reasonable" size changed to 32 bits:
- Modern Windows uses 32-bit int on all platforms.
- POSIX guarantees that int is at least 32 bits.
- C# and Java have a type int which is guaranteed to be exactly 32 bits.
But when 64-bit platforms became the norm, no one expanded int to be a 64-bit integer because of:
- Portability: a lot of code depends on int being 32 bits in size.
- Memory consumption: doubling memory usage for every int might be unreasonable for most cases, as in most cases numbers in use are much smaller than 2 billion.
Now, why would you prefer uint32_t to uint_fast32_t? For the same reason that languages like C# and Java always use fixed-size integers: programmers do not write code thinking about the possible sizes of different types; they write for one platform and test the code on that platform. Most code implicitly depends on specific sizes of data types. And this is why uint32_t is a better choice for most cases: it does not allow any ambiguity regarding its behavior.
Moreover, is uint_fast32_t really the fastest type on a platform with a size equal to or greater than 32 bits? Not really. Consider this code compiled by GCC for x86_64 on Windows:
#include <stdint.h>

/* Defined in another translation unit. */
extern uint64_t get(void);

uint64_t sum(uint64_t value)
{
    return value + get();
}
Generated assembly looks like this:
push %rbx
sub $0x20,%rsp
mov %rcx,%rbx
callq d <sum+0xd>
add %rbx,%rax
add $0x20,%rsp
pop %rbx
retq
Now if you change get()'s return type to uint_fast32_t (which is 4 bytes on Windows x86_64) you get this:
push %rbx
sub $0x20,%rsp
mov %rcx,%rbx
callq d <sum+0xd>
mov %eax,%eax ; <-- additional instruction
add %rbx,%rax
add $0x20,%rsp
pop %rbx
retq
Notice how the generated code is almost the same except for the additional mov %eax,%eax instruction after the function call, which is meant to expand the 32-bit value into a 64-bit value.
There is no such issue if you only use 32-bit values, but you will probably be using them together with size_t variables (array sizes, probably?), and those are 64 bits on x86_64. On Linux, uint_fast32_t is 8 bytes, so the situation is different.
Many programmers use int when they need to return a small value (let's say in the range [-32,32]). This would work perfectly if int were the platform's native integer size, but since it is not on 64-bit platforms, another type that matches the platform's native size is a better choice (unless it is frequently used with other integers of smaller size).
Basically, regardless of what the standard says, uint_fast32_t is broken on some implementations anyway. If you care about the additional instructions generated in some places, you should define your own "native" integer type. Or you can use size_t for this purpose, as it will usually match the native size (I am not including old and obscure platforms like 8086, only platforms that can run Windows, Linux, etc.).
Another sign that int was supposed to be a native integer type is the "integer promotion rule". Most CPUs can only perform operations on native-size values, so a 32-bit CPU usually can only do 32-bit additions, subtractions, etc. (Intel CPUs are an exception here). Integer types of other sizes are supported only through load and store instructions. For example, an 8-bit value should be loaded with the appropriate "load 8-bit signed" or "load 8-bit unsigned" instruction, which expands the value to 32 bits after the load. Without the integer promotion rule, C compilers would have to add a little more code for expressions that use types smaller than the native type. Unfortunately, this no longer holds on 64-bit architectures, as compilers now have to emit additional instructions in some cases (as was shown above).
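A minimal sketch of the integer promotion rule in action. The function names are made up for this example; the behaviour follows from the rule that operands narrower than int are converted to int before arithmetic.

```c
#include <stdint.h>

/* Both operands are promoted to int before the addition, so 200 + 100 is 300. */
int sum_as_int(uint8_t a, uint8_t b)
{
    return a + b;
}

/* Storing the result back into uint8_t reduces it modulo 256. */
uint8_t sum_as_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}
```

The addition itself happens at int width in both functions; only the final conversion truncates, which is the "expand on load, operate at native size" model described above.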
#9
3
For practical purposes, uint_fast32_t is completely useless. It's defined incorrectly on the most widespread platform (x86_64), and doesn't really offer any advantages anywhere unless you have a very low-quality compiler. Conceptually, it never makes sense to use the "fast" types in data structures/arrays: any savings you get from the type being more efficient to operate on will be dwarfed by the cost (cache misses, etc.) of increasing the size of your working data set. And for individual local variables (loop counters, temps, etc.) a non-toy compiler can usually just work with a larger type in the generated code if that's more efficient, and only truncate to the nominal size when necessary for correctness (and with signed types, it's never necessary).
The one variant that is theoretically useful is uint_least32_t, for when you need to be able to store any 32-bit value, but want to be portable to machines that lack an exact-size 32-bit type. Practically speaking, however, that's not something you need to worry about.
#10
2
In many cases, when an algorithm works on an array of data, the best way to improve performance is to minimize the number of cache misses. The smaller each element, the more of them can fit into the cache. This is why a lot of code is still written to use 32-bit pointers on 64-bit machines: they don’t need anything close to 4 GiB of data, but the cost of making all pointers and offsets need eight bytes instead of four would be substantial.
There are also some ABIs and protocols specified to need exactly 32 bits, for example, IPv4 addresses. That’s what uint32_t really means: use exactly 32 bits, regardless of whether that’s efficient on the CPU or not. These used to be declared as long or unsigned long, which caused a lot of problems during the 64-bit transition. If you just need an unsigned type that holds numbers up to at least 2³²-1, that’s been the definition of unsigned long since the first C standard came out. In practice, though, enough old code assumed that a long could hold any pointer or file offset or timestamp, and enough old code assumed that it was exactly 32 bits wide, that compilers can’t necessarily make long the same as int_fast32_t without breaking too much stuff.
In theory, it would be more future-proof for a program to use uint_least32_t, and maybe even load uint_least32_t elements into a uint_fast32_t variable for calculations. An implementation that had no uint32_t type at all could even declare itself in formal compliance with the standard! (It just wouldn’t be able to compile many existing programs.) In practice, there’s no architecture any more where int, uint32_t, and uint_least32_t are not the same, and no advantage, currently, to the performance of uint_fast32_t. So why overcomplicate things?
Yet look at the reason all the 32_t types needed to exist when we already had long, and you’ll see that those assumptions have blown up in our faces before. Your code might well end up running someday on a machine where exact-width 32-bit calculations are slower than the native word size, and you would have been better off using uint_least32_t for storage and uint_fast32_t for calculation religiously. Or if you’ll cross that bridge when you get to it and just want something simple, there’s unsigned long.
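A sketch of the "uint_least32_t for storage, uint_fast32_t for calculation" discipline described above. The function name sum_mod32 is hypothetical. Because neither type is guaranteed to wrap at 2^32, the code masks explicitly wherever 32-bit wraparound is wanted.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum modulo 2^32: compact uint_least32_t storage, uint_fast32_t arithmetic. */
uint_least32_t sum_mod32(const uint_least32_t *data, size_t n)
{
    uint_fast32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        /* Mask explicitly: neither type is guaranteed to wrap at 2^32. */
        acc = (acc + data[i]) & 0xFFFFFFFFu;
    }
    return (uint_least32_t)acc;
}
```

This is portable even to an implementation with no uint32_t at all, at the price of the extra masking that uint32_t would have provided for free.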