How does a 32-bit processor support 64-bit integers?

Time: 2022-01-29 12:00:52

In C++, you can use an int, which is usually 4 bytes. A long long integer is usually 8 bytes. If the CPU is 32-bit, wouldn't that limit it to 32-bit numbers? How come I can use a long long integer if the CPU doesn't support 64 bits? Can the ALU add larger integers or something?

5 answers

#1


15  

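Most processors include a carry flag and an overflow flag to support operations on multi-word integers. The carry flag signals unsigned overflow and is what chains one word's addition into the next; the overflow flag signals signed overflow, which is meaningful only on the most significant word.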

For example, on an x86 you could add two unsigned 64-bit numbers (which we'll assume are in EDX:EAX and EBX:ECX) something like this:

add eax, ecx  ; this does an add, ignoring the carry flag
adc edx, ebx  ; this adds the carry flag along with the numbers
; sum in edx:eax

It's possible to implement this sort of thing by hand in higher level languages like C++ as well, but standard C++ gives you no direct access to the carry flag, so the code typically ends up substantially slower than when it's written in assembly language.
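As a rough illustration of what the hand-written version looks like, here is a minimal C++ sketch that mirrors the add/adc pair. The names U64 and add64 are made up for this example, and the carry is recovered by checking whether the low-word sum wrapped around, since there is no flag to read.

```cpp
#include <cstdint>

// A 64-bit value held as two 32-bit halves, similar to a register pair such as EDX:EAX.
// (U64 and add64 are hypothetical names used only in this sketch.)
struct U64 {
    std::uint32_t lo;  // low-order 32 bits
    std::uint32_t hi;  // high-order 32 bits
};

// Adds two 64-bit values using only 32-bit arithmetic.
constexpr U64 add64(U64 a, U64 b) {
    U64 r{};
    r.lo = a.lo + b.lo;                             // like `add`: wraps modulo 2^32
    std::uint32_t carry = (r.lo < a.lo) ? 1u : 0u;  // the sum wrapped, so there was a carry out
    r.hi = a.hi + b.hi + carry;                     // like `adc`: add the carry along with the high words
    return r;
}
```

The extra comparison is the price of not having the carry flag available, which is one reason this style tends to be slower than the two-instruction assembly version.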

Addition is basically serial in nature. When you're doing addition at the binary level, you take two input bits (plus any incoming carry) and produce one result bit and one carry bit. The carry bit is then used as an input when adding the next more significant bit, and so on across the word (known as a "ripple-carry adder", because the carry "ripples" across the word).
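To make the bit-by-bit process concrete, here is a small C++ sketch that simulates a ripple-carry adder in software. The function name ripple_add is invented for this example; real hardware does this with gates, not a loop, so treat it purely as a model of the data dependency.

```cpp
#include <cstdint>

// Adds two 32-bit words one bit at a time, modelling a ripple-carry adder.
// (ripple_add is a hypothetical name used only in this sketch.)
constexpr std::uint32_t ripple_add(std::uint32_t a, std::uint32_t b) {
    std::uint32_t sum = 0;
    std::uint32_t carry = 0;                  // nothing is carried into bit 0
    for (int i = 0; i < 32; ++i) {
        std::uint32_t x = (a >> i) & 1u;      // bit i of a
        std::uint32_t y = (b >> i) & 1u;      // bit i of b
        sum |= (x ^ y ^ carry) << i;          // result bit of a full adder
        carry = (x & y) | (carry & (x ^ y));  // carry passed on to bit i + 1
    }
    return sum;  // the carry out of bit 31 is dropped, as in a 32-bit register
}
```

Each iteration needs the carry computed by the previous one, which is exactly the serial dependency described above.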

There are more sophisticated ways to do addition (carry-lookahead adders, for example) that shorten the chain of dependencies between one bit and the next, and most current hardware uses such things.

In the worst case, however, adding 1 to a number that's already the largest a given word size supports will result in generating a carry from every bit to the next, all the way across the word.
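If you want to see that worst case for yourself, the following C++ sketch counts how many bit positions hand a carry to their neighbour during one addition. The name carries_generated is made up here, and the loop is only a software model of the adder.

```cpp
#include <cstdint>

// Counts how many bit positions produce a carry out while adding a and b.
// (carries_generated is a hypothetical name used only in this sketch.)
constexpr int carries_generated(std::uint32_t a, std::uint32_t b) {
    int count = 0;
    std::uint32_t carry = 0;
    for (int i = 0; i < 32; ++i) {
        std::uint32_t x = (a >> i) & 1u;
        std::uint32_t y = (b >> i) & 1u;
        carry = (x & y) | (carry & (x ^ y));  // carry out of bit i
        count += static_cast<int>(carry);
    }
    return count;
}
```

Adding 1 to 0xFFFFFFFF makes every one of the 32 positions carry, while adding 1 and 2 produces no carries at all.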

That means that (to at least some extent) the word width a CPU supports imposes a limit on the maximum clock speed at which it can run. If somebody wanted to badly enough, they could build a CPU that worked with, say, 1024-bit operands. If they did that, however, they'd have two choices: either run it at a lower clock speed, or else take multiple clocks to add a single pair of operands.

Also note that as you widen operands like that, you need more storage (e.g., larger cache) to store as many operands, more gates to carry out each individual operation, and so on.

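So given identical technology, you could have a 64-bit processor that ran at 4 GHz and had, say, 4 megabytes of cache, or a 1024-bit processor that ran at about 250 MHz and had, perhaps, 2 megabytes of cache. (The 250 MHz figure is just 4 GHz scaled down by the same factor of 16 by which the operand width grew; it's an illustration, not a measurement.)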

The latter would probably be a win if most of your work was on 1024-bit (or larger) operands. Most people don't do math on 1024-bit operands very often at all though. In fact, 64-bit numbers are large enough for most purposes. As such, supporting wider operands would probably turn out to be a net loss for most people most of the time.

#2


4  

Essentially, what would normally be a single add instruction is broken into two (or three) steps:

1) Add the low-order 32 bits using the usual add instruction. Note whether this addition would generate a "carry out" bit (that is, if the result would actually require 33 bits to represent).

2) Add the high-order 32 bits the same way. If there was a carry-out from the low-order bits, set the carry-in bit here (or, alternatively, add one to the result after adding).
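The two steps can be sketched in C++ roughly as follows. Sum64 and add_in_two_steps are invented names, and the sketch uses a wider temporary purely to make the 33rd bit visible; on a real 32-bit machine the hardware carry flag plays that role.

```cpp
#include <cstdint>

// Result of a 64-bit addition, kept as two 32-bit halves.
// (Sum64 and add_in_two_steps are hypothetical names used only in this sketch.)
struct Sum64 {
    std::uint32_t lo;
    std::uint32_t hi;
};

constexpr Sum64 add_in_two_steps(std::uint32_t a_lo, std::uint32_t a_hi,
                                 std::uint32_t b_lo, std::uint32_t b_hi) {
    // Step 1: add the low-order halves; the exact sum may need 33 bits.
    std::uint64_t low = static_cast<std::uint64_t>(a_lo) + b_lo;
    std::uint32_t carry_out = static_cast<std::uint32_t>(low >> 32);  // 1 if a 33rd bit was needed

    // Step 2: add the high-order halves plus the carry from step 1.
    std::uint32_t hi = a_hi + b_hi + carry_out;

    return Sum64{static_cast<std::uint32_t>(low), hi};  // keep only the low 32 bits of step 1
}
```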

#3


4  

It's possible to support arbitrarily wide integers (through software implementation), even if the underlying hardware only supports fewer bits directly. If a 32-bit integer is added to another 32-bit integer, it could overflow and require 33 bits to store the answer. Software can detect that this overflow occurred (the processor has a carry flag that can be checked), and another 32-bit word that represents the most significant bits of the 64-bit number can be incremented by 1.
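The same idea extends to any number of words. Below is a minimal C++ sketch, assuming the number is stored as an array of 32-bit words with the least significant word first; add_wide is a made-up name, and the carry is detected by comparison because C++ cannot read the flag directly.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Adds two N-word unsigned integers; index 0 holds the least significant word.
// (add_wide is a hypothetical name used only in this sketch.)
template <std::size_t N>
constexpr std::array<std::uint32_t, N> add_wide(const std::array<std::uint32_t, N>& a,
                                                const std::array<std::uint32_t, N>& b) {
    std::array<std::uint32_t, N> r{};
    std::uint32_t carry = 0;
    for (std::size_t i = 0; i < N; ++i) {
        std::uint32_t partial = a[i] + b[i];             // may wrap modulo 2^32
        std::uint32_t c1 = (partial < a[i]) ? 1u : 0u;   // wrapped while adding the two words
        r[i] = partial + carry;                          // bring in the carry from the word below
        std::uint32_t c2 = (r[i] < partial) ? 1u : 0u;   // wrapped while adding the carry
        carry = c1 | c2;                                 // at most one of the two can be set
    }
    return r;  // a carry out of the top word is discarded
}
```

With N = 2 this is the 64-bit case; larger N gives 128-bit, 256-bit, and so on, at the cost of one loop iteration per word.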

Here's a little more on the carry flag and how it's used.

#4


3  

You use two memory locations to store the number. Half of the number is stored at one location in memory, and the other half in the adjacent memory location.
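For illustration, here is a short C++ sketch of splitting a 64-bit value into two 32-bit halves and putting it back together. Halves, split and join are invented names; which half lands at the lower address depends on the machine (on little-endian x86 it is the low half).

```cpp
#include <cstdint>

// The two 32-bit pieces of one 64-bit number.
// (Halves, split and join are hypothetical names used only in this sketch.)
struct Halves {
    std::uint32_t lo;  // least significant 32 bits
    std::uint32_t hi;  // most significant 32 bits
};

constexpr Halves split(std::uint64_t v) {
    return Halves{static_cast<std::uint32_t>(v & 0xFFFFFFFFu),
                  static_cast<std::uint32_t>(v >> 32)};
}

constexpr std::uint64_t join(Halves h) {
    return (static_cast<std::uint64_t>(h.hi) << 32) | h.lo;
}
```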

#5


0  

You might also consider that we used to cope with 16-bit or even 32-bit integers back in the days of 8-bit CPUs. Nothing restricts any particular ALU from handling numbers of arbitrary size other than memory space and, ultimately, I suppose, the patience of the user.
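As a sketch of how that worked, here is a 16-bit addition done with nothing wider than 8-bit pieces. The function name add16_with_8bit_steps is made up, and C++ integer promotion means the intermediate sums are really computed as int and then truncated back to 8 bits, so this models the 8-bit behaviour rather than reproducing it exactly.

```cpp
#include <cstdint>

// Adds two 16-bit numbers as an 8-bit CPU would: low bytes first, then high bytes plus carry.
// (add16_with_8bit_steps is a hypothetical name used only in this sketch.)
constexpr std::uint16_t add16_with_8bit_steps(std::uint16_t a, std::uint16_t b) {
    std::uint8_t a_lo = static_cast<std::uint8_t>(a & 0xFF);
    std::uint8_t a_hi = static_cast<std::uint8_t>(a >> 8);
    std::uint8_t b_lo = static_cast<std::uint8_t>(b & 0xFF);
    std::uint8_t b_hi = static_cast<std::uint8_t>(b >> 8);

    std::uint8_t lo = static_cast<std::uint8_t>(a_lo + b_lo);  // truncated to 8 bits
    std::uint8_t carry = (lo < a_lo) ? 1 : 0;                  // low-byte sum wrapped
    std::uint8_t hi = static_cast<std::uint8_t>(a_hi + b_hi + carry);

    return static_cast<std::uint16_t>((hi << 8) | lo);
}
```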

Smalltalk, for example, has provided arbitrary-length integers ever since the original Dorados and Altos, which takes us back to the 1970s. Want the exact value of 963! (that is, 963 factorial)? Just do it. It'll take a while to format it for printing, though.
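This is not how Smalltalk implements it, but as a rough C++ sketch of the same idea, a big factorial can be built from 32-bit words by multiplying one word at a time and carrying the overflow upward. Big and factorial are invented names, the capacity N is fixed by the caller (something like 963! would need on the order of a few hundred words, which is an estimate, not a checked figure), and a 64-bit temporary is used to hold each word-sized product.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Fixed-capacity unsigned big integer, least significant 32-bit word first.
// (Big and factorial are hypothetical names used only in this sketch.)
template <std::size_t N>
using Big = std::array<std::uint32_t, N>;

// Computes n! into N words; the caller must choose N large enough,
// otherwise the high-order part of the result is silently lost.
template <std::size_t N>
constexpr Big<N> factorial(std::uint32_t n) {
    Big<N> r{};
    r[0] = 1;
    for (std::uint32_t k = 2; k <= n; ++k) {
        std::uint64_t carry = 0;
        for (std::size_t i = 0; i < N; ++i) {
            // 32-bit word times 32-bit factor plus carry always fits in 64 bits.
            std::uint64_t t = static_cast<std::uint64_t>(r[i]) * k + carry;
            r[i] = static_cast<std::uint32_t>(t);  // keep the low 32 bits in this word
            carry = t >> 32;                       // pass the rest to the next word
        }
    }
    return r;
}
```

Turning the resulting words into decimal digits takes repeated division by 10 across the whole array, which is why printing is the slow part.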
