我应该使用什么数据结构来创建我自己的“BigInteger”类?

时间:2021-11-07 16:51:38

As an optional assignment, I'm thinking about writing my own implementation of the BigInteger class, where I will provide my own methods for addition, subtraction, multiplication, etc.

作为一个可选的作业,我正在考虑编写我自己的BigInteger类实现,在那里我将提供我自己的方法来进行加法、减法、乘法等。

This will be for arbitrarily long integer numbers, even hundreds of digits long.

这将适用于任意长整数,甚至数百位数字。

While doing the math on these numbers, digit by digit isn't hard, what do you think the best datastructure would be to represent my "BigInteger"?

在对这些数字进行计算时,逐位数并不难,您认为最好的数据结构是表示“BigInteger”吗?

At first I was considering using an Array but then I was thinking I could still potentially overflow (run out of array slots) after a large add or multiplication. Would this be a good case to use a linked list, since I can tack on digits with O(1) time complexity?

一开始我考虑使用数组,但后来我想,在进行了大量的添加或乘法之后,仍然有可能溢出(耗尽数组槽)。使用链表是否合适,因为我可以在数字上加上O(1)时间复杂度?

Is there some other data-structure that would be even better suited than a linked list? Should the type that my data-structure holds be the smallest possible integer type I have available to me?

还有什么比链表更适合的数据结构吗?我的数据结构持有的类型是否应该是我可用的最小的整数类型?

Also, should I be careful about how I store my "carry" variable? Should it, itself, be of my "BigInteger" type?

另外,我是否应该注意如何存储“carry”变量?它本身应该属于我的“BigInteger”类型吗?

10 个解决方案

#1


3  

Check out the book C Interfaces and Implementations by David R. Hanson. It has 2 chapters on the subject, covering the vector structure, word size and many other issues you are likely to encounter.

请参阅David R. Hanson的书C接口和实现。它有两章关于这个主题,包括向量结构,字数和许多你可能遇到的其他问题。

It's written for C, but most of it is applicable to C++ and/or Java. And if you use C++ it will be a bit simpler because you can use something like std::vector to manage the array allocation for you.

它是为C编写的,但是大多数都适用于c++和/或Java。如果你使用c++,它会简单一些,因为你可以使用std::vector来管理你的数组分配。

#2


1  

Always use the smallest int type that will do the job you need (bytes). A linked list should work well, since you won't have to worry about overflowing.

始终使用最小的int类型来完成所需的工作(字节)。链接列表应该可以很好地工作,因为您不必担心溢出。

#3


1  

If you use binary trees (whose leaves are ints), you get all the advantages of the linked list (unbounded number of digits, etc) with simpler divide-and-conquer algorithms. You do not have in this case a single base but many depending the level at which you're working.

如果您使用二叉树(它的叶子是ints),您可以使用更简单的分治算法获得链表的所有优点(*位数等)。在这种情况下,你没有一个单一的碱基,而是有很多碱基,这取决于你的工作水平。

If you do this, you need to use a BigInteger for the carry. You may consider it an advantage of the "linked list of ints" approach that the carry can always be represented as an int (and this is true for any base, not just for base 10 as most answers seem to assume that you should use... In any base, the carry is always a single digit)

如果这样做,您需要使用一个BigInteger作为进位。您可能会认为“ints链表”方法的一个优点是进位总是可以表示为int(这对任何base都适用,而不仅仅是base 10,因为大多数答案似乎认为您应该使用……在任何底数中,进位总是个位)

I might as well say it: it would be a terrible waste to use base 10 when you can use 2^30 or 2^31.

我不妨说:这将是一个可怕的浪费使用基地10当你可以使用2 ^ 30或2 ^ 31。

#4


1  

Accessing elements of linked lists is slow. I think arrays are the way to go, with lots of bound checking and run time array resizing as needed.

访问链表的元素很慢。我认为数组是一种方法,需要进行大量的绑定检查和运行时数组大小调整。


Clarification: Traversing a linked list and traversing an array are both O(n) operations. But traversing a linked list requires deferencing a pointer at each step. Just because two algorithms both have the same complexity it doesn't mean that they both take the same time to run. The overhead of allocating and deallocating n nodes in a linked list will also be much heavier than memory management of a single array of size n, even if the array has to be resized a few times.

说明:遍历链表和遍历数组都是O(n)操作。但是遍历一个链表需要在每一步都延迟一个指针。仅仅因为两种算法都具有相同的复杂度,并不意味着它们运行的时间相同。在链表中分配和释放n个节点的开销也要比单个大小为n的数组的内存管理大得多,即使数组需要调整几次大小。

#5


1  

Wow, there are some… interesting answers here. I'd recommend reading a book rather than try to sort through all this contradictory advice.

哇,这里有一些有趣的答案。我建议你读一本书,而不是试着把这些自相矛盾的建议整理出来。

That said, C/C++ is also ill-suited to this task. Big-integer is a kind of extended-precision math. Most CPUs provide instructions to handle extended-precision math at comparable or same speed (bits per instruction) as normal math. When you add 2^32+2^32, the answer is 0… but there is also a special carry output from the processor's ALU which a program can read and use.

也就是说,C/ c++也不适合这个任务。大整数是一种扩展精度的数学。大多数cpu都提供指令以与普通数学相当或相同的速度(每条指令位)处理扩展精度的数学。当你加入2 ^ 32 + 2 ^ 32,答案是0…但是也有一个特殊的将输出从处理器的ALU程序可以读取和使用。

C++ cannot access that flag, and there's no way in C either. You have to use assembler.

c++不能访问这个标志,而且在C中也没有办法。你必须使用汇编程序。

Just to satisfy curiosity, you can use the standard Boolean arithmetic to recover carry bits etc. But you will be much better off downloading an existing library.

为了满足您的好奇心,您可以使用标准的布尔运算来恢复进位等等。但是您最好下载现有的库。

#6


0  

I would say an array of ints.

我要说的是一个ints数组。

#7


0  

An Array is indeed a natural fit. I think it is acceptable to throw OverflowException, when you run out of place in your memory. The teacher will see attention to detail.

数组确实是一种自然匹配。我认为抛出OverflowException是可以接受的,当你在你的记忆中失去了位置。老师会注意细节。

A multiplication roughly doubles digit numbers, addition increases it by at most 1. It is easy to create a sufficiently big Array to store the result of your operation.

一个数的乘积大约是两位数的两倍,相加最多增加1。很容易创建一个足够大的数组来存储操作结果。

Carry is at most a one-digit long number in multiplication (9*9 = 1, carry 8). A single int will do.

在乘法(9*9 = 1,Carry 8)中,进位最多为1位数,一个整数就可以了。

#8


0  

std::vector<bool> or std::vector<unsigned int> is probably what you want. You will have to push_back() or resize() on them as you need more space for multiplies, etc. Also, remember to push_back the correct sign bits if you're using two-compliment.

向量 或std::向量 可能就是你想要的。您将不得不对它们进行回推()或调整大小(),因为您需要更多的空间来进行多重运算,等等。此外,如果使用双恭维语,请记住要回推正确的符号位。

#9


0  

i would say a std::vector of char (since it has to hold only 0-9) (if you plan to work in BCD)

我会说std:: char向量(因为它只能容纳0-9)(如果你打算在BCD工作)

If not BCD then use vector of int (you didnt make it clear)

如果不是BCD,则使用int向量(您没有说清楚)

Much less space overhead that link list

链接列表的空间开销要小得多

And all advice says 'use vector unless you have a good reason not too'

所有的建议都说"除非你有充分的理由否则不要使用矢量"

#10


0  

As a rule of thumb, use std::vector instead of std::list, unless you need to insert elements in the middle of the sequence very often. Vectors tend to be faster, since they are stored contiguously and thus benefit from better spatial locality (a major performance factor on modern platforms).

根据经验,使用std::vector而不是std::list,除非您经常需要在序列中间插入元素。向量往往更快,因为它们是连续存储的,因此得益于更好的空间局部性(现代平台上的主要性能因素)。

Make sure you use elements that are natural for the platform. If you want to be platform independent, use long. Remember that unless you have some special compiler intrinsics available, you'll need a type at least twice as large to perform multiplication.

确保您使用了适合平台的元素。如果您想独立于平台,请使用long。请记住,除非您有一些特殊的编译器intrinsic可用,否则您将需要一个至少两倍大的类型来执行乘法。

I don't understand why you'd want carry to be a big integer. Carry is a single bit for addition and element-sized for multiplication.

我不明白为什么你会想让进位是一个大整数。进位是加法的一个位,乘法的元素大小。

Make sure you read Knuth's Art of Computer Programming, algorithms pertaining to arbitrary precision arithmetic are described there to a great extent.

请务必阅读Knuth的《计算机编程艺术》,其中对任意精度算法进行了很大程度的描述。

#1


3  

Check out the book C Interfaces and Implementations by David R. Hanson. It has 2 chapters on the subject, covering the vector structure, word size and many other issues you are likely to encounter.

请参阅David R. Hanson的书C接口和实现。它有两章关于这个主题,包括向量结构,字数和许多你可能遇到的其他问题。

It's written for C, but most of it is applicable to C++ and/or Java. And if you use C++ it will be a bit simpler because you can use something like std::vector to manage the array allocation for you.

它是为C编写的,但是大多数都适用于c++和/或Java。如果你使用c++,它会简单一些,因为你可以使用std::vector来管理你的数组分配。

#2


1  

Always use the smallest int type that will do the job you need (bytes). A linked list should work well, since you won't have to worry about overflowing.

始终使用最小的int类型来完成所需的工作(字节)。链接列表应该可以很好地工作,因为您不必担心溢出。

#3


1  

If you use binary trees (whose leaves are ints), you get all the advantages of the linked list (unbounded number of digits, etc) with simpler divide-and-conquer algorithms. You do not have in this case a single base but many depending the level at which you're working.

如果您使用二叉树(它的叶子是ints),您可以使用更简单的分治算法获得链表的所有优点(*位数等)。在这种情况下,你没有一个单一的碱基,而是有很多碱基,这取决于你的工作水平。

If you do this, you need to use a BigInteger for the carry. You may consider it an advantage of the "linked list of ints" approach that the carry can always be represented as an int (and this is true for any base, not just for base 10 as most answers seem to assume that you should use... In any base, the carry is always a single digit)

如果这样做,您需要使用一个BigInteger作为进位。您可能会认为“ints链表”方法的一个优点是进位总是可以表示为int(这对任何base都适用,而不仅仅是base 10,因为大多数答案似乎认为您应该使用……在任何底数中,进位总是个位)

I might as well say it: it would be a terrible waste to use base 10 when you can use 2^30 or 2^31.

我不妨说:这将是一个可怕的浪费使用基地10当你可以使用2 ^ 30或2 ^ 31。

#4


1  

Accessing elements of linked lists is slow. I think arrays are the way to go, with lots of bound checking and run time array resizing as needed.

访问链表的元素很慢。我认为数组是一种方法,需要进行大量的绑定检查和运行时数组大小调整。


Clarification: Traversing a linked list and traversing an array are both O(n) operations. But traversing a linked list requires deferencing a pointer at each step. Just because two algorithms both have the same complexity it doesn't mean that they both take the same time to run. The overhead of allocating and deallocating n nodes in a linked list will also be much heavier than memory management of a single array of size n, even if the array has to be resized a few times.

说明:遍历链表和遍历数组都是O(n)操作。但是遍历一个链表需要在每一步都延迟一个指针。仅仅因为两种算法都具有相同的复杂度,并不意味着它们运行的时间相同。在链表中分配和释放n个节点的开销也要比单个大小为n的数组的内存管理大得多,即使数组需要调整几次大小。

#5


1  

Wow, there are some… interesting answers here. I'd recommend reading a book rather than try to sort through all this contradictory advice.

哇,这里有一些有趣的答案。我建议你读一本书,而不是试着把这些自相矛盾的建议整理出来。

That said, C/C++ is also ill-suited to this task. Big-integer is a kind of extended-precision math. Most CPUs provide instructions to handle extended-precision math at comparable or same speed (bits per instruction) as normal math. When you add 2^32+2^32, the answer is 0… but there is also a special carry output from the processor's ALU which a program can read and use.

也就是说,C/ c++也不适合这个任务。大整数是一种扩展精度的数学。大多数cpu都提供指令以与普通数学相当或相同的速度(每条指令位)处理扩展精度的数学。当你加入2 ^ 32 + 2 ^ 32,答案是0…但是也有一个特殊的将输出从处理器的ALU程序可以读取和使用。

C++ cannot access that flag, and there's no way in C either. You have to use assembler.

c++不能访问这个标志,而且在C中也没有办法。你必须使用汇编程序。

Just to satisfy curiosity, you can use the standard Boolean arithmetic to recover carry bits etc. But you will be much better off downloading an existing library.

为了满足您的好奇心,您可以使用标准的布尔运算来恢复进位等等。但是您最好下载现有的库。

#6


0  

I would say an array of ints.

我要说的是一个ints数组。

#7


0  

An Array is indeed a natural fit. I think it is acceptable to throw OverflowException, when you run out of place in your memory. The teacher will see attention to detail.

数组确实是一种自然匹配。我认为抛出OverflowException是可以接受的,当你在你的记忆中失去了位置。老师会注意细节。

A multiplication roughly doubles digit numbers, addition increases it by at most 1. It is easy to create a sufficiently big Array to store the result of your operation.

一个数的乘积大约是两位数的两倍,相加最多增加1。很容易创建一个足够大的数组来存储操作结果。

Carry is at most a one-digit long number in multiplication (9*9 = 1, carry 8). A single int will do.

在乘法(9*9 = 1,Carry 8)中,进位最多为1位数,一个整数就可以了。

#8


0  

std::vector<bool> or std::vector<unsigned int> is probably what you want. You will have to push_back() or resize() on them as you need more space for multiplies, etc. Also, remember to push_back the correct sign bits if you're using two-compliment.

向量 或std::向量 可能就是你想要的。您将不得不对它们进行回推()或调整大小(),因为您需要更多的空间来进行多重运算,等等。此外,如果使用双恭维语,请记住要回推正确的符号位。

#9


0  

i would say a std::vector of char (since it has to hold only 0-9) (if you plan to work in BCD)

我会说std:: char向量(因为它只能容纳0-9)(如果你打算在BCD工作)

If not BCD then use vector of int (you didnt make it clear)

如果不是BCD,则使用int向量(您没有说清楚)

Much less space overhead that link list

链接列表的空间开销要小得多

And all advice says 'use vector unless you have a good reason not too'

所有的建议都说"除非你有充分的理由否则不要使用矢量"

#10


0  

As a rule of thumb, use std::vector instead of std::list, unless you need to insert elements in the middle of the sequence very often. Vectors tend to be faster, since they are stored contiguously and thus benefit from better spatial locality (a major performance factor on modern platforms).

根据经验,使用std::vector而不是std::list,除非您经常需要在序列中间插入元素。向量往往更快,因为它们是连续存储的,因此得益于更好的空间局部性(现代平台上的主要性能因素)。

Make sure you use elements that are natural for the platform. If you want to be platform independent, use long. Remember that unless you have some special compiler intrinsics available, you'll need a type at least twice as large to perform multiplication.

确保您使用了适合平台的元素。如果您想独立于平台,请使用long。请记住,除非您有一些特殊的编译器intrinsic可用,否则您将需要一个至少两倍大的类型来执行乘法。

I don't understand why you'd want carry to be a big integer. Carry is a single bit for addition and element-sized for multiplication.

我不明白为什么你会想让进位是一个大整数。进位是加法的一个位,乘法的元素大小。

Make sure you read Knuth's Art of Computer Programming, algorithms pertaining to arbitrary precision arithmetic are described there to a great extent.

请务必阅读Knuth的《计算机编程艺术》,其中对任意精度算法进行了很大程度的描述。