Merging in place in an array

Time: 2021-10-20 21:25:48

I came across the following question.

Given an array of n elements and an integer k where k < n, elements {a0...ak} and {ak+1...an} are already sorted. Give an algorithm to sort the array in O(n) time and O(1) space.

It does not seem to me like it can be done in O(n) time and O(1) space. The problem really seems to be asking how to do the merge step of mergesort but in-place. If it was possible, wouldn't mergesort be implemented that way? I am unable to convince myself though and need some opinion.
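
To make the difficulty concrete, here is a minimal sketch (my own illustration, not taken from any of the references in this thread) of the obvious in-place merge. It uses O(1) extra space, but every out-of-order element is inserted by shifting, so the worst case is quadratic rather than O(n). The function name `merge_in_place_naive` and the parameter `mid` (which would be k + 1 in the question's notation) are assumptions made for the example.

```python
def merge_in_place_naive(arr, mid):
    """Merge sorted arr[:mid] and arr[mid:] using O(1) extra space.

    Worst-case time is quadratic, because each element taken from the
    right run is inserted by shifting the rest of the left run over.
    """
    i, j = 0, mid          # arr[i:j] is the unmerged left run, arr[j:] the right
    n = len(arr)
    while i < j and j < n:
        if arr[i] <= arr[j]:
            i += 1         # arr[i] is already in its final position
        else:
            # arr[j] belongs before arr[i]: shift arr[i:j] right by one slot.
            value = arr[j]
            k = j
            while k > i:
                arr[k] = arr[k - 1]
                k -= 1
            arr[i] = value
            i += 1
            j += 1
    return arr


print(merge_in_place_naive([1, 3, 5, 7, 9, 2, 4, 6, 8], 5))
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

The open question is whether the shifting can be avoided without falling back on an O(n) scratch array.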

3 Answers

#1


9  

This seems to indicate that it is possible to do in O(lg^2 n) space. I cannot see how to prove that it is impossible to merge in constant space, but I cannot see how to do it either.

Edit: Chasing references, Knuth Vol 3 - Exercise 5.5.3 says "A considerably more complicated algorithm of L. Trabb-Pardo provides the best possible answer to this problem: It is possible to do stable merging in O(n) time and stable sorting in O(n lg n) time, using only O(lg n) bits of auxiliary memory for a fixed number of index variables."

More references that I have not read. Thanks for an interesting problem.

Further edit: This article claims that the article by Huang and Langston has an algorithm that merges two lists of size m and n in time O(m + n), so the answer to your question would seem to be yes. Unfortunately I do not have access to the article, so I must trust the second-hand information. I'm not sure how to reconcile this with Knuth's pronouncement that the Trabb-Pardo algorithm is optimal. If my life depended on it, I'd go with Knuth.

I now see that this has already been asked on Stack Overflow a number of times. I don't have the heart to flag it as a duplicate.

Huang B.-C. and Langston M. A., Practical in-place merging, Comm. ACM 31 (1988) 348-352

#2


2  

There are several algorithms for doing this, none of which are very easy to intuit. The key idea is to use part of the arrays being merged as a buffer, then do a standard merge using this buffer as auxiliary space. If you can then reposition the elements so that the buffer elements are in the right place, you're golden.
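
As a rough illustration of that key idea only, here is a sketch of the basic building block: merging two adjacent sorted runs while using another region of the same array as scratch space. This is not the full Huang and Langston algorithm, which also has to carve the buffer out of the input, work block by block, and sort the buffer at the end. The name `merge_with_internal_buffer` and its parameters are hypothetical, and the buffer region is assumed not to overlap the runs being merged. Elements are swapped rather than copied, so the buffer's contents survive, just possibly in a different order.

```python
def merge_with_internal_buffer(arr, buf, lo, mid, hi):
    """Merge sorted arr[lo:mid] and arr[mid:hi] in place.

    arr[buf:buf + (mid - lo)] is used as scratch space and must not
    overlap arr[lo:hi]. Only swaps are used, so the buffer keeps the
    same elements (possibly permuted). Extra space is O(1).
    """
    n_left = mid - lo
    # Move the left run into the buffer; buffer elements land in arr[lo:mid].
    for t in range(n_left):
        arr[buf + t], arr[lo + t] = arr[lo + t], arr[buf + t]

    i, end_i = buf, buf + n_left   # left run now lives in the buffer region
    j, out = mid, lo               # right run is untouched; output starts at lo
    # Invariant: arr[out:j] holds only displaced buffer elements.
    while i < end_i and j < hi:
        if arr[j] < arr[i]:        # strict '<' keeps the merge stable
            arr[out], arr[j] = arr[j], arr[out]
            j += 1
        else:
            arr[out], arr[i] = arr[i], arr[out]
            i += 1
        out += 1
    # Leftover left-run elements go to the tail; leftover right-run
    # elements are already in their final positions.
    while i < end_i:
        arr[out], arr[i] = arr[i], arr[out]
        i += 1
        out += 1


data = [9, 8, 7, 1, 3, 5, 2, 4, 6]   # buffer = data[0:3], runs = data[3:6], data[6:9]
merge_with_internal_buffer(data, 0, 3, 6, 9)
print(data[3:])          # [1, 2, 3, 4, 5, 6]
print(sorted(data[:3]))  # [7, 8, 9]
```

The merge itself is linear and uses a constant number of index variables; the hard part of the real algorithms is arranging for a suitable buffer to exist inside the input and putting it back in order afterwards.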

I have written up an implementation of one of these algorithms on my personal site if you're interested in looking at it. It's based on the paper "Practical In-Place Merging" by Huang and Langston. You probably will want to look over that paper for some insight.

I've also heard that there are good adaptive algorithms for this, which use some fixed-size buffer of your choosing (which could be O(1) if you wanted), but then scale elegantly with the buffer size. I don't know any of these off the top of my head, but I'm sure a quick search for "adaptive merge" might turn something up.

#3


1  

No it isn't possible, although my job would be much easier if it was :).

You have an O(log n) factor which you can't avoid. You can choose to take it as time or space, but the only way to avoid it is to not sort. With O(log n) space you can build a list of continuations that keep track of where you stashed the elements that didn't quite fit. With recursion this can be made to fit in O(1) heap, but that's only by using O(log n) stack frames instead.
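
For illustration, one well-known scheme with this kind of cost profile is a recursive merge built on rotations. The sketch below is an assumption about what such a scheme could look like, not the method this answer has in mind (the answer gives no code); the names `merge_by_rotation` and `_rotate` are hypothetical. It allocates nothing on the heap, recurses to O(log n) depth, and takes O(n log n) time rather than O(n).

```python
from bisect import bisect_left, bisect_right


def _reverse(arr, lo, hi):
    """Reverse arr[lo:hi] in place."""
    hi -= 1
    while lo < hi:
        arr[lo], arr[hi] = arr[hi], arr[lo]
        lo += 1
        hi -= 1


def _rotate(arr, lo, mid, hi):
    """Turn arr[lo:mid] + arr[mid:hi] into arr[mid:hi] + arr[lo:mid]."""
    _reverse(arr, lo, mid)
    _reverse(arr, mid, hi)
    _reverse(arr, lo, hi)


def merge_by_rotation(arr, lo, mid, hi):
    """Merge sorted arr[lo:mid] and arr[mid:hi] in place, recursively."""
    len1, len2 = mid - lo, hi - mid
    if len1 == 0 or len2 == 0:
        return
    if len1 + len2 == 2:
        if arr[mid] < arr[lo]:
            arr[lo], arr[mid] = arr[mid], arr[lo]
        return
    if len1 > len2:
        # Pivot on the middle of the left run; find its place in the right run.
        cut1 = lo + len1 // 2
        cut2 = bisect_left(arr, arr[cut1], mid, hi)
    else:
        # Pivot on the middle of the right run; find its place in the left run.
        cut2 = mid + len2 // 2
        cut1 = bisect_right(arr, arr[cut2], lo, mid)
    # Bring the small part of the right run in front of the large part of the left.
    _rotate(arr, cut1, mid, cut2)
    new_mid = cut1 + (cut2 - mid)
    merge_by_rotation(arr, lo, cut1, new_mid)
    merge_by_rotation(arr, new_mid, cut2, hi)


data = [1, 3, 5, 7, 9, 2, 4, 6, 8]
merge_by_rotation(data, 0, 5, len(data))
print(data)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Each level of recursion does linear work in rotations, which is where the extra log factor in time comes from.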

Here is the progress of merge-sorting odds and evens from 1-9. Notice how you require log-space accounting to track the order inversions caused by the twin constraints of constant space and linear swaps.

.     -
135792468
 .   -
135792468
  :  .-
125793468
   : .-
123795468
    #.:-
123495768
     :.-
123459768
      .:-
123456798
       .-
123456789

123456789

There are some delicate boundary conditions, slightly harder to get right than binary search, even in this (possible) form. That makes it a bad homework problem, but a really good mental exercise.

Update: Apparently I am mistaken and there is an algorithm that provides O(n) time and O(1) space. I have downloaded the papers to enlighten myself, and withdraw this answer as incorrect.
