Is it possible to rearrange an array in place in O(N)?

If I have a size N array of objects, and I have an array of unique numbers in the range 1...N, is there any algorithm to rearrange the object array in-place in the order specified by the list of numbers, and yet do this in O(N) time?

Context: I am doing a quick-sort-ish algorithm on objects that are fairly large in size, so it would be faster to do the swaps on indices than on the objects themselves, and only move the objects in one final pass. I'd just like to know if I could do this last pass without allocating memory for a separate array.

Edit: I am not asking how to do a sort in O(N) time, but rather how to do the post-sort rearranging in O(N) time with O(1) space. Sorry for not making this clear.

9 Answers

#1

I think this should do:

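static <T> void arrange(T[] data, int[] p) {
    // done[j] becomes true once data[j] holds its final value.
    boolean[] done = new boolean[p.length];
    for (int i = 0; i < p.length; i++) {
        if (!done[i]) {
            // i starts a new cycle; save its value before it is overwritten.
            T t = data[i];
            for (int j = i;;) {
                done[j] = true;

                if (p[j] != i) {
                    // Pull the value from p[j] into slot j and follow the cycle.
                    data[j] = data[p[j]];
                    j = p[j];
                } else {
                    // The cycle closes here: the saved value belongs in slot j.
                    data[j] = t;
                    break;
                }
            }
        }
    }
}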

Note: This is Java. If you do this in a language without garbage collection, be sure to delete done.

If you care about space, you can use a BitSet for done. I assume you can afford an additional bit per element because you seem willing to work with a permutation array, which is several times that size.

This algorithm copies instances of T n + k times, where k is the number of cycles in the permutation. You can reduce this to the optimal number of copies by skipping those i where p[i] = i.
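
As a rough illustration of the fixed-point shortcut, here is a minimal Python sketch of the same cycle-following idea. The function name arrange_skipping_fixed_points is hypothetical, the permutation is assumed to be 0-based, and a bytearray stands in for the done flags; it is one possible rendering, not the answer's own code.

```python
def arrange_skipping_fixed_points(data, p):
    """Rearrange data in place so that new data[k] == old data[p[k]].

    p is assumed to be a 0-based permutation of range(len(data)).
    """
    done = bytearray(len(p))          # one flag per element
    for i in range(len(p)):
        if done[i] or p[i] == i:      # already placed, or a fixed point: no copy
            continue
        saved = data[i]               # value that will close the cycle
        j = i
        while p[j] != i:
            done[j] = 1
            data[j] = data[p[j]]      # pull the next value of the cycle into slot j
            j = p[j]
        done[j] = 1
        data[j] = saved


letters = list("abcde")
arrange_skipping_fixed_points(letters, [2, 1, 3, 0, 4])
```

Indices 1 and 4 are fixed points in this example, so only the three-element cycle is touched.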

#2

The approach is to follow the "permutation cycles" of the permutation, rather than indexing the array left-to-right. But since you do have to begin somewhere, every time a new permutation cycle is needed, the search for unpermuted elements proceeds left-to-right:

// Pseudo-code
N : integer, N > 0 // N is the number of elements
swaps : integer [0..N]  // number of elements already moved into place
data[N] : array of object
permute[N] : array of integer [-1..N-1]  denoting permutation (used element is -1)
next_scan_start : integer;

swaps = 0;
next_scan_start = 0;

    while (swaps < N)
    {
        // Search for the next index that is not yet permuted.
        for (idx_cycle_search = next_scan_start;
             idx_cycle_search < N;
             ++idx_cycle_search)
            if (permute[idx_cycle_search] >= 0)
                break;

        next_scan_start = idx_cycle_search + 1;

        // This is a provable invariant.  In short, number of non-negative
        // elements in permute[] equals (N - swaps)
        assert( idx_cycle_search < N );

        // Completely permute one permutation cycle, 'following the
        // permutation cycle's trail'.  Over all cycles this is O(N).
        cycle_start = idx_cycle_search;
        while (permute[idx_cycle_search] >= 0)
        {
             next_idx = permute[idx_cycle_search];

             // The last link of a cycle points back to its start; by then the
             // current slot already holds the right object, so no swap is needed.
             if (next_idx != cycle_start)
                 swap( data[idx_cycle_search], data[next_idx] );
             swaps ++;

             permute[idx_cycle_search] = -1;
             // Also '= -next_idx - 1' could be used rather than '-1'
             // and would allow reversal of these changes to permute[] array
             idx_cycle_search = next_idx;
        }
    }
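
The reversible marking mentioned in the pseudo-code comments can be sketched in Python as follows. This is only an illustration under assumptions: the name apply_permutation_in_place is made up, the permutation is 0-based, and an entry v is marked as -v - 1 so that a final pass can restore the array.

```python
def apply_permutation_in_place(data, perm):
    """Rearrange data so that new data[k] == old data[perm[k]].

    perm (0-based) is used as scratch space for the "done" marks
    and is restored to its original contents before returning.
    """
    n = len(perm)
    for start in range(n):
        if perm[start] < 0:           # already handled as part of an earlier cycle
            continue
        j = start
        while True:
            nxt = perm[j]
            perm[j] = -nxt - 1        # mark slot j as done (reversible encoding)
            if nxt == start:          # cycle closed: slot j already holds the right object
                break
            data[j], data[nxt] = data[nxt], data[j]
            j = nxt
    for k in range(n):                # undo the marks
        perm[k] = -perm[k] - 1


objects = list("abcde")
order = [2, 4, 3, 0, 1]
apply_permutation_in_place(objects, order)
```

Apart from a few integers, the only extra storage is the sign of each permutation entry, which assumes the index type is signed.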

#3

Do you mean that you have an array of objects O[1..N] and then you have an array P[1..N] that contains a permutation of numbers 1..N and in the end you want to get an array O1 of objects such that O1[k] = O[P[k]] for all k=1..N ?

As an example, if your objects are the letters A,B,C,...,Y,Z and your array P is [26,25,24,...,2,1], is your desired output Z,Y,...,C,B,A?
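
To pin down that reading of the question, the relation O1[k] = O[P[k]] can be written as a plain out-of-place reference in Python. The helper name permuted_copy is hypothetical, and this version deliberately allocates a new list; it is only meant as a specification that an in-place algorithm can be checked against.

```python
def permuted_copy(objects, p_one_based):
    """Return O1 with O1[k] = O[P[k]], where P holds 1-based positions."""
    return [objects[p - 1] for p in p_one_based]


letters = [chr(ord("A") + i) for i in range(26)]   # A, B, ..., Z
p = list(range(26, 0, -1))                         # [26, 25, ..., 2, 1]
reversed_letters = permuted_copy(letters, p)
```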

If yes, I believe you can do it in linear time using only O(1) additional memory. Reversing elements of an array is a special case of this scenario. In general, I think you would need to consider decomposition of your permutation P into cycles and then use it to move around the elements of your original array O[].

If that's what you are looking for, I can elaborate more.

EDIT: Others already presented excellent solutions while I was sleeping, so no need to repeat it here. ^_^

EDIT: My O(1) additional space is indeed not entirely correct. I was thinking only about "data" elements, but in fact you also need to store one bit per permutation element, so if we are precise, we need O(n) extra bits for that. But most of the time using a sign bit (as suggested by J.F. Sebastian) is fine, so in practice we may not need anything more than we already have.

#4

If you didn't mind allocating memory for an extra hash of indexes, you could keep a mapping of original location to current location to get a time complexity of near O(n). Here's an example in Ruby, since it's readable and pseudocode-ish. (This could be shorter or more idiomatically Ruby-ish, but I've written it out for clarity.)

#!/usr/bin/ruby

objects       = ['d', 'e', 'a', 'c', 'b']
order         = [2, 4, 3, 0, 1]
cur_locations = {}

order.each_with_index do |orig_location, ordinality|
  # Find the current location of the item.
  cur_location = orig_location
  while not cur_locations[cur_location].nil? do
    cur_location = cur_locations[cur_location]
  end

  # Swap the items and keep track of whatever we swapped forward.
  objects[ordinality], objects[cur_location] = objects[cur_location], objects[ordinality]
  cur_locations[ordinality] = orig_location
end

puts objects.join(' ')

That obviously does involve some extra memory for the hash, but since it's just for indexes and not your "fairly large" objects, hopefully that's acceptable. Since hash lookups are O(1), even though there is a slight bump to the complexity due to the case where an item has been swapped forward more than once and you have to rewrite cur_location multiple times, the algorithm as a whole should be reasonably close to O(n).

If you wanted you could build a full hash of original to current positions ahead of time, or keep a reverse hash of current to original, and modify the algorithm a bit to get it down to strictly O(n). It'd be a little more complicated and take a little more space, so this is the version I wrote out, but the modifications shouldn't be difficult.
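
A possible shape for that modification, sketched in Python rather than Ruby: keep both a forward map (original index to current index) and a reverse map (current index to original index), so that no chain of lookups is ever followed. The names rearrange_with_position_maps, where and what are made up for this sketch, and plain lists are used instead of hashes.

```python
def rearrange_with_position_maps(objects, order):
    """Rearrange objects in place so that new objects[k] == old objects[order[k]]."""
    n = len(order)
    where = list(range(n))   # where[orig] = current index of the object that started at orig
    what = list(range(n))    # what[cur]   = original index of the object now at cur
    for target, orig in enumerate(order):
        cur = where[orig]
        if cur == target:
            continue                                  # already in place
        displaced = what[target]                      # object about to be moved out of target
        objects[target], objects[cur] = objects[cur], objects[target]
        what[target], what[cur] = orig, displaced     # keep both maps in sync with the swap
        where[orig], where[displaced] = target, cur


objects = ["d", "e", "a", "c", "b"]
rearrange_with_position_maps(objects, [2, 4, 3, 0, 1])
```

Each iteration does a constant amount of work, at the cost of two index arrays of size n.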

EDIT: Actually, I'm fairly certain the time complexity is just O(n), since each ordinality can have at most one hop associated, and thus the maximum number of lookups is limited to n.

#5

#!/usr/bin/env python

def rearrange(objects, permutation):
    """Rearrange `objects` inplace according to `permutation`.

       ``result = [objects[p] for p in permutation]``
    """
    seen = [False] * len(permutation)
    for i, already_seen in enumerate(seen):
        if not already_seen: # start permutation cycle
            first_obj, j = objects[i], i
            while True:
                seen[j] = True
                p = permutation[j]
                if p == i: # end permutation cycle
                    objects[j] = first_obj    # [old] p -> j
                    break
                objects[j], j = objects[p], p #       p -> j

The algorithm (as I've noticed after I wrote it) is the same as the one from @meriton's answer in Java.

Here's a test function for the code:

def test():
    import itertools
    N = 9
    for perm in itertools.permutations(range(N)):
        L = list(range(N))  # a real list, so it can be modified in place on Python 3
        LL = L[:]
        rearrange(L, perm)
        assert L == [LL[i] for i in perm] == list(perm), (L, list(perm), LL)

    # test whether assertions are enabled
    try:
        assert 0
    except AssertionError:
        pass
    else:
        raise RuntimeError("assertions must be enabled for the test")

if __name__ == "__main__":
    test()

#6

There's histogram sort, though its running time is given as a bit higher than O(N), namely O(N log log N).

#7

I can do it given O(N) scratch space -- copy to new array and copy back.

EDIT: I am aware of the existence of an algorithm that does this in place. The idea is to perform the swaps on the array of integers 1..N while at the same time mirroring the swaps on your array of large objects. I just cannot find the algorithm right now.
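
One algorithm that fits this description (an assumption, since the answer does not name one) sorts the index array back to the identity by swapping, and mirrors every swap on the objects. In this Python sketch, with the hypothetical name permute_by_swaps, perm[i] is read as the 0-based destination of the object currently at i, and perm is left as the identity afterwards.

```python
def permute_by_swaps(data, perm):
    """Move each object to its destination: new data[perm[i]] == old data[i].

    perm is consumed: when the function returns it is the identity.
    """
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            # Send the object at i to its destination j, mirroring the swap on perm.
            data[i], data[j] = data[j], data[i]
            perm[i], perm[j] = perm[j], perm[i]


items = list("abc")
destinations = [2, 0, 1]
permute_by_swaps(items, destinations)
```

Every swap puts at least one index into its final position, so the total number of swaps is below N even though the loops are nested.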

#8

The problem is one of applying a permutation in place with minimal O(1) extra storage: "in-situ permutation".

It is solvable, but an algorithm is not obvious beforehand.
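
For a feel of what such an algorithm can look like, here is a Python sketch of the "cycle leader" idea: a cycle is rotated only when the scan reaches its smallest index, so no marks are stored and the permutation is never modified. This is not necessarily the method from the references in this answer; the function name is hypothetical, the permutation is assumed 0-based, and the price for O(1) extra storage is that the running time can exceed O(N).

```python
def apply_permutation_readonly(data, perm):
    """Rearrange data so that new data[k] == old data[perm[k]], leaving perm untouched."""
    for start in range(len(perm)):
        # Walk the cycle until it returns to start or reaches a smaller index.
        j = perm[start]
        while j > start:
            j = perm[j]
        if j < start:
            continue                  # this cycle was rotated from its smaller leader
        # start is the smallest index of its cycle: rotate the cycle once.
        first = data[start]
        j = start
        while perm[j] != start:
            data[j] = data[perm[j]]
            j = perm[j]
        data[j] = first


values = list("abcde")
perm = [2, 4, 3, 0, 1]
apply_permutation_readonly(values, perm)
```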

It is described briefly as an exercise in Knuth, and for work I had to decipher it and figure out how it worked. Look at 5.2 #13.

For some more modern work on this problem, with pseudocode:

http://www.fernuni-hagen.de/imperia/md/content/fakultaetfuermathematikundinformatik/forschung/berichte/bericht_273.pdf

#9

I ended up writing a different algorithm for this, which first generates a list of swaps to apply an order and then runs through the swaps to apply it. The advantage is that if you're applying the ordering to multiple lists, you can reuse the swap list, since the swap algorithm is extremely simple.

#include <string>
#include <utility>
#include <vector>
using namespace std;

// order is taken by value on purpose: it is updated while the swaps are generated.
void make_swaps(vector<int> order, vector<pair<int,int>> &swaps)
{
    const int n = static_cast<int>(order.size());

    // order[0] is the index in the old list of the new list's first value.
    // Invert the mapping: inverse[0] is the index in the new list of the
    // old list's first value.
    vector<int> inverse(n);
    for(int i = 0; i < n; ++i)
        inverse[order[i]] = i;

    swaps.resize(0);

    for(int idx1 = 0; idx1 < n; ++idx1)
    {
        // Swap list[idx1] with list[order[idx1]], and record this swap.
        int idx2 = order[idx1];
        if(idx1 == idx2)
            continue;

        swaps.push_back(make_pair(idx1, idx2));

        // list[idx1] is now in the correct place, but whoever wanted the value we moved out
        // of idx1 now needs to look in its new position, idx2.
        int idx1_dep = inverse[idx1];
        order[idx1_dep] = idx2;
        inverse[idx2] = idx1_dep;
    }
}

template<typename T>
void run_swaps(T &data, const vector<pair<int,int>> &swaps)
{
    // data is taken by reference so that the caller's container is reordered.
    for(const auto &s: swaps)
    {
        int src = s.first;
        int dst = s.second;
        swap(data[src], data[dst]);
    }
}

void test()
{
    vector<int> order = { 2, 3, 1, 4, 0 };

    vector<pair<int,int>> swaps;
    make_swaps(order, swaps);

    vector<string> data = { "a", "b", "c", "d", "e" };
    run_swaps(data, swaps);
}
