Quicksort算法,需要一些小的澄清

时间:2021-06-02 11:44:06

I'm studying basic array sorting and am struggling just a little bit to fully understand it's logic. I understand the recursion, meaning splitting an array to two arrays on each side of a pivot, and then continuing partitioning each of these sub-arrays until an array of just one element is reached. What I don't always completely understand is the implementation of the while loop itself.

我正在研究基本的数组排序,并且只是为了完全理解它的逻辑而苦苦挣扎。我理解递归,意味着将数组拆分到数据库每一侧的两个数组,然后继续对每个子数组进行分区,直到只到达一个元素的数组。我并不总是完全理解的是while循环本身的实现。

http://www.vogella.com/tutorials/JavaAlgorithmsQuicksort/article.html

Here I've encountered an implementation that chooses the middle element as the pivot. I understand the pivot can be any element, whether it's the first, last, any random element or an element specifically chosen for maximum efficiency. With the middle element as pivot for some reason I find it more intuitive to understand.

在这里,我遇到了一个选择中间元素作为数据透视的实现。我理解枢轴可以是任何元素,无论是第一个,最后一个,任何随机元素还是为最大效率而专门选择的元素。由于某种原因,将中间元素作为枢轴,我发现它更直观易懂。

  while (i <= j) {
        // If the current value from the left list is smaller than the pivot
        // element then get the next element from the left list
        while (numbers[i] < pivot) {
            i++;
        }
        // If the current value from the right list is larger than the pivot
        // element then get the next element from the right list
        while (numbers[j] > pivot) {
            j--;
        }

        // If we have found a value in the left list which is larger than
        // the pivot element and if we have found a value in the right list
        // which is smaller than the pivot element then we exchange the
        // values.
        // As we are done we can increase i and j
        if (i <= j) {
            exchange(i, j);
            i++;
            j--;
        }
    }

This is the relevant part.

这是相关部分。

1) Why do we increase i and decrease j only if the i element is smaller than the pivot or the j element is larger than the pivot? Why not also when it's equal? If an element is equal to the pivot, it's fine where it is, since it doesn't matter which side of the pivot it ends up in, so why can't we move on increasing/decreasing? I've tried doing it by the way and as I expected the end result wasn't sorted and I didn't understand what went wrong even after debugging step by step.

1)为什么只有当i元素小于pivot或j元素大于pivot时,我们才增加i并减少j?为什么不平等呢?如果一个元素等于枢轴,那么它就好了,因为它最终位于枢轴的哪一侧并不重要,那么为什么我们不能继续增加/减少呢?我试过这样做,因为我预计最终结果没有排序,即使经过一步一步的调试,我也不明白出了什么问题。

2) Am I right in saying that, excluding edge cases, upon exiting the external loop, i = j +1? Always? And that one of these elements, i or j, has the value of the pivot we used? But which one and why?

2)我是否正确地说,除了边缘情况,在退出外部循环时,i = j +1?总是?而这些元素之一,i或j,是否具有我们使用的支点的价值?但是哪一个和为什么?

// Recursion
    if (low < j)
        quicksort(low, j);
    if (i < high)
        quicksort(i, high);

3) What I would have expected here, instead of passing on low to j and i to high, is: assuming pivotIndex is the index of the pivot after the loop ends,

3)我在这里期望的是,假设pivotIndex是循环结束后的pivot的索引,而不是传递低到j和i到高。

quicksort(low, pivotIndex - 1);
quicksort(pivotIndex + 1, high);

because pivotIndex is right where the pivot should be, it's his final position. So I would love an explanation for that too, please.

因为pivotIndex正好在枢轴所在的位置,所以这是他的最终位置。所以我也很乐意为此做一个解释。

Thank you.

3 个解决方案

#1


1  

Before getting into the corner cases, let's take a look at the normal operation of the algorithm. A sample starting array is shown below. The pivot value is 33. The indexes i and j are shown above the array.

在进入角落的情况之前,让我们来看看算法的正常运行。样本起始数组如下所示。数据透视值为33.索引i和j显示在数组上方。

Quicksort算法,需要一些小的澄清

The while loops update i and j. i moves forward to the first value greater than the pivot. The values in green (25 and 8) are less than the pivot, and are in their final positions. The value in red (49) is greater than the pivot, and is in its final position.

while循环更新i和j。我向前移动到大于枢轴的第一个值。绿色(25和8)中的值小于枢轴,并且处于最终位置。红色(49)中的值大于枢轴,并且处于其最终位置。

Quicksort算法,需要一些小的澄清

Now, the numbers 51 and 14 are swapped, and i gets incremented, j gets decremented.

现在,数字51和14被交换,并且我增加,j减少。

Quicksort算法,需要一些小的澄清

The while loops update i and j. Note that j didn't move.

while循环更新i和j。请注意,j没有移动。

Quicksort算法,需要一些小的澄清

After the exchange (pivot values shown in yellow)

交换后(枢轴值显示为黄色)

Quicksort算法,需要一些小的澄清

One more time through the while loops and i has moved past j which completes the partitioning. You can see that all of the values from i to the end of the array are either red (greater than the pivot) or yellow (the pivot). And all value from j to the beginning of the array are either green (less than the pivot) or yellow.

再过一次while循环,我已经移过j完成分区。您可以看到从i到数组末尾的所有值都是红色(大于枢轴)或黄色(枢轴)。从j到数组开头的所有值都是绿色(小于枢轴)或黄色。

Quicksort算法,需要一些小的澄清

So in this case i=j+1 and one of the indexes (i) points to the pivot.

所以在这种情况下,i = j + 1,其中一个索引(i)指向枢轴。

Now let's look at an array where i doesn't end up equal to j+1, and neither i nor j points to the pivot value. The starting array (using 33 as the pivot again):

现在让我们看一个数组,其中我最终不会等于j + 1,并且i和j都不指向数据透视值。起始数组(再次使用33作为枢轴):

Quicksort算法,需要一些小的澄清

After the while loops:

在while循环之后:

Quicksort算法,需要一些小的澄清

After swapping 2 with 50:

用50交换2后:

Quicksort算法,需要一些小的澄清

After the while loops:

在while循环之后:

Quicksort算法,需要一些小的澄清

Notice that both i and j stopped at the pivot value. So after the exchange, the situation looks like this:

请注意,i和j都停在了pivot值。所以在交换之后,情况看起来像这样:

Quicksort算法,需要一些小的澄清

I believe that answers questions 2 and 3. It is not always true that i=j+1, and it is not always true that one of the indexes points to the pivot. Also, the algorithm already excludes the pivot from the recursive calls in some cases.

我相信回答问题2和3.我不总是i = j + 1,并且其中一个索引指向枢轴并不总是正确的。此外,在某些情况下,算法已经从递归调用中排除了数据透视。

You could modify the algorithm to exclude the pivot in more cases, but that seems risky and messy. And there's not much benefit unless the array size is fairly small. In production qsort implementations, the algorithm switches to a different method (e.g. selection sort) when the array sizes get small.

您可以修改算法以在更多情况下排除枢轴,但这似乎有风险和混乱。除非阵列尺寸相当小,否则没有太大的好处。在生产qsort实现中,当阵列大小变小时,算法切换到不同的方法(例如,选择排序)。

On to question 1. What happens if the while loops are modified to allow values equal to the pivot? The answer is that one of the indexes can run off the end of the array. Here's an example (using 33 as the pivot):

问题1.如果修改while循环以允许值等于pivot,会发生什么?答案是其中一个索引可以在数组末尾运行。这是一个例子(使用33作为枢轴):

Quicksort算法,需要一些小的澄清

All of the values are greater than or equal to the pivot, so the j index is going to run right off the end of the array:

所有值都大于或等于pivot,因此j索引将在数组末尾运行:

Quicksort算法,需要一些小的澄清

Depending on what's in memory ahead of the array, the j index could keep on going for a very long time, and even crash the program. This could be fixed by comparing j with zero
while(j>0 && numbers[j] >= pivot) but that extra check slows down the algorithm. The amount of time that you save by not stopping at the pivot is not nearly as much as the amount of time wasted doing the extra check in the while loops.

根据数组前面的内存,j索引可能会持续很长时间,甚至会使程序崩溃。这可以通过将j与零进行比较来确定(j> 0 && numbers [j]> = pivot)但是额外的检查会减慢算法的速度。通过不停在枢轴处节省的时间量几乎不会浪费在while循环中执行额外检查所浪费的时间。

#2


1  

  1. Let's consider your suggestion, step by step:
    Base conditions:
    arr = {1, 20, 3, 4, 20, 70, 80, 90, 100}
    i->1 j->100 pivot = 20 (the 2nd one)
    Iteration 1:
    while (numbers[i] <= pivot)... => i->70
    while (numbers[j] >= pivot)... => j->20 (2nd one)
    if (i <= j)... => false
    Iteration 2:
    while (i <= j) => false
    So this makes the array already sorted. (which is not true).

    While in the current algorithm, in the 1st iteration, i -> 20 (1st one) and j -> 20 (2nd one), and a swap occurs. What matters here is not the swap itself but the fact that i did not become greater than j on the 1st iteration
  2. 让我们一步一步考虑你的建议:基本条件:arr = {1,20,3,4,20,70,80,90,100} i-> 1 j-> 100 pivot = 20(第二个)迭代1:while(numbers [i] <= pivot)... => i-> 70 while(numbers [j]> = pivot)... => j-> 20(2nd one)if(i <= j )... => false迭代2:while(i <= j)=> false因此这使得数组已经排序。 (这不是真的)。而在当前算法中,在第一次迭代中,i - > 20(第一个)和j - > 20(第二个),并且发生交换。这里重要的不是交换本身,而是我在第一次迭代中没有变得大于j的事实

  3. What if i=j in the if statment? that would execute both i++ and j-- making the difference between them 2. if that case does not occur then yes i=j+1 and you can't tell if it's i or j equals to pivot because then it would depend on the location of the pivot in the array.
  4. 如果在if语句中i = j怎么办?这将执行i ++和j--使它们之间产生差异2.如果那种情况没有发生那么是i = j + 1并且你无法判断它是i还是j等于pivot,因为它将取决于数组中枢轴的位置。

  5. This is exactly what's happening here. Upon exiting the external loop, i will be equal to pivotIndex + 1 and j will be equal to pivotIndex - 1; either i or j will also include pivotIndex itself. You can work it out step by step using pen and paper and see what's going on.
  6. 这正是这里发生的事情。退出外部循环后,我将等于pivotIndex + 1,j将等于pivotIndex - 1; i或j也将包括pivotIndex本身。您可以使用笔和纸一步一步地完成它,看看发生了什么。

Hope that helped

希望有所帮助

#3


0  

  1. If you include equality in the first loop, it will not stop if no elements are larger than pivot. Using strict inequalities is an easy way to ensure the algorithm stops.
  2. 如果在第一个循环中包含相等性,则如果没有元素大于pivot,则不会停止。使用严格的不等式是确保算法停止的简单方法。

#1


1  

Before getting into the corner cases, let's take a look at the normal operation of the algorithm. A sample starting array is shown below. The pivot value is 33. The indexes i and j are shown above the array.

在进入角落的情况之前,让我们来看看算法的正常运行。样本起始数组如下所示。数据透视值为33.索引i和j显示在数组上方。

Quicksort算法,需要一些小的澄清

The while loops update i and j. i moves forward to the first value greater than the pivot. The values in green (25 and 8) are less than the pivot, and are in their final positions. The value in red (49) is greater than the pivot, and is in its final position.

while循环更新i和j。我向前移动到大于枢轴的第一个值。绿色(25和8)中的值小于枢轴,并且处于最终位置。红色(49)中的值大于枢轴,并且处于其最终位置。

Quicksort算法,需要一些小的澄清

Now, the numbers 51 and 14 are swapped, and i gets incremented, j gets decremented.

现在,数字51和14被交换,并且我增加,j减少。

Quicksort算法,需要一些小的澄清

The while loops update i and j. Note that j didn't move.

while循环更新i和j。请注意,j没有移动。

Quicksort算法,需要一些小的澄清

After the exchange (pivot values shown in yellow)

交换后(枢轴值显示为黄色)

Quicksort算法,需要一些小的澄清

One more time through the while loops and i has moved past j which completes the partitioning. You can see that all of the values from i to the end of the array are either red (greater than the pivot) or yellow (the pivot). And all value from j to the beginning of the array are either green (less than the pivot) or yellow.

再过一次while循环,我已经移过j完成分区。您可以看到从i到数组末尾的所有值都是红色(大于枢轴)或黄色(枢轴)。从j到数组开头的所有值都是绿色(小于枢轴)或黄色。

Quicksort算法,需要一些小的澄清

So in this case i=j+1 and one of the indexes (i) points to the pivot.

所以在这种情况下,i = j + 1,其中一个索引(i)指向枢轴。

Now let's look at an array where i doesn't end up equal to j+1, and neither i nor j points to the pivot value. The starting array (using 33 as the pivot again):

现在让我们看一个数组,其中我最终不会等于j + 1,并且i和j都不指向数据透视值。起始数组(再次使用33作为枢轴):

Quicksort算法,需要一些小的澄清

After the while loops:

在while循环之后:

Quicksort算法,需要一些小的澄清

After swapping 2 with 50:

用50交换2后:

Quicksort算法,需要一些小的澄清

After the while loops:

在while循环之后:

Quicksort算法,需要一些小的澄清

Notice that both i and j stopped at the pivot value. So after the exchange, the situation looks like this:

请注意,i和j都停在了pivot值。所以在交换之后,情况看起来像这样:

Quicksort算法,需要一些小的澄清

I believe that answers questions 2 and 3. It is not always true that i=j+1, and it is not always true that one of the indexes points to the pivot. Also, the algorithm already excludes the pivot from the recursive calls in some cases.

我相信回答问题2和3.我不总是i = j + 1,并且其中一个索引指向枢轴并不总是正确的。此外,在某些情况下,算法已经从递归调用中排除了数据透视。

You could modify the algorithm to exclude the pivot in more cases, but that seems risky and messy. And there's not much benefit unless the array size is fairly small. In production qsort implementations, the algorithm switches to a different method (e.g. selection sort) when the array sizes get small.

您可以修改算法以在更多情况下排除枢轴,但这似乎有风险和混乱。除非阵列尺寸相当小,否则没有太大的好处。在生产qsort实现中,当阵列大小变小时,算法切换到不同的方法(例如,选择排序)。

On to question 1. What happens if the while loops are modified to allow values equal to the pivot? The answer is that one of the indexes can run off the end of the array. Here's an example (using 33 as the pivot):

问题1.如果修改while循环以允许值等于pivot,会发生什么?答案是其中一个索引可以在数组末尾运行。这是一个例子(使用33作为枢轴):

Quicksort算法,需要一些小的澄清

All of the values are greater than or equal to the pivot, so the j index is going to run right off the end of the array:

所有值都大于或等于pivot,因此j索引将在数组末尾运行:

Quicksort算法,需要一些小的澄清

Depending on what's in memory ahead of the array, the j index could keep on going for a very long time, and even crash the program. This could be fixed by comparing j with zero
while(j>0 && numbers[j] >= pivot) but that extra check slows down the algorithm. The amount of time that you save by not stopping at the pivot is not nearly as much as the amount of time wasted doing the extra check in the while loops.

根据数组前面的内存,j索引可能会持续很长时间,甚至会使程序崩溃。这可以通过将j与零进行比较来确定(j> 0 && numbers [j]> = pivot)但是额外的检查会减慢算法的速度。通过不停在枢轴处节省的时间量几乎不会浪费在while循环中执行额外检查所浪费的时间。

#2


1  

  1. Let's consider your suggestion, step by step:
    Base conditions:
    arr = {1, 20, 3, 4, 20, 70, 80, 90, 100}
    i->1 j->100 pivot = 20 (the 2nd one)
    Iteration 1:
    while (numbers[i] <= pivot)... => i->70
    while (numbers[j] >= pivot)... => j->20 (2nd one)
    if (i <= j)... => false
    Iteration 2:
    while (i <= j) => false
    So this makes the array already sorted. (which is not true).

    While in the current algorithm, in the 1st iteration, i -> 20 (1st one) and j -> 20 (2nd one), and a swap occurs. What matters here is not the swap itself but the fact that i did not become greater than j on the 1st iteration
  2. 让我们一步一步考虑你的建议:基本条件:arr = {1,20,3,4,20,70,80,90,100} i-> 1 j-> 100 pivot = 20(第二个)迭代1:while(numbers [i] <= pivot)... => i-> 70 while(numbers [j]> = pivot)... => j-> 20(2nd one)if(i <= j )... => false迭代2:while(i <= j)=> false因此这使得数组已经排序。 (这不是真的)。而在当前算法中,在第一次迭代中,i - > 20(第一个)和j - > 20(第二个),并且发生交换。这里重要的不是交换本身,而是我在第一次迭代中没有变得大于j的事实

  3. What if i=j in the if statment? that would execute both i++ and j-- making the difference between them 2. if that case does not occur then yes i=j+1 and you can't tell if it's i or j equals to pivot because then it would depend on the location of the pivot in the array.
  4. 如果在if语句中i = j怎么办?这将执行i ++和j--使它们之间产生差异2.如果那种情况没有发生那么是i = j + 1并且你无法判断它是i还是j等于pivot,因为它将取决于数组中枢轴的位置。

  5. This is exactly what's happening here. Upon exiting the external loop, i will be equal to pivotIndex + 1 and j will be equal to pivotIndex - 1; either i or j will also include pivotIndex itself. You can work it out step by step using pen and paper and see what's going on.
  6. 这正是这里发生的事情。退出外部循环后,我将等于pivotIndex + 1,j将等于pivotIndex - 1; i或j也将包括pivotIndex本身。您可以使用笔和纸一步一步地完成它,看看发生了什么。

Hope that helped

希望有所帮助

#3


0  

  1. If you include equality in the first loop, it will not stop if no elements are larger than pivot. Using strict inequalities is an easy way to ensure the algorithm stops.
  2. 如果在第一个循环中包含相等性,则如果没有元素大于pivot,则不会停止。使用严格的不等式是确保算法停止的简单方法。