在n个项目中寻找k个最小数字的算法。

时间:2022-09-06 17:24:20

I'm trying to write an algorithm which can print the k smallest numbers in an n-size-array in O(n) time, but I cannot reduce the time complexity to n. How can I do this?

我在写一个算法,它可以在O(n)时间内以n-size数组输出k个最小的数,但是我不能把时间复杂度减少到n。

11 个解决方案

#1


29  

I've done this in an interview before, and one of the most elegant/efficient ways to do this is

我以前在一次面试中做过这个,最优雅、最有效的方法之一就是

O(n log k). 
with space: O(k) (thanks, @Nzbuu)

Basically you're going to use a max-heap of size limited to k. For each item in the array, check to see if it's smaller than the max (only O(1)). If it is, put that in the heap (O(log k)) and remove the max. If its bigger, go to the next item.

基本上,你将使用一个大小限制为k的max-heap,对于数组中的每个条目,检查它是否小于max(只有O(1))。如果是,将其放入堆(O(log k))并删除最大值。如果大点,就去下一个项目。

Of course, the heap doesn't yield a sorted list of k items, but that can be done in O(k log k) which is easy.

当然,堆不会产生k项的排序列表,但是可以在O(k log k)中完成,这很简单。

Similarly, you can do the same for finding the largest k items, in which case you would use a min-heap.

类似地,对于查找最大的k项,您也可以这样做,在这种情况下,您将使用min-heap。

#2


23  

you will need to find the k'th smallest element using 'selection algorithm', which is O(n), and then iterate the array again and return each element which is smaller/equals it.
selection algorithm: http://en.wikipedia.org/wiki/Selection_algorithm
you will have to pay attention if you have repeats: you will need to make sure you are not returning more then k elements (it is possible if for instance you have 1,2,...,k,k,k,...)

EDIT :
the full algorithm, and returning a list, as requested: let the array be A

您将需要使用“选择算法”找到第k个最小的元素,即O(n),然后再次迭代数组并返回每个更小/等于它的元素。选择算法:http://en.wikipedia.org/wiki/Selection_algorithm如果你有重复,你必须注意:你需要确保你没有返回超过k个元素(如果你有1、2、……、k、k、k、……)编辑:完整的算法,并返回一个列表,按要求:让数组为a

 1. find the k'th element in A using 'selection algorithm', let it be 'z'
 2. initialize an empty list 'L'
 3. initialize counter<-0
 4. for each element in A: 
 4.1. if element < z: 
   4.1.1. counter<-counter + 1 ; L.add(element)
 5. for each element in A:
 5.1. if element == z AND count < k:
   5.1.1. counter<-counter + 1 ; L.add(element)
 6. return L

note here that a 3rd iteration is required if your list might have duplicates. if it can't - it is needless, just change the condition in 4.1 to <=.
also note: L.add is inserting an element to a linked list and thus is O(1).

注意,如果您的列表可能有重复,则需要进行第3次迭代。如果不能-这是不必要的,只需将4.1中的条件更改为<=。还要注意:L。add是向链表插入一个元素,因此是O(1)。

#3


4  

Assuming you're trying to show the K smallest numbers, you can use Hoare's Select algorithm to find the kth smallest number. That partitions the array into the smaller numbers, the kth number, and the larger numbers.

假设你想要显示K个最小的数,你可以使用Hoare的选择算法来找到第K个最小的数。将数组划分为较小的数字、第k个数字和较大的数字。

#4


1  

This can be done in expected linear time(O(n)). First find the kth smallest element of the array (using pivot partition method for finding kth order statistic) and then simply iterate through the loop to check which elements are less than the kth smallest element. Note that this works correctly only for distinct element.

这可以在预期线性时间(O(n)中完成。首先找到数组的第k个最小元素(使用pivot分区方法查找第k个顺序统计量),然后简单地遍历循环,检查哪些元素小于第k个最小元素。注意,这只适用于不同的元素。

Here is the code in c:

这是c中的代码:

    /*find the k smallest elements of an array in O(n) time. Using the Kth order 
statistic-random pivoting algorithm to find the kth smallest element and then looping 
through the array to find the elements smaller than kth smallest element.Assuming 
distinct elements*/


    #include <stdio.h>
    #include <math.h>
    #include <time.h>
    #define SIZE 10
    #define swap(X,Y) {int temp=X; X=Y; Y=temp;}


    int partition(int array[], int start, int end)
    {
        if(start==end)
            return start;
        if(start>end)
            return -1;
        int pos=end+1,j;
        for(j=start+1;j<=end;j++)
        {       
            if(array[j]<=array[start] && pos!=end+1)
            {
                swap(array[j],array[pos]);
                pos++;
            }
            else if(pos==end+1 && array[j]>array[start])
                pos=j;
        }
        pos--;
        swap(array[start], array[pos]);
        return pos;
    }

    int order_statistic(int array[], int start, int end, int k)
    {
        if(start>end || (end-start+1)<k)
            return -1;                   //return -1 
        int pivot=rand()%(end-start+1)+start, position, p;
        swap(array[pivot], array[start]);
        position=partition(array, start, end);
        p=position;
        position=position-start+1;                  //size of left partition
        if(k==position)
            return array[p];
        else if(k<position)
            return order_statistic(array, start,p-1,k);
        else
            return order_statistic(array,p+1,end,k-position);
    }


    void main()
    {
        srand((unsigned int)time(NULL));
        int i, array[SIZE],k;
        printf("Printing the array...\n");
        for(i=0;i<SIZE;i++)
            array[i]=abs(rand()%100), printf("%d ",array[i]);
        printf("\n\nk=");
        scanf("%d",&k);
        int k_small=order_statistic(array,0,SIZE-1,k);
        printf("\n\n");
        if(k_small==-1)
        {
            printf("Not possible\n");
            return ;
        }
        printf("\nk smallest elements...\n");
        for(i=0;i<SIZE;i++)
        {
            if(array[i]<=k_small)
                printf("%d ",array[i]);
        }
    }

#5


1  

It is possible to find the k smallest of n elements in O(n) time (by which I mean true O(n) time, not O(n + some function of k)). Refer to the Wikipedia article "Selection algorithm", especially the subsections on "unordered partial sorting" and "median selection as pivot strategy", and also to the article "Median of medians" for the essential piece that makes this O(n).

可以在O(n)时间内找到n个元素中k最小的元素(我指的是真实的O(n)时间,而不是O(n + k的某个函数))。参考*上“选择算法”的文章,特别是“无序部分排序”和“中位数选择作为主策略”的小节,以及“中位数中位数”的文章,找到构成这个O(n)的基本部分。

#6


1  

Best possible solution to the problem is as follows.Use Quick sort to find pivots and discard the part where this kth element doesn't lie, and recursively find the next pivot. (It's kth Max finder , You need to change the if else condition to make it kth Min Finder) .Here is the JavaScript code-

对这个问题最好的可能解决办法如下。使用快速排序查找支点,并丢弃第k个元素不存在的部分,然后递归地找到下一个支点。(它是kth Max查找程序,您需要更改if else条件才能使它成为kth Min查找程序)

  // Complexity is O(n log(n))
  var source = [9, 2, 7, 11, 1, 3, 14, 22];

  var kthMax = function(minInd, MaxInd, kth) {
      // pivotInd stores the pivot position 
      // for current iteration
      var temp, pivotInd = minInd;
      if (minInd >= MaxInd) {
        return source[pivotInd];
      }

      for (var i = minInd; i < MaxInd; i++) {
        //If an element is greater than chosen pivot (i.e. last element)
        //Swap it with pivotPointer element. then increase ponter
        if (source[i] > source[MaxInd]) {
          temp = source[i];
          source[i] = source[pivotInd];
          source[pivotInd] = temp;
          pivotInd++;
        }
      }
      // we have found position for pivot elem. 
      // swap it to that position place .
      temp = source[pivotInd];
      source[pivotInd] = source[MaxInd];
      source[MaxInd] = temp;

      // Only try to sort the part in which kth index lies.
      if (kth > pivotInd) {
        return kthMax(pivotInd + 1, MaxInd, kth);
      } else if (kth < pivotInd) {
        return kthMax(minInd, pivotInd - 1, kth);
      } else {
        return source[pivotInd];
      }

    }
    // last argument is kth-1 , so if give 2 it will give you,
    // 3rd max which is 11

  console.log(kthMax(0, source.length - 1, 2));

#7


1  

I know not exactly what you are looking for but pretty simple O(n * k) time and O(k) space. This is the biggest K so would need to flop it around.

我不知道你在找什么,但是很简单的O(n * k)时间和O(k)空间。这是最大的K,所以需要翻来翻去。

For the brute for min of k (result) can substitute a heap

对于原始的最小k(结果)可以替换一个堆

private int[] FindKBiggestNumbersM(int[] testArray, int k)
{
    int[] result = new int[k];
    int indexMin = 0;
    result[indexMin] = testArray[0];
    int min = result[indexMin];

    for (int i = 1; i < testArray.Length; i++)
    {
        if(i < k)
        {
            result[i] = testArray[i];
            if (result[i] < min)
            {
                min = result[i];
                indexMin = i;
            }
        }
        else if (testArray[i] > min)
        {
            result[indexMin] = testArray[i];
            min = result[indexMin];
            for (int r = 0; r < k; r++)
            {
                if (result[r] < min)
                {
                    min = result[r];
                    indexMin = r;
                }
            }
        }
    }
    return result;
}

#8


0  

Just sort the array with Merge Sort and then print the first k number, it will take n*log2(n) in the worst case.

用归并排序对数组进行排序,然后打印第一个k号,最坏情况下需要n*log2(n)。

#9


0  

How about using a Heap to store the values. This cost is n when you go through each value in the array.

如何使用堆来存储值。当你遍历数组中的每个值时,这个代价是n。

Then go through the Heap to get the smallest k values.

然后遍历堆得到最小的k值。

Runtime is O(n) + O(k) = O(n)

运行时为O(n) + O(k) = O(n)

Of course, memory space is now O(n + n)

当然,内存空间现在是O(n + n)

#10


0  

As mentioned, there are two ways to accomplish such task:

如前所述,完成这一任务有两种方法:

1) You can sort the whole array of n elements with quicksort, heapsort or any O (n log n) sorting algorithm you want, and then pick the m smallest values in your array. This method will work in O(n log n).

1)可以使用快速排序、heapsort或任何O (n log n)排序算法对n个元素的整个数组进行排序,然后在数组中选择m个最小值。这个方法在O(n log n)中成立。

2) You can use selection algorithm to fink m smallest elements in your array. It will take O(n) time to find the kth smallest value, since you will iterate this algorithm m times, the overall time will be m x O(n) = O(n) .

2)可以使用选择算法查找数组中最小的元素。需要O(n)时间才能找到第k个最小值,因为要迭代这个算法m次,总时间将是m (n) = O(n)。

#11


0  

Another Technique- Use QuickSelect Algorithm and the result would be all the elements to the left of the returned result. The average time complexity would be O(n) and in worst case it would be O(n^2). The space complexity would be O(1).

另一种技术——使用QuickSelect算法,结果将是返回结果左边的所有元素。平均时间复杂度是O(n),在最糟糕的情况会是O(n ^ 2)。空间复杂度为O(1)。

#1


29  

I've done this in an interview before, and one of the most elegant/efficient ways to do this is

我以前在一次面试中做过这个,最优雅、最有效的方法之一就是

O(n log k). 
with space: O(k) (thanks, @Nzbuu)

Basically you're going to use a max-heap of size limited to k. For each item in the array, check to see if it's smaller than the max (only O(1)). If it is, put that in the heap (O(log k)) and remove the max. If its bigger, go to the next item.

基本上,你将使用一个大小限制为k的max-heap,对于数组中的每个条目,检查它是否小于max(只有O(1))。如果是,将其放入堆(O(log k))并删除最大值。如果大点,就去下一个项目。

Of course, the heap doesn't yield a sorted list of k items, but that can be done in O(k log k) which is easy.

当然,堆不会产生k项的排序列表,但是可以在O(k log k)中完成,这很简单。

Similarly, you can do the same for finding the largest k items, in which case you would use a min-heap.

类似地,对于查找最大的k项,您也可以这样做,在这种情况下,您将使用min-heap。

#2


23  

you will need to find the k'th smallest element using 'selection algorithm', which is O(n), and then iterate the array again and return each element which is smaller/equals it.
selection algorithm: http://en.wikipedia.org/wiki/Selection_algorithm
you will have to pay attention if you have repeats: you will need to make sure you are not returning more then k elements (it is possible if for instance you have 1,2,...,k,k,k,...)

EDIT :
the full algorithm, and returning a list, as requested: let the array be A

您将需要使用“选择算法”找到第k个最小的元素,即O(n),然后再次迭代数组并返回每个更小/等于它的元素。选择算法:http://en.wikipedia.org/wiki/Selection_algorithm如果你有重复,你必须注意:你需要确保你没有返回超过k个元素(如果你有1、2、……、k、k、k、……)编辑:完整的算法,并返回一个列表,按要求:让数组为a

 1. find the k'th element in A using 'selection algorithm', let it be 'z'
 2. initialize an empty list 'L'
 3. initialize counter<-0
 4. for each element in A: 
 4.1. if element < z: 
   4.1.1. counter<-counter + 1 ; L.add(element)
 5. for each element in A:
 5.1. if element == z AND count < k:
   5.1.1. counter<-counter + 1 ; L.add(element)
 6. return L

note here that a 3rd iteration is required if your list might have duplicates. if it can't - it is needless, just change the condition in 4.1 to <=.
also note: L.add is inserting an element to a linked list and thus is O(1).

注意,如果您的列表可能有重复,则需要进行第3次迭代。如果不能-这是不必要的,只需将4.1中的条件更改为<=。还要注意:L。add是向链表插入一个元素,因此是O(1)。

#3


4  

Assuming you're trying to show the K smallest numbers, you can use Hoare's Select algorithm to find the kth smallest number. That partitions the array into the smaller numbers, the kth number, and the larger numbers.

假设你想要显示K个最小的数,你可以使用Hoare的选择算法来找到第K个最小的数。将数组划分为较小的数字、第k个数字和较大的数字。

#4


1  

This can be done in expected linear time(O(n)). First find the kth smallest element of the array (using pivot partition method for finding kth order statistic) and then simply iterate through the loop to check which elements are less than the kth smallest element. Note that this works correctly only for distinct element.

这可以在预期线性时间(O(n)中完成。首先找到数组的第k个最小元素(使用pivot分区方法查找第k个顺序统计量),然后简单地遍历循环,检查哪些元素小于第k个最小元素。注意,这只适用于不同的元素。

Here is the code in c:

这是c中的代码:

    /*find the k smallest elements of an array in O(n) time. Using the Kth order 
statistic-random pivoting algorithm to find the kth smallest element and then looping 
through the array to find the elements smaller than kth smallest element.Assuming 
distinct elements*/


    #include <stdio.h>
    #include <math.h>
    #include <time.h>
    #define SIZE 10
    #define swap(X,Y) {int temp=X; X=Y; Y=temp;}


    int partition(int array[], int start, int end)
    {
        if(start==end)
            return start;
        if(start>end)
            return -1;
        int pos=end+1,j;
        for(j=start+1;j<=end;j++)
        {       
            if(array[j]<=array[start] && pos!=end+1)
            {
                swap(array[j],array[pos]);
                pos++;
            }
            else if(pos==end+1 && array[j]>array[start])
                pos=j;
        }
        pos--;
        swap(array[start], array[pos]);
        return pos;
    }

    int order_statistic(int array[], int start, int end, int k)
    {
        if(start>end || (end-start+1)<k)
            return -1;                   //return -1 
        int pivot=rand()%(end-start+1)+start, position, p;
        swap(array[pivot], array[start]);
        position=partition(array, start, end);
        p=position;
        position=position-start+1;                  //size of left partition
        if(k==position)
            return array[p];
        else if(k<position)
            return order_statistic(array, start,p-1,k);
        else
            return order_statistic(array,p+1,end,k-position);
    }


    void main()
    {
        srand((unsigned int)time(NULL));
        int i, array[SIZE],k;
        printf("Printing the array...\n");
        for(i=0;i<SIZE;i++)
            array[i]=abs(rand()%100), printf("%d ",array[i]);
        printf("\n\nk=");
        scanf("%d",&k);
        int k_small=order_statistic(array,0,SIZE-1,k);
        printf("\n\n");
        if(k_small==-1)
        {
            printf("Not possible\n");
            return ;
        }
        printf("\nk smallest elements...\n");
        for(i=0;i<SIZE;i++)
        {
            if(array[i]<=k_small)
                printf("%d ",array[i]);
        }
    }

#5


1  

It is possible to find the k smallest of n elements in O(n) time (by which I mean true O(n) time, not O(n + some function of k)). Refer to the Wikipedia article "Selection algorithm", especially the subsections on "unordered partial sorting" and "median selection as pivot strategy", and also to the article "Median of medians" for the essential piece that makes this O(n).

可以在O(n)时间内找到n个元素中k最小的元素(我指的是真实的O(n)时间,而不是O(n + k的某个函数))。参考*上“选择算法”的文章,特别是“无序部分排序”和“中位数选择作为主策略”的小节,以及“中位数中位数”的文章,找到构成这个O(n)的基本部分。

#6


1  

Best possible solution to the problem is as follows.Use Quick sort to find pivots and discard the part where this kth element doesn't lie, and recursively find the next pivot. (It's kth Max finder , You need to change the if else condition to make it kth Min Finder) .Here is the JavaScript code-

对这个问题最好的可能解决办法如下。使用快速排序查找支点,并丢弃第k个元素不存在的部分,然后递归地找到下一个支点。(它是kth Max查找程序,您需要更改if else条件才能使它成为kth Min查找程序)

  // Complexity is O(n log(n))
  var source = [9, 2, 7, 11, 1, 3, 14, 22];

  var kthMax = function(minInd, MaxInd, kth) {
      // pivotInd stores the pivot position 
      // for current iteration
      var temp, pivotInd = minInd;
      if (minInd >= MaxInd) {
        return source[pivotInd];
      }

      for (var i = minInd; i < MaxInd; i++) {
        //If an element is greater than chosen pivot (i.e. last element)
        //Swap it with pivotPointer element. then increase ponter
        if (source[i] > source[MaxInd]) {
          temp = source[i];
          source[i] = source[pivotInd];
          source[pivotInd] = temp;
          pivotInd++;
        }
      }
      // we have found position for pivot elem. 
      // swap it to that position place .
      temp = source[pivotInd];
      source[pivotInd] = source[MaxInd];
      source[MaxInd] = temp;

      // Only try to sort the part in which kth index lies.
      if (kth > pivotInd) {
        return kthMax(pivotInd + 1, MaxInd, kth);
      } else if (kth < pivotInd) {
        return kthMax(minInd, pivotInd - 1, kth);
      } else {
        return source[pivotInd];
      }

    }
    // last argument is kth-1 , so if give 2 it will give you,
    // 3rd max which is 11

  console.log(kthMax(0, source.length - 1, 2));

#7


1  

I know not exactly what you are looking for but pretty simple O(n * k) time and O(k) space. This is the biggest K so would need to flop it around.

我不知道你在找什么,但是很简单的O(n * k)时间和O(k)空间。这是最大的K,所以需要翻来翻去。

For the brute for min of k (result) can substitute a heap

对于原始的最小k(结果)可以替换一个堆

private int[] FindKBiggestNumbersM(int[] testArray, int k)
{
    int[] result = new int[k];
    int indexMin = 0;
    result[indexMin] = testArray[0];
    int min = result[indexMin];

    for (int i = 1; i < testArray.Length; i++)
    {
        if(i < k)
        {
            result[i] = testArray[i];
            if (result[i] < min)
            {
                min = result[i];
                indexMin = i;
            }
        }
        else if (testArray[i] > min)
        {
            result[indexMin] = testArray[i];
            min = result[indexMin];
            for (int r = 0; r < k; r++)
            {
                if (result[r] < min)
                {
                    min = result[r];
                    indexMin = r;
                }
            }
        }
    }
    return result;
}

#8


0  

Just sort the array with Merge Sort and then print the first k number, it will take n*log2(n) in the worst case.

用归并排序对数组进行排序,然后打印第一个k号,最坏情况下需要n*log2(n)。

#9


0  

How about using a Heap to store the values. This cost is n when you go through each value in the array.

如何使用堆来存储值。当你遍历数组中的每个值时,这个代价是n。

Then go through the Heap to get the smallest k values.

然后遍历堆得到最小的k值。

Runtime is O(n) + O(k) = O(n)

运行时为O(n) + O(k) = O(n)

Of course, memory space is now O(n + n)

当然,内存空间现在是O(n + n)

#10


0  

As mentioned, there are two ways to accomplish such task:

如前所述,完成这一任务有两种方法:

1) You can sort the whole array of n elements with quicksort, heapsort or any O (n log n) sorting algorithm you want, and then pick the m smallest values in your array. This method will work in O(n log n).

1)可以使用快速排序、heapsort或任何O (n log n)排序算法对n个元素的整个数组进行排序,然后在数组中选择m个最小值。这个方法在O(n log n)中成立。

2) You can use selection algorithm to fink m smallest elements in your array. It will take O(n) time to find the kth smallest value, since you will iterate this algorithm m times, the overall time will be m x O(n) = O(n) .

2)可以使用选择算法查找数组中最小的元素。需要O(n)时间才能找到第k个最小值,因为要迭代这个算法m次,总时间将是m (n) = O(n)。

#11


0  

Another Technique- Use QuickSelect Algorithm and the result would be all the elements to the left of the returned result. The average time complexity would be O(n) and in worst case it would be O(n^2). The space complexity would be O(1).

另一种技术——使用QuickSelect算法,结果将是返回结果左边的所有元素。平均时间复杂度是O(n),在最糟糕的情况会是O(n ^ 2)。空间复杂度为O(1)。