找出所需元素的最小数目,使它们的和等于或超过S

时间:2020-12-10 21:46:39

I know this can be done by sorting the array and taking the larger numbers until the required condition is met. That would take at least nlog(n) sorting time.

我知道这可以通过排序数组和取较大的数字来完成,直到满足所需条件。这至少需要nlog(n)的排序时间。

Is there any improvement over nlog(n).

nlog(n)有什么改进吗?

We can assume all numbers are positive.

我们可以假设所有的数都是正数。

5 个解决方案

#1


8  

Here is an algorithm that is O(n + size(smallest subset) * log(n)). If the smallest subset is much smaller than the array, this will be O(n).

这是一个O(n + size(最小子集)* log(n)的算法。如果最小的子集比数组小得多,那么这个就是O(n)

Read http://en.wikipedia.org/wiki/Heap_%28data_structure%29 if my description of the algorithm is unclear (it is light on details, but the details are all there).

如果我对算法的描述不清楚,请阅读http://en.wikipedia.org/wiki/Heap_%28data_structure%29(这是对细节的描述,但细节都在那里)。

  1. Turn the array into a heap arranged such that the biggest element is available in time O(n).
  2. 将数组转换成堆,以便在时间O(n)中可以使用最大的元素。
  3. Repeatedly extract the biggest element from the heap until their sum is large enough. This takes O(size(smallest subset) * log(n)).
  4. 重复地从堆中提取最大的元素,直到它们的和足够大。这需要O(大小(最小子集)* log(n))。

This is almost certainly the answer they were hoping for, though not getting it shouldn't be a deal breaker.

这几乎肯定是他们所希望的答案,尽管没有得到它不应该是一个破坏协议。

Edit: Here is another variant that is often faster, but can be slower.

编辑:这是另一个版本,它通常更快,但也可能更慢。

Walk through elements, until the sum of the first few exceeds S.  Store current_sum.
Copy those elements into an array.
Heapify that array such that the minimum is easy to find, remember the minimum.
For each remaining element in the main array:
    if min(in our heap) < element:
        insert element into heap
        increase current_sum by element
        while S + min(in our heap) < current_sum:
            current_sum -= min(in our heap)
            remove min from heap

If we get to reject most of the array without manipulating our heap, this can be up to twice as fast as the previous solution. But it is also possible to be slower, such as when the last element in the array happens to be bigger than S.

如果我们在不操纵堆的情况下拒绝了大多数数组,那么这个速度可以达到之前解决方案的两倍。但也有可能会更慢,比如数组中的最后一个元素恰好大于S。

#2


7  

Assuming the numbers are integers, you can improve upon the usual n lg(n) complexity of sorting because in this case we have the extra information that the values are between 0 and S (for our purposes, integers larger than S are the same as S).

假设这些数字是整数,你可以改进通常的n lg(n)排序的复杂性,因为在这种情况下,我们有额外的信息,值在0和S之间(对于我们的目的,大于S的整数等于S)。

Because the range of values is finite, you can use a non-comparative sorting algorithm such as Pigeonhole Sort or Radix Sort to go below n lg(n).

因为值的范围是有限的,所以你可以使用非比较排序算法,如鸽子洞排序或基数排序,使其小于nlg (n)。

Note that these methods are dependent on some function of S, so if S gets large enough (and n stays small enough) you may be better off reverting to a comparative sort.

注意,这些方法依赖于S的某个函数,所以如果S足够大(n足够小),你最好回到比较排序。

#3


5  

One improvement (asymptotically) over Theta(nlogn) you can do is to get an O(n log K) time algorithm, where K is the required minimum number of elements.

对(nlogn)有一个改进(渐进地)你可以做的是得到一个O(n log K)时间算法,其中K是所需的最小元素个数。

Thus if K is constant, or say log n, this is better (asymptotically) than sorting. Of course if K is n^epsilon, then this is not better than Theta(n logn).

因此,如果K是常数,或者log n,这比排序更好(渐进地)。当然,如果K是n ^ε,那么这不是比θ(n logn)。

The way to do this is to use selection algorithms, which can tell you the ith largest element in O(n) time.

这样做的方法是使用选择算法,它可以告诉你O(n)时间内第i个元素。

Now do a binary search for K, starting with i=1 (the largest) and doubling i etc at each turn.

现在对K做一个二分查找,从i=1(最大)开始,然后每一次翻倍。

You find the ith largest, and find the sum of the i largest elements and check if it is greater than S or not.

你找到第i个最大的元素,然后找到第i个最大元素的和然后检查它是否大于S。

This way, you would run O(log K) runs of the selection algorithm (which is O(n)) for a total running time of O(n log K).

这样,您将运行选择算法(即O(n))的O(log K)运行,总运行时间为O(n log K)。

#4


5  

Here is an O(n) expected time solution to the problem. It's somewhat like Moron's idea but we don't throw out the work that our selection algorithm did in each step, and we start trying from an item potentially in the middle rather than using the repeated doubling approach.

这是一个O(n)期望时间解。这有点像莫伦的想法,但我们没有放弃我们的选择算法在每一步所做的工作,我们开始尝试一个可能在中间的项目,而不是使用重复的加倍方法。

Alternatively, It's really just quickselect with a little additional book keeping for the remaining sum.

或者,它实际上只是quickselect,还有一些额外的簿记。

First, it's clear that if you had the elements in sorted order, you could just pick the largest items first until you exceed the desired sum. Our solution is going to be like that, except we'll try as hard as we can to not to discover ordering information, because sorting is slow.

首先,很明显,如果元素按顺序排序,你可以先选择最大的项,直到超过期望的和。我们的解决方案将是这样的,除非我们尽可能不去发现排序信息,因为排序很慢。

You want to be able to determine if a given value is the cut off. If we include that value and everything greater than it, we meet or exceed S, but when we remove it, then we are below S, then we are golden.

你想要确定一个给定的值是否是截止值,如果我们把这个值和所有大于它的值都包括在内,我们会满足或超过S,但是当我们移除它时,我们在S以下,那么我们就是黄金。

Here is the psuedo code, I didn't test it for edge cases, but this gets the idea across.

这是psuedo代码,我没有对它进行边缘情况的测试,但是这让我明白了它的意思。

def Solve(arr, s):
  # We could get rid of worse case O(n^2) behavior that basically never happens 
  # by selecting the median here deterministically, but in practice, the constant
  # factor on the algorithm will be much worse.
  p = random_element(arr)
  left_arr, right_arr = partition(arr, p)
  # assume p is in neither left_arr nor right_arr
  right_sum = sum(right_arr)
  if right_sum + p >= s:
    if right_sum < s:
      # solved it, p forms the cut off
      return len(right_arr) + 1    
    # took too much, at least we eliminated left_arr and p
    return Solve(right_arr, s) 
  else:
    # didn't take enough yet, include all elements from and eliminate right_arr and p
    return len(right_arr) + 1 + Solve(left_arr, s - right_sum - p)  

#5


0  

  1. eliminate numbers < S, if you find some number ==S, then solved
  2. 消去数字< S,如果你发现一些数字=S,然后解决
  3. pigeon-hole sort the numbers < S
  4. 将数字排序为< S

Sum elements highest to lowest in the sorted order till you exceed S.

以排序顺序从最高到最低的和元素,直到超过S。

#1


8  

Here is an algorithm that is O(n + size(smallest subset) * log(n)). If the smallest subset is much smaller than the array, this will be O(n).

这是一个O(n + size(最小子集)* log(n)的算法。如果最小的子集比数组小得多,那么这个就是O(n)

Read http://en.wikipedia.org/wiki/Heap_%28data_structure%29 if my description of the algorithm is unclear (it is light on details, but the details are all there).

如果我对算法的描述不清楚,请阅读http://en.wikipedia.org/wiki/Heap_%28data_structure%29(这是对细节的描述,但细节都在那里)。

  1. Turn the array into a heap arranged such that the biggest element is available in time O(n).
  2. 将数组转换成堆,以便在时间O(n)中可以使用最大的元素。
  3. Repeatedly extract the biggest element from the heap until their sum is large enough. This takes O(size(smallest subset) * log(n)).
  4. 重复地从堆中提取最大的元素,直到它们的和足够大。这需要O(大小(最小子集)* log(n))。

This is almost certainly the answer they were hoping for, though not getting it shouldn't be a deal breaker.

这几乎肯定是他们所希望的答案,尽管没有得到它不应该是一个破坏协议。

Edit: Here is another variant that is often faster, but can be slower.

编辑:这是另一个版本,它通常更快,但也可能更慢。

Walk through elements, until the sum of the first few exceeds S.  Store current_sum.
Copy those elements into an array.
Heapify that array such that the minimum is easy to find, remember the minimum.
For each remaining element in the main array:
    if min(in our heap) < element:
        insert element into heap
        increase current_sum by element
        while S + min(in our heap) < current_sum:
            current_sum -= min(in our heap)
            remove min from heap

If we get to reject most of the array without manipulating our heap, this can be up to twice as fast as the previous solution. But it is also possible to be slower, such as when the last element in the array happens to be bigger than S.

如果我们在不操纵堆的情况下拒绝了大多数数组,那么这个速度可以达到之前解决方案的两倍。但也有可能会更慢,比如数组中的最后一个元素恰好大于S。

#2


7  

Assuming the numbers are integers, you can improve upon the usual n lg(n) complexity of sorting because in this case we have the extra information that the values are between 0 and S (for our purposes, integers larger than S are the same as S).

假设这些数字是整数,你可以改进通常的n lg(n)排序的复杂性,因为在这种情况下,我们有额外的信息,值在0和S之间(对于我们的目的,大于S的整数等于S)。

Because the range of values is finite, you can use a non-comparative sorting algorithm such as Pigeonhole Sort or Radix Sort to go below n lg(n).

因为值的范围是有限的,所以你可以使用非比较排序算法,如鸽子洞排序或基数排序,使其小于nlg (n)。

Note that these methods are dependent on some function of S, so if S gets large enough (and n stays small enough) you may be better off reverting to a comparative sort.

注意,这些方法依赖于S的某个函数,所以如果S足够大(n足够小),你最好回到比较排序。

#3


5  

One improvement (asymptotically) over Theta(nlogn) you can do is to get an O(n log K) time algorithm, where K is the required minimum number of elements.

对(nlogn)有一个改进(渐进地)你可以做的是得到一个O(n log K)时间算法,其中K是所需的最小元素个数。

Thus if K is constant, or say log n, this is better (asymptotically) than sorting. Of course if K is n^epsilon, then this is not better than Theta(n logn).

因此,如果K是常数,或者log n,这比排序更好(渐进地)。当然,如果K是n ^ε,那么这不是比θ(n logn)。

The way to do this is to use selection algorithms, which can tell you the ith largest element in O(n) time.

这样做的方法是使用选择算法,它可以告诉你O(n)时间内第i个元素。

Now do a binary search for K, starting with i=1 (the largest) and doubling i etc at each turn.

现在对K做一个二分查找,从i=1(最大)开始,然后每一次翻倍。

You find the ith largest, and find the sum of the i largest elements and check if it is greater than S or not.

你找到第i个最大的元素,然后找到第i个最大元素的和然后检查它是否大于S。

This way, you would run O(log K) runs of the selection algorithm (which is O(n)) for a total running time of O(n log K).

这样,您将运行选择算法(即O(n))的O(log K)运行,总运行时间为O(n log K)。

#4


5  

Here is an O(n) expected time solution to the problem. It's somewhat like Moron's idea but we don't throw out the work that our selection algorithm did in each step, and we start trying from an item potentially in the middle rather than using the repeated doubling approach.

这是一个O(n)期望时间解。这有点像莫伦的想法,但我们没有放弃我们的选择算法在每一步所做的工作,我们开始尝试一个可能在中间的项目,而不是使用重复的加倍方法。

Alternatively, It's really just quickselect with a little additional book keeping for the remaining sum.

或者,它实际上只是quickselect,还有一些额外的簿记。

First, it's clear that if you had the elements in sorted order, you could just pick the largest items first until you exceed the desired sum. Our solution is going to be like that, except we'll try as hard as we can to not to discover ordering information, because sorting is slow.

首先,很明显,如果元素按顺序排序,你可以先选择最大的项,直到超过期望的和。我们的解决方案将是这样的,除非我们尽可能不去发现排序信息,因为排序很慢。

You want to be able to determine if a given value is the cut off. If we include that value and everything greater than it, we meet or exceed S, but when we remove it, then we are below S, then we are golden.

你想要确定一个给定的值是否是截止值,如果我们把这个值和所有大于它的值都包括在内,我们会满足或超过S,但是当我们移除它时,我们在S以下,那么我们就是黄金。

Here is the psuedo code, I didn't test it for edge cases, but this gets the idea across.

这是psuedo代码,我没有对它进行边缘情况的测试,但是这让我明白了它的意思。

def Solve(arr, s):
  # We could get rid of worse case O(n^2) behavior that basically never happens 
  # by selecting the median here deterministically, but in practice, the constant
  # factor on the algorithm will be much worse.
  p = random_element(arr)
  left_arr, right_arr = partition(arr, p)
  # assume p is in neither left_arr nor right_arr
  right_sum = sum(right_arr)
  if right_sum + p >= s:
    if right_sum < s:
      # solved it, p forms the cut off
      return len(right_arr) + 1    
    # took too much, at least we eliminated left_arr and p
    return Solve(right_arr, s) 
  else:
    # didn't take enough yet, include all elements from and eliminate right_arr and p
    return len(right_arr) + 1 + Solve(left_arr, s - right_sum - p)  

#5


0  

  1. eliminate numbers < S, if you find some number ==S, then solved
  2. 消去数字< S,如果你发现一些数字=S,然后解决
  3. pigeon-hole sort the numbers < S
  4. 将数字排序为< S

Sum elements highest to lowest in the sorted order till you exceed S.

以排序顺序从最高到最低的和元素,直到超过S。