Given an array of integers find the number of all ordered pairs of elements in the array whose sum lies in a given range [a,b]
给定一个整数数组,在给定范围内(a,b),数组中所有的有序元素对的数量。
Here is an O(n^2) solution for the same
这是一个O(n ^ 2)解决方案
'''
counts all pairs in array such that the
sum of pair lies in the range a and b
'''
def countpairs(array, a, b):
num_of_pairs = 0
for i in range(len(array)):
for j in range(i+1,len(array)):
total = array[i] + array[j]
if total >= a and total <= b:
num_of_pairs += 1
return num_of_pairs
I know my solution is not optimal What is a better algorithm for doing this.
我知道我的解决方案不是最优什么是更好的算法。
8 个解决方案
#1
14
- Sort the array (say in increasing order).
- 对数组进行排序(比如增加顺序)。
- For each element x in the array:
- Consider the array slice after the element.
- 考虑元素之后的数组切片。
- Do a binary search on this array slice for [a - x], call it y0. If no exact match is found, consider the closest match bigger than [a - x] as y0.
- 在这个数组切片上对[a - x]进行二分查找,称为y0。如果没有找到精确匹配,请考虑比[a - x]更大的匹配为y0。
- Output all elements (x, y) from y0 forwards as long as x + y <= b
- 输出所有元素(x, y)从y0一直到x + y <= b。
- 对于数组中的每个元素x:考虑元素之后的数组切片。在这个数组切片上对[a - x]进行二分查找,称为y0。如果没有找到精确匹配,请考虑比[a - x]更大的匹配为y0。输出所有元素(x, y)从y0一直到x + y <= b。
The time complexity is of course output-sensitive, but this is still superior to the existing algo:
时间复杂度当然是输出敏感的,但这仍然优于现有的algo:
O(nlogn) + O(k)
where k is the number of pairs that satisfy the condition.
其中k是满足条件的对数。
Note: If you only need to count the number of pairs, you can do it in O(nlogn)
. Modify the above algorithm so [b - x] (or the next smaller element) is also searched for. This way, you can count the number of 'matches' each element has in O(logn)
simply from the indices of the first and last match. Then it's just a question of summing those up to get the final count. This way, the initial O(nlogn)
sorting step is dominant.
注意:如果您只需要计算对的数量,您可以在O(nlogn)中完成。修改上面的算法,这样就可以搜索[b - x](或下一个较小的元素)。通过这种方式,您可以简单地从第一个和最后一个匹配的索引中计算每个元素在O(logn)中的“匹配”数量。然后就是把这些相加得到最终结果的问题。这样,初始O(nlogn)排序步骤就占主导地位。
#2
11
Sort the array first and count the pairs by two indexes. The two indexes approach is similar to the one in 2-sum problem, which avoids the binary-search for N
times. The time consuming of the algorithm is Sort Complexity + O(N)
, typically, sort is O(NlnN), thus this approach is O(NlnN). The idea of the algorithm is, for an index i
, find an lower bound and an upper bound such that a <= arr[i]+arr[low] <= arr[i]+arr[high] <= b
and when i
increases, what we should do is to decrease low
and high
to hold the condition. To avoid counting the same pair twice, we keep low > i
, also we keep low <= high
. The complexity of the following counting approach is O(N), because, in the while loop
, what we can do is ++i
or --low
or --high
and there are at most N
such operations.
首先对数组进行排序,并通过两个索引计算对。这两个索引方法类似于2和问题中的一个,避免了N次的二进制搜索。算法的时间复杂度是排序复杂度+ O(N),通常排序是O(NlnN),因此这种方法是O(NlnN)。该算法的思想是,对于一个索引i,找到一个下界和一个上界,这样,<= arr[i]+arr[low] <= arr[i]+arr[high] <= b,当我增加时,我们应该做的是降低低和高以保持条件。为了避免重复计算相同的两次,我们保持低> i,也保持低<=高。下面的计数方法的复杂度是O(N),因为在while循环中,我们可以做的是++i或-low或-high,在大多数情况下都有N个这样的操作。
//count pair whose sum is in [a, b]
//arr is a sorted array with size integers.
int countPair(int arr[], int size, int a, int b) {
int cnt = 0;
int i = 0, low = size-1, high = size-1;
while (i < high) {
//find the lower bound such that arr[i] + arr[low] < a,
//meanwhile arr[i]+arr[low+1] >= a
low = max(i, low);
while (low > i && arr[i] + arr[low] >= a) --low;
//find an upper bound such that arr[i] + arr[high] <= b
//meanwhile, arr[i]+arr[high+1] > b
while (high > low && arr[i] + arr[high] > b) --high;
//all pairs: arr[i]+arr[low+1], arr[i]+arr[low+2],...,arr[i]+arr[high]
//are in the rage[a, b], and we count it as follows.
cnt += (high-low);
++i;
}
return cnt;
}
#3
6
The problem of counting the pairs that work can be done in sort time + O(N). This is faster than the solution that Ani gives, which is sort time + O(N log N). The idea goes like this. First you sort. You then run nearly the same single pass algorithm twice. You then can use the results of the two single pass algorithms to calculate the answer.
计数的问题可以在排序时间+ O(N)中完成。这比阿尼给出的解要快,也就是时间+ O(N log N)这个思路是这样的。首先你排序。然后运行几乎相同的单程算法两次。然后,您可以使用这两种单步算法的结果来计算答案。
The first time we run the single pass algorithm, we will create a new array that lists the smallest index that can partner with that index to give a sum greater than a. Example:
第一次运行单遍算法时,我们将创建一个新数组,该数组列出可以与该索引进行合作的最小索引,以给出比a更大的总和。
a = 6
array = [-20, 1, 3, 4, 8, 11]
output = [6, 4, 2, 2, 1, 1]
So, the number at array index 1 is 1 (0 based indexing). The smallest number it can pair with to get over 6 is the eight, which is at index 4. Hence output[1] = 4. -20 can't pair with anything, so output[0] = 6 (out of bounds). Another example: output[4] = 1, because 8 (index 4) can pair with the 1 (index 1) or any number after it to sum more than 6.
因此,数组索引1中的数字是1(基于0的索引)。它能对的最小值是8,也就是指数4。因此输出[1]= 4。-20不能与任何东西配对,所以输出[0]= 6(出界)。另一个例子是:输出[4]= 1,因为8(索引4)可以与1(索引1)或任何后面的任何数对大于6。
What you need to do now is convince yourself that this is O(N). It is. The code is:
你现在需要做的是说服自己这是O(N)它是。的代码是:
i, j = 0, 5
while i - j <= 0:
if array[i] + array[j] >= a:
output[j] = i
j -= 1
else:
output[i] = j + 1
i += 1
Just think of two pointers starting at the edges and working inwards. It's O(N). You now do the same thing, just with the condition b <= a:
只要想想两个指针从边缘开始并向内工作。这是O(N)。你现在做同样的事情,只是条件b <= a:
while i-j <= 0:
if array[i] + array[j] <= b:
output2[i] = j
i += 1
else:
output2[j] = i-1
j-=1
In our example, this code gives you (array and b for reference):
在我们的示例中,这段代码给出了(数组和b):
b = 9
array = [-20, 1, 3, 4, 8, 11]
output2 = [5, 4, 3, 3, 1, 0]
But now, output and output2 contain all the information we need, because they contain the range of valid indices for pairings. output is the smallest index it can be paired with, output2 is the largest index it can be paired with. The difference + 1 is the number of pairings for that location. So for the first location (corresponding to -20), there are 5 - 6 + 1 = 0 pairings. For 1, there are 4-4 + 1 pairings, with the number at index 4 which is 8. Another subtlety, this algo counts self pairings, so if you don't want it, you have to subtract. E.g. 3 seems to contain 3-2 + 1 = 2 pairings, one at index 2 and one at index 3. Of course, 3 itself is at index 2, so one of those is the self pairing, the other is the pairing with 4. You just need to subtract one whenever the range of indices of output and output2 contain the index itself you're looking at. In code, you can write:
但是现在,输出和output2包含了我们需要的所有信息,因为它们包含了用于配对的有效索引范围。输出是它可以与之配对的最小索引,output2是它可以与之配对的最大索引。差值+ 1是该位置的配对数。对于第一个位置(对应-20),有5 - 6 + 1 = 0配对。对于1,有4-4 + 1配对,指数4为8。另一个微妙之处是,algo计算自配对,所以如果你不想要它,你必须减去。3似乎包含3-2 + 1 = 2个配对,一个在索引2,一个在索引3。当然,3本身在指数2,所以其中一个是自配对,另一个是与4的配对。只要在输出和输出的指标范围内包含你所看到的索引本身,你就需要减去一个。在代码中,您可以编写:
answer = [o2 - o + 1 - (o <= i <= o2) for i, (o, o2) in enumerate(zip(output, output2))]
Which yields:
收益率:
answer = [0, 1, 1, 1, 1, 0]
Which sums to 4, corresponding to (1,8), (3,4), (4,3), (8, 1)
(1,8),(3,4),(4,3)(8,1)
Anyhow, as you can see, this is sort + O(N), which is optimal.
无论如何,正如你所看到的,这是sort + O(N),这是最优的。
Edit: asked for full implementation. Provided. For reference, the full code:
编辑:要求完全实现。提供。供参考,完整的代码:
def count_ranged_pairs(x, a, b):
x.sort()
output = [0] * len(x)
output2 = [0] * len(x)
i, j = 0, len(x)-1
while i - j <= 0:
if x[i] + x[j] >= a:
output[j] = i
j -= 1
else:
output[i] = j + 1
i += 1
i, j = 0, len(x) - 1
while i-j <= 0:
if x[i] + x[j] <= b:
output2[i] = j
i += 1
else:
output2[j] = i-1
j -=1
answer = [o2 - o + 1 - (o <= i <= o2) for i, (o, o2) in enumerate(zip(output, output2))]
return sum(answer)/2
#4
5
from itertools import ifilter, combinations
def countpairs2(array, a, b):
pairInRange = lambda x: sum(x) >= a and sum(x) <= b
filtered = ifilter(pairInRange, combinations(array, 2))
return sum([2 for x in filtered])
I think the Itertools library comes in quite handy. I also noticed you counted pairs twice, for example you counted (1, 3) and (3, 1) as two different combinations. If you don't want that, just change the 2 in the last line to a 1. Note: The last could be changed to return len(list(filtered)) * 2
. This CAN be faster, but at the expense of using more RAM.
我认为Itertools库非常方便。我还注意到,你数了两次,例如你数了(1,3)和(3,1)两种不同的组合。如果你不想这样,把最后一行的2换成1。注意:最后一个可以更改为return len(list(过滤后的))* 2。这可能更快,但代价是使用更多RAM。
#5
3
With some constraints on the data we can solve problem in linear time (sorry for Java, I'm not very proficient with Python):
对于数据的一些限制,我们可以在线性时间内解决问题(抱歉,Java,我不是很精通Python):
public class Program {
public static void main(String[] args) {
test(new int[]{-2, -1, 0, 1, 3, -3}, -1, 2);
test(new int[]{100,200,300}, 300, 300);
test(new int[]{100}, 1, 1000);
test(new int[]{-1, 0, 0, 0, 1, 1, 1000}, -1, 2);
}
public static int countPairs(int[] input, int a, int b) {
int min = Integer.MAX_VALUE;
int max = Integer.MIN_VALUE;
for (int el : input) {
max = Math.max(max, el);
min = Math.min(min, el);
}
int d = max - min + 1; // "Diameter" of the array
// Build naive hash-map of input: Map all elements to range [0; d]
int[] lookup = new int[d];
for (int el : input) {
lookup[el - min]++;
}
// a and b also needs to be adjusted
int a1 = a - min;
int b1 = b - min;
int[] counts = lookup; // Just rename
// i-th element contain count of lookup elements in range [0; i]
for (int i = 1; i < counts.length; ++i) {
counts[i] += counts[i - 1];
}
int res = 0;
for (int el : input) {
int lo = a1 - el; // el2 >= lo
int hi = b1 - el; // el2 <= hi
lo = Math.max(lo, 0);
hi = Math.min(hi, d - 1);
if (lo <= hi) {
res += counts[hi];
if (lo > 0) {
res -= counts[lo - 1];
}
}
// Exclude pair with same element
if (a <= 2*el && 2*el <= b) {
--res;
}
}
// Calculated pairs are ordered, divide by 2
return res / 2;
}
public static int naive(int[] ar, int a, int b) {
int res = 0;
for (int i = 0; i < ar.length; ++i) {
for (int j = i + 1; j < ar.length; ++j) {
int sum = ar[i] + ar[j];
if (a <= sum && sum <= b) {
++res;
}
}
}
return res;
}
private static void test(int[] input, int a, int b) {
int naiveSol = naive(input, a, b);
int optimizedSol = countPairs(input, a, b);
if (naiveSol != optimizedSol) {
System.out.println("Problem!!!");
}
}
}
For each element of the array we know the range in which second element of the pair can lay. Core of this algorithm is giving the count of elements in range [a; b] in O(1) time.
对于数组的每一个元素,我们知道它们的第二个元素可以放置的范围。该算法的核心是给出范围内元素的计数[a;b]O(1)时间。
Resulting complexity is O(max(N, D)), where D is difference between max and min elements of the array. If this value is same order as N - complexity is O(N).
结果复杂度为O(max(N, D)),其中D为数组的最大值和最小元素的差值。如果这个值与N -复杂度相同,则为O(N)。
Notes:
注:
- No sorting involved!
- 不参与排序!
- Building lookup is required to make algorithm work with negative numbers and make second array as small as possible (positively impacts both memory and time)
- 为了使算法能够处理负数,并使第二个数组尽可能小(对内存和时间都有正向影响),构建查找是必需的。
- Ugly condition
if (a <= 2*el && 2*el <= b)
is required because algorithm always counts pairs (a[i],a[i]) - 如果(a <= 2*el && 2*el <= b)是必要的,因为算法总是计数对(a[i],a[i])
- Algorithm requires O(d) additional memory which can be a lot.
- 算法需要O(d)额外的内存。
Another linear algorithm would be radix sort + linear pair counting.
另一个线性算法是基数排序+线性对计数。
EDIT. This algorithm can be really good in case if D is considerably smaller than N and you are not allowed to modify the input array. Alternative option for this case would be slightly modified counting sort with allocation of counts array (additional O(D) memory) but without populating sorted elements back to input array. It's possible to adapt pair counting to use counts array instead of full sorted array.
编辑。如果D比N小得多,并且不允许修改输入数组,那么这个算法就会非常好。这种情况下的另一种选择是稍微修改计数排序,分配计数数组(附加O(D)内存),但不将排序的元素填充到输入数组中。使用计数数组而不是完全排序的数组是可能的。
#6
2
I have a solution(actually 2 solutions ;-)). Writing it in python:
我有一个解(实际上有两个解;-))。在python中写:
def find_count(input_list, min, max):
count = 0
range_diff = max - min
for i in range(len(input_list)):
if input_list[i]*2 >= min and input_list[i]*2 <= max:
count += 1
for j in range(i+1, len(input_list)):
input_sum = input_list[i] + input_list[j]
if input_sum >= min and input_sum <= max:
count += 2
This will run nCr(n combinations) times to the max and gives you the required count. This will be better than sorting the list and then finding the pairs in a range. If the number of elements that fail the combination is greater as well as all the numbers are positive integers, we can improve the result a little better by adding a condition that checks the elements for,
这将运行nCr(n组合)次数到最大值并给出所需的计数。这将比对列表进行排序,然后在一个范围内找到对。如果组合中失败的元素的数量更大,并且所有的数字都是正整数,我们可以通过添加检查元素的条件来改进结果,
- Numbers that do not fall under the range even with the addition of the max value
- 即使添加了最大值,也不属于范围的数字。
- Numbers that are greater than the maximum number of the range.
- 大于该范围最大值的数字。
Something like this:
是这样的:
# list_maximum is the maximum number of the list (i.e) max(input_list), if already known
def find_count(input_list, min, max, list_maximum):
count = 0
range_diff = max - min
for i in range(len(input_list)):
if input_list[i] > max or input_list[i] + list_maximum < min:
continue
if input_list[i]*2 >= min and input_list[i]*2 <= max:
count += 1
for j in range(i+1, len(input_list)):
input_sum = input_list[i] + input_list[j]
if input_sum >= min and input_sum <= max:
count += 2
I will also be happy to learn any better solution than this :-) If i come across one, I will update this answer.
我也很乐意学习比这更好的解决方案:-)如果我遇到一个,我会更新这个答案。
#7
1
I believe this is a simple math problem, that could be solved with numpy
with no loops and no sorting on our part. I'm not exactly sure, but I believe the complexity to be O(N^2) at worse case (would love some confirmation on that by someone more knowledgeable with time complexities in numpy).
我相信这是一个简单的数学问题,可以用没有循环的numpy来解决,也不能对我们的部分进行排序。我不确定,但是我相信复杂度是O(N ^ 2)在更糟糕的情况下(会喜欢一些确认时间复杂性numpy)更有见识的人。
At any rate, here's my solution:
无论如何,这是我的解决方案:
import numpy as np
def count_pairs(input_array, min, max):
A = np.array(input_array)
A_ones = np.ones((len(A),len(A)))
A_matrix = A*A_ones
result = np.transpose(A_matrix) + A_matrix
result = np.triu(result,0)
np.fill_diagonal(result,0)
count = ((result > min) & (result < max)).sum()
return count
Now let's walk through it - first I just create a matrix with columns representing our numbers:
现在我们来看一下-首先,我用表示数字的列来创建一个矩阵:
A = np.array(input_array)
A_ones = np.ones((len(A),len(A)))
A_matrix = A*A_ones
Let's assume that our input array looked like: [1,1,2,2,3,-1]
,thus, this should be the value of A_matrix
at this point.
假设我们的输入数组是这样的:[1,2,2,3,1],因此,这应该是这个点的a_矩阵的值。
[[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]]
If I add that to the transpose of itself...
如果我把它加到它自身的转置上…
result = np.transpose(A_matrix) + A_matrix
...I should get a matrix representing all combinations of sums of pairs:
…我应该得到一个矩阵表示所有对的和的组合:
[[ 2. 2. 3. 3. 4. 0.]
[ 2. 2. 3. 3. 4. 0.]
[ 3. 3. 4. 4. 5. 1.]
[ 3. 3. 4. 4. 5. 1.]
[ 4. 4. 5. 5. 6. 2.]
[ 0. 0. 1. 1. 2. -2.]]
Of course, this matrix is mirrored across the diagonal because the pairs (1,2) and (2,1) produce the same result. We don't want to consider these duplicate entries. We also don't want to consider the sum of an item with itself, so let's sanitize our array:
当然,这个矩阵在对角线上是镜像的,因为对(1,2)和(2,1)产生相同的结果。我们不想考虑这些重复的条目。我们也不想考虑项目本身的和,所以让我们对数组进行消毒:
result = np.triu(result,0)
np.fill_diagonal(result,0)
Our result now looks like:
我们现在的结果是:
[[ 0. 2. 3. 3. 4. 0.]
[ 0. 0. 3. 3. 4. 0.]
[ 0. 0. 0. 4. 5. 1.]
[ 0. 0. 0. 0. 5. 1.]
[ 0. 0. 0. 0. 0. 2.]
[ 0. 0. 0. 0. 0. 0.]]
All that remains is to count the items that pass our criteria.
剩下的就是计算通过我们标准的项目。
count = ((result > min) & (result < max)).sum()
A word of caution:
This method won't work if 0
is in the acceptable domain, but I'm sure it would be trivial to manipulate that result matrix above to convert those 0's to some other meaningless number....
这个方法行不通如果在可接受域的值为0,但我确信这将是微不足道的操作结果矩阵转换这些0以上的其他一些无意义的数字....
#8
1
Rather than using the relational operators, we can simply check if the sum of array elements i and j are in the specified range.
我们可以简单地检查数组元素i和j是否在指定范围内,而不是使用关系运算符。
def get_numOfPairs(array, start, stop):
num_of_pairs = 0
array_length = len(array)
for i in range(array_length):
for j in range(i+1, array_length):
if sum([array[i], array[j]]) in range(start, stop):
num_of_pairs += 1
return num_of_pairs
#1
14
- Sort the array (say in increasing order).
- 对数组进行排序(比如增加顺序)。
- For each element x in the array:
- Consider the array slice after the element.
- 考虑元素之后的数组切片。
- Do a binary search on this array slice for [a - x], call it y0. If no exact match is found, consider the closest match bigger than [a - x] as y0.
- 在这个数组切片上对[a - x]进行二分查找,称为y0。如果没有找到精确匹配,请考虑比[a - x]更大的匹配为y0。
- Output all elements (x, y) from y0 forwards as long as x + y <= b
- 输出所有元素(x, y)从y0一直到x + y <= b。
- 对于数组中的每个元素x:考虑元素之后的数组切片。在这个数组切片上对[a - x]进行二分查找,称为y0。如果没有找到精确匹配,请考虑比[a - x]更大的匹配为y0。输出所有元素(x, y)从y0一直到x + y <= b。
The time complexity is of course output-sensitive, but this is still superior to the existing algo:
时间复杂度当然是输出敏感的,但这仍然优于现有的algo:
O(nlogn) + O(k)
where k is the number of pairs that satisfy the condition.
其中k是满足条件的对数。
Note: If you only need to count the number of pairs, you can do it in O(nlogn)
. Modify the above algorithm so [b - x] (or the next smaller element) is also searched for. This way, you can count the number of 'matches' each element has in O(logn)
simply from the indices of the first and last match. Then it's just a question of summing those up to get the final count. This way, the initial O(nlogn)
sorting step is dominant.
注意:如果您只需要计算对的数量,您可以在O(nlogn)中完成。修改上面的算法,这样就可以搜索[b - x](或下一个较小的元素)。通过这种方式,您可以简单地从第一个和最后一个匹配的索引中计算每个元素在O(logn)中的“匹配”数量。然后就是把这些相加得到最终结果的问题。这样,初始O(nlogn)排序步骤就占主导地位。
#2
11
Sort the array first and count the pairs by two indexes. The two indexes approach is similar to the one in 2-sum problem, which avoids the binary-search for N
times. The time consuming of the algorithm is Sort Complexity + O(N)
, typically, sort is O(NlnN), thus this approach is O(NlnN). The idea of the algorithm is, for an index i
, find an lower bound and an upper bound such that a <= arr[i]+arr[low] <= arr[i]+arr[high] <= b
and when i
increases, what we should do is to decrease low
and high
to hold the condition. To avoid counting the same pair twice, we keep low > i
, also we keep low <= high
. The complexity of the following counting approach is O(N), because, in the while loop
, what we can do is ++i
or --low
or --high
and there are at most N
such operations.
首先对数组进行排序,并通过两个索引计算对。这两个索引方法类似于2和问题中的一个,避免了N次的二进制搜索。算法的时间复杂度是排序复杂度+ O(N),通常排序是O(NlnN),因此这种方法是O(NlnN)。该算法的思想是,对于一个索引i,找到一个下界和一个上界,这样,<= arr[i]+arr[low] <= arr[i]+arr[high] <= b,当我增加时,我们应该做的是降低低和高以保持条件。为了避免重复计算相同的两次,我们保持低> i,也保持低<=高。下面的计数方法的复杂度是O(N),因为在while循环中,我们可以做的是++i或-low或-high,在大多数情况下都有N个这样的操作。
//count pair whose sum is in [a, b]
//arr is a sorted array with size integers.
int countPair(int arr[], int size, int a, int b) {
int cnt = 0;
int i = 0, low = size-1, high = size-1;
while (i < high) {
//find the lower bound such that arr[i] + arr[low] < a,
//meanwhile arr[i]+arr[low+1] >= a
low = max(i, low);
while (low > i && arr[i] + arr[low] >= a) --low;
//find an upper bound such that arr[i] + arr[high] <= b
//meanwhile, arr[i]+arr[high+1] > b
while (high > low && arr[i] + arr[high] > b) --high;
//all pairs: arr[i]+arr[low+1], arr[i]+arr[low+2],...,arr[i]+arr[high]
//are in the rage[a, b], and we count it as follows.
cnt += (high-low);
++i;
}
return cnt;
}
#3
6
The problem of counting the pairs that work can be done in sort time + O(N). This is faster than the solution that Ani gives, which is sort time + O(N log N). The idea goes like this. First you sort. You then run nearly the same single pass algorithm twice. You then can use the results of the two single pass algorithms to calculate the answer.
计数的问题可以在排序时间+ O(N)中完成。这比阿尼给出的解要快,也就是时间+ O(N log N)这个思路是这样的。首先你排序。然后运行几乎相同的单程算法两次。然后,您可以使用这两种单步算法的结果来计算答案。
The first time we run the single pass algorithm, we will create a new array that lists the smallest index that can partner with that index to give a sum greater than a. Example:
第一次运行单遍算法时,我们将创建一个新数组,该数组列出可以与该索引进行合作的最小索引,以给出比a更大的总和。
a = 6
array = [-20, 1, 3, 4, 8, 11]
output = [6, 4, 2, 2, 1, 1]
So, the number at array index 1 is 1 (0 based indexing). The smallest number it can pair with to get over 6 is the eight, which is at index 4. Hence output[1] = 4. -20 can't pair with anything, so output[0] = 6 (out of bounds). Another example: output[4] = 1, because 8 (index 4) can pair with the 1 (index 1) or any number after it to sum more than 6.
因此,数组索引1中的数字是1(基于0的索引)。它能对的最小值是8,也就是指数4。因此输出[1]= 4。-20不能与任何东西配对,所以输出[0]= 6(出界)。另一个例子是:输出[4]= 1,因为8(索引4)可以与1(索引1)或任何后面的任何数对大于6。
What you need to do now is convince yourself that this is O(N). It is. The code is:
你现在需要做的是说服自己这是O(N)它是。的代码是:
i, j = 0, 5
while i - j <= 0:
if array[i] + array[j] >= a:
output[j] = i
j -= 1
else:
output[i] = j + 1
i += 1
Just think of two pointers starting at the edges and working inwards. It's O(N). You now do the same thing, just with the condition b <= a:
只要想想两个指针从边缘开始并向内工作。这是O(N)。你现在做同样的事情,只是条件b <= a:
while i-j <= 0:
if array[i] + array[j] <= b:
output2[i] = j
i += 1
else:
output2[j] = i-1
j-=1
In our example, this code gives you (array and b for reference):
在我们的示例中,这段代码给出了(数组和b):
b = 9
array = [-20, 1, 3, 4, 8, 11]
output2 = [5, 4, 3, 3, 1, 0]
But now, output and output2 contain all the information we need, because they contain the range of valid indices for pairings. output is the smallest index it can be paired with, output2 is the largest index it can be paired with. The difference + 1 is the number of pairings for that location. So for the first location (corresponding to -20), there are 5 - 6 + 1 = 0 pairings. For 1, there are 4-4 + 1 pairings, with the number at index 4 which is 8. Another subtlety, this algo counts self pairings, so if you don't want it, you have to subtract. E.g. 3 seems to contain 3-2 + 1 = 2 pairings, one at index 2 and one at index 3. Of course, 3 itself is at index 2, so one of those is the self pairing, the other is the pairing with 4. You just need to subtract one whenever the range of indices of output and output2 contain the index itself you're looking at. In code, you can write:
但是现在,输出和output2包含了我们需要的所有信息,因为它们包含了用于配对的有效索引范围。输出是它可以与之配对的最小索引,output2是它可以与之配对的最大索引。差值+ 1是该位置的配对数。对于第一个位置(对应-20),有5 - 6 + 1 = 0配对。对于1,有4-4 + 1配对,指数4为8。另一个微妙之处是,algo计算自配对,所以如果你不想要它,你必须减去。3似乎包含3-2 + 1 = 2个配对,一个在索引2,一个在索引3。当然,3本身在指数2,所以其中一个是自配对,另一个是与4的配对。只要在输出和输出的指标范围内包含你所看到的索引本身,你就需要减去一个。在代码中,您可以编写:
answer = [o2 - o + 1 - (o <= i <= o2) for i, (o, o2) in enumerate(zip(output, output2))]
Which yields:
收益率:
answer = [0, 1, 1, 1, 1, 0]
Which sums to 4, corresponding to (1,8), (3,4), (4,3), (8, 1)
(1,8),(3,4),(4,3)(8,1)
Anyhow, as you can see, this is sort + O(N), which is optimal.
无论如何,正如你所看到的,这是sort + O(N),这是最优的。
Edit: asked for full implementation. Provided. For reference, the full code:
编辑:要求完全实现。提供。供参考,完整的代码:
def count_ranged_pairs(x, a, b):
x.sort()
output = [0] * len(x)
output2 = [0] * len(x)
i, j = 0, len(x)-1
while i - j <= 0:
if x[i] + x[j] >= a:
output[j] = i
j -= 1
else:
output[i] = j + 1
i += 1
i, j = 0, len(x) - 1
while i-j <= 0:
if x[i] + x[j] <= b:
output2[i] = j
i += 1
else:
output2[j] = i-1
j -=1
answer = [o2 - o + 1 - (o <= i <= o2) for i, (o, o2) in enumerate(zip(output, output2))]
return sum(answer)/2
#4
5
from itertools import ifilter, combinations
def countpairs2(array, a, b):
pairInRange = lambda x: sum(x) >= a and sum(x) <= b
filtered = ifilter(pairInRange, combinations(array, 2))
return sum([2 for x in filtered])
I think the Itertools library comes in quite handy. I also noticed you counted pairs twice, for example you counted (1, 3) and (3, 1) as two different combinations. If you don't want that, just change the 2 in the last line to a 1. Note: The last could be changed to return len(list(filtered)) * 2
. This CAN be faster, but at the expense of using more RAM.
我认为Itertools库非常方便。我还注意到,你数了两次,例如你数了(1,3)和(3,1)两种不同的组合。如果你不想这样,把最后一行的2换成1。注意:最后一个可以更改为return len(list(过滤后的))* 2。这可能更快,但代价是使用更多RAM。
#5
3
With some constraints on the data we can solve problem in linear time (sorry for Java, I'm not very proficient with Python):
对于数据的一些限制,我们可以在线性时间内解决问题(抱歉,Java,我不是很精通Python):
public class Program {
public static void main(String[] args) {
test(new int[]{-2, -1, 0, 1, 3, -3}, -1, 2);
test(new int[]{100,200,300}, 300, 300);
test(new int[]{100}, 1, 1000);
test(new int[]{-1, 0, 0, 0, 1, 1, 1000}, -1, 2);
}
public static int countPairs(int[] input, int a, int b) {
int min = Integer.MAX_VALUE;
int max = Integer.MIN_VALUE;
for (int el : input) {
max = Math.max(max, el);
min = Math.min(min, el);
}
int d = max - min + 1; // "Diameter" of the array
// Build naive hash-map of input: Map all elements to range [0; d]
int[] lookup = new int[d];
for (int el : input) {
lookup[el - min]++;
}
// a and b also needs to be adjusted
int a1 = a - min;
int b1 = b - min;
int[] counts = lookup; // Just rename
// i-th element contain count of lookup elements in range [0; i]
for (int i = 1; i < counts.length; ++i) {
counts[i] += counts[i - 1];
}
int res = 0;
for (int el : input) {
int lo = a1 - el; // el2 >= lo
int hi = b1 - el; // el2 <= hi
lo = Math.max(lo, 0);
hi = Math.min(hi, d - 1);
if (lo <= hi) {
res += counts[hi];
if (lo > 0) {
res -= counts[lo - 1];
}
}
// Exclude pair with same element
if (a <= 2*el && 2*el <= b) {
--res;
}
}
// Calculated pairs are ordered, divide by 2
return res / 2;
}
public static int naive(int[] ar, int a, int b) {
int res = 0;
for (int i = 0; i < ar.length; ++i) {
for (int j = i + 1; j < ar.length; ++j) {
int sum = ar[i] + ar[j];
if (a <= sum && sum <= b) {
++res;
}
}
}
return res;
}
private static void test(int[] input, int a, int b) {
int naiveSol = naive(input, a, b);
int optimizedSol = countPairs(input, a, b);
if (naiveSol != optimizedSol) {
System.out.println("Problem!!!");
}
}
}
For each element of the array we know the range in which second element of the pair can lay. Core of this algorithm is giving the count of elements in range [a; b] in O(1) time.
对于数组的每一个元素,我们知道它们的第二个元素可以放置的范围。该算法的核心是给出范围内元素的计数[a;b]O(1)时间。
Resulting complexity is O(max(N, D)), where D is difference between max and min elements of the array. If this value is same order as N - complexity is O(N).
结果复杂度为O(max(N, D)),其中D为数组的最大值和最小元素的差值。如果这个值与N -复杂度相同,则为O(N)。
Notes:
注:
- No sorting involved!
- 不参与排序!
- Building lookup is required to make algorithm work with negative numbers and make second array as small as possible (positively impacts both memory and time)
- 为了使算法能够处理负数,并使第二个数组尽可能小(对内存和时间都有正向影响),构建查找是必需的。
- Ugly condition
if (a <= 2*el && 2*el <= b)
is required because algorithm always counts pairs (a[i],a[i]) - 如果(a <= 2*el && 2*el <= b)是必要的,因为算法总是计数对(a[i],a[i])
- Algorithm requires O(d) additional memory which can be a lot.
- 算法需要O(d)额外的内存。
Another linear algorithm would be radix sort + linear pair counting.
另一个线性算法是基数排序+线性对计数。
EDIT. This algorithm can be really good in case if D is considerably smaller than N and you are not allowed to modify the input array. Alternative option for this case would be slightly modified counting sort with allocation of counts array (additional O(D) memory) but without populating sorted elements back to input array. It's possible to adapt pair counting to use counts array instead of full sorted array.
编辑。如果D比N小得多,并且不允许修改输入数组,那么这个算法就会非常好。这种情况下的另一种选择是稍微修改计数排序,分配计数数组(附加O(D)内存),但不将排序的元素填充到输入数组中。使用计数数组而不是完全排序的数组是可能的。
#6
2
I have a solution(actually 2 solutions ;-)). Writing it in python:
我有一个解(实际上有两个解;-))。在python中写:
def find_count(input_list, min, max):
count = 0
range_diff = max - min
for i in range(len(input_list)):
if input_list[i]*2 >= min and input_list[i]*2 <= max:
count += 1
for j in range(i+1, len(input_list)):
input_sum = input_list[i] + input_list[j]
if input_sum >= min and input_sum <= max:
count += 2
This will run nCr(n combinations) times to the max and gives you the required count. This will be better than sorting the list and then finding the pairs in a range. If the number of elements that fail the combination is greater as well as all the numbers are positive integers, we can improve the result a little better by adding a condition that checks the elements for,
这将运行nCr(n组合)次数到最大值并给出所需的计数。这将比对列表进行排序,然后在一个范围内找到对。如果组合中失败的元素的数量更大,并且所有的数字都是正整数,我们可以通过添加检查元素的条件来改进结果,
- Numbers that do not fall under the range even with the addition of the max value
- 即使添加了最大值,也不属于范围的数字。
- Numbers that are greater than the maximum number of the range.
- 大于该范围最大值的数字。
Something like this:
是这样的:
# list_maximum is the maximum number of the list (i.e) max(input_list), if already known
def find_count(input_list, min, max, list_maximum):
count = 0
range_diff = max - min
for i in range(len(input_list)):
if input_list[i] > max or input_list[i] + list_maximum < min:
continue
if input_list[i]*2 >= min and input_list[i]*2 <= max:
count += 1
for j in range(i+1, len(input_list)):
input_sum = input_list[i] + input_list[j]
if input_sum >= min and input_sum <= max:
count += 2
I will also be happy to learn any better solution than this :-) If i come across one, I will update this answer.
我也很乐意学习比这更好的解决方案:-)如果我遇到一个,我会更新这个答案。
#7
1
I believe this is a simple math problem, that could be solved with numpy
with no loops and no sorting on our part. I'm not exactly sure, but I believe the complexity to be O(N^2) at worse case (would love some confirmation on that by someone more knowledgeable with time complexities in numpy).
我相信这是一个简单的数学问题,可以用没有循环的numpy来解决,也不能对我们的部分进行排序。我不确定,但是我相信复杂度是O(N ^ 2)在更糟糕的情况下(会喜欢一些确认时间复杂性numpy)更有见识的人。
At any rate, here's my solution:
无论如何,这是我的解决方案:
import numpy as np
def count_pairs(input_array, min, max):
A = np.array(input_array)
A_ones = np.ones((len(A),len(A)))
A_matrix = A*A_ones
result = np.transpose(A_matrix) + A_matrix
result = np.triu(result,0)
np.fill_diagonal(result,0)
count = ((result > min) & (result < max)).sum()
return count
Now let's walk through it - first I just create a matrix with columns representing our numbers:
现在我们来看一下-首先,我用表示数字的列来创建一个矩阵:
A = np.array(input_array)
A_ones = np.ones((len(A),len(A)))
A_matrix = A*A_ones
Let's assume that our input array looked like: [1,1,2,2,3,-1]
,thus, this should be the value of A_matrix
at this point.
假设我们的输入数组是这样的:[1,2,2,3,1],因此,这应该是这个点的a_矩阵的值。
[[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]
[ 1. 1. 2. 2. 3. -1.]]
If I add that to the transpose of itself...
如果我把它加到它自身的转置上…
result = np.transpose(A_matrix) + A_matrix
...I should get a matrix representing all combinations of sums of pairs:
…我应该得到一个矩阵表示所有对的和的组合:
[[ 2. 2. 3. 3. 4. 0.]
[ 2. 2. 3. 3. 4. 0.]
[ 3. 3. 4. 4. 5. 1.]
[ 3. 3. 4. 4. 5. 1.]
[ 4. 4. 5. 5. 6. 2.]
[ 0. 0. 1. 1. 2. -2.]]
Of course, this matrix is mirrored across the diagonal because the pairs (1,2) and (2,1) produce the same result. We don't want to consider these duplicate entries. We also don't want to consider the sum of an item with itself, so let's sanitize our array:
当然,这个矩阵在对角线上是镜像的,因为对(1,2)和(2,1)产生相同的结果。我们不想考虑这些重复的条目。我们也不想考虑项目本身的和,所以让我们对数组进行消毒:
result = np.triu(result,0)
np.fill_diagonal(result,0)
Our result now looks like:
我们现在的结果是:
[[ 0. 2. 3. 3. 4. 0.]
[ 0. 0. 3. 3. 4. 0.]
[ 0. 0. 0. 4. 5. 1.]
[ 0. 0. 0. 0. 5. 1.]
[ 0. 0. 0. 0. 0. 2.]
[ 0. 0. 0. 0. 0. 0.]]
All that remains is to count the items that pass our criteria.
剩下的就是计算通过我们标准的项目。
count = ((result > min) & (result < max)).sum()
A word of caution:
This method won't work if 0
is in the acceptable domain, but I'm sure it would be trivial to manipulate that result matrix above to convert those 0's to some other meaningless number....
这个方法行不通如果在可接受域的值为0,但我确信这将是微不足道的操作结果矩阵转换这些0以上的其他一些无意义的数字....
#8
1
Rather than using the relational operators, we can simply check if the sum of array elements i and j are in the specified range.
我们可以简单地检查数组元素i和j是否在指定范围内,而不是使用关系运算符。
def get_numOfPairs(array, start, stop):
num_of_pairs = 0
array_length = len(array)
for i in range(array_length):
for j in range(i+1, array_length):
if sum([array[i], array[j]]) in range(start, stop):
num_of_pairs += 1
return num_of_pairs