Given an array of Integers, and a range (low, high), find all contiguous subsequence in the array which have sum in the range.
给定一个整数数组和一个范围(低,高),在数组中找到所有的连续子序列,它们在这个范围内是和。
Is there a solution better than O(n^2)?
有一个解决方案比O(n ^ 2)?
I tried a lot but couldn't find a solution that does better than O(n^2). Please help me find a better solution or confirm that this is the best we can do.
我尝试了很多但找不到解决方案确实比O(n ^ 2)。请帮我找到一个更好的解决方案或者确认这是我们能做的最好的。
This is what I have right now, I'm assuming the range to be defined as [lo, hi]
.
这就是我现在所拥有的,我假设这个范围被定义为[lo, hi]。
public static int numOfCombinations(final int[] data, final int lo, final int hi, int beg, int end) {
int count = 0, sum = data[beg];
while (beg < data.length && end < data.length) {
if (sum > hi) {
break;
} else {
if (lo <= sum && sum <= hi) {
System.out.println("Range found: [" + beg + ", " + end + "]");
++count;
}
++end;
if (end < data.length) {
sum += data[end];
}
}
}
return count;
}
public static int numOfCombinations(final int[] data, final int lo, final int hi) {
int count = 0;
for (int i = 0; i < data.length; ++i) {
count += numOfCombinations(data, lo, hi, i, i);
}
return count;
}
6 个解决方案
#1
15
O(n) time solution:
O(n)时间的解决方案:
You can extend the 'two pointer' idea for the 'exact' version of the problem. We will maintain variables a
and b
such that all intervals on the form xs[i,a), xs[i,a+1), ..., xs[i,b-1)
have a sum in the sought after range [lo, hi]
.
您可以将“两个指针”的概念扩展到问题的“确切”版本。我们将保持变量a和b,这样所有的区间都在xs (i,a), xs[i,a+1)中,…,xs[i,b-1)在寻找范围内有一个和[lo, hi]。
a, b = 0, 0
for i in range(n):
while a != (n+1) and sum(xs[i:a]) < lo:
a += 1
while b != (n+1) and sum(xs[i:b]) <= hi:
b += 1
for j in range(a, b):
print(xs[i:j])
This is actually O(n^2)
because of the sum
, but we can easily fix that by first calculating the prefix sums ps
such that ps[i] = sum(xs[:i])
. Then sum(xs[i:j])
is simply ps[j]-ps[i]
.
这实际上是O(n 2)因为它的和,但是我们可以很容易地解决这个问题,首先计算前缀和ps (i) = sum(xs[:i])。然后求和(xs[i:j])就是ps[j]-ps[i]。
Here is an example of running the above code on [2, 5, 1, 1, 2, 2, 3, 4, 8, 2]
with [lo, hi] = [3, 6]
:
这里有一个运行上述代码的例子,在[2、5、1、1、2、2、3、4、8、2]和[lo, hi] = [3, 6]:
[5]
[5, 1]
[1, 1, 2]
[1, 1, 2, 2]
[1, 2]
[1, 2, 2]
[2, 2]
[2, 3]
[3]
[4]
This runs in time O(n + t)
, where t
is the size of the output. As some have noticed, the output can be as large as t = n^2
, namely if all contiguous subsequences are matched.
这是时间O(n + t), t是输出的大小。正如一些人注意到,输出可以一样大t = n ^ 2,即如果所有连续的子序列匹配。
If we allow writing the output in a compressed format (output pairs a,b
of which all subsequences are contiguous) we can get a pure O(n)
time algorithm.
如果我们允许以压缩格式(输出对a,其中所有子序列都是连续的)来编写输出,我们就可以得到一个纯O(n)时间算法。
#2
8
Starting from this problem: find all contiguous sub-sequences that sum to x. What we need is something similar.
从这个问题开始:找到所有与x相加的连续子序列,我们需要的是类似的东西。
For every index i, we can calculate the sum of the segment from 0 to i, which is x. So, the problem now is we need to find from 0 to i - 1, how many segments have sum from (x - low) to (x - high), and it should be faster than O(n). So there are several data structures help you to do that in O(logn), which are Fenwick tree and Interval tree.
我的每个索引,我们可以计算的总和段从0到我,这是x。所以,现在的问题是我们需要找到我从0到- 1,有多少段从(x -低)到(x -高),它应该比O(n)。有几个数据结构可以帮助你在O(logn)中实现,这是Fenwick树和区间树。
So what we need to do is:
所以我们需要做的是:
-
Iterating through all index from 0 to n (n is the size of the array).
遍历所有索引从0到n (n是数组的大小)。
-
At index ith, calculate, starting from 0 to ith index, the sum x, query the tree to get the total occurrences of numbers fall in the range (x - high, x - low).
在索引ith中,计算,从0到第i个索引,和x,查询树,以得到在范围(x - high, x - low)的总次数。
-
Add x to the tree.
把x加到树里。
So the time complexity will be O(n log n)
所以时间复杂度是O(n log n)
#3
5
You should use a simple dynamic programming and binary search. To find the count:
您应该使用简单的动态编程和二分查找。找到数:
from bisect import bisect_left, bisect_right
def solve(A, start, end):
"""
O(n lg n) Binary Search
Bound:
f[i] - f[j] = start
f[i] - f[j'] = end
start < end
f[j] > f[j']
:param A: an integer array
:param start: lower bound
:param end: upper bound
:return:
"""
n = len(A)
cnt = 0
f = [0 for _ in xrange(n+1)]
for i in xrange(1, n+1):
f[i] = f[i-1]+A[i-1] # sum from left
f.sort()
for i in xrange(n+1):
lo = bisect_left(f, f[i]-end, 0, i)
hi = bisect_right(f, f[i]-start, 0, i)
cnt += hi-lo
return cnt
https://github.com/algorhythms/LintCode/blob/master/Subarray%20Sum%20II.py
https://github.com/algorhythms/LintCode/blob/master/Subarray%20Sum%20II.py
To find the results rather the count, you just need another hash table to store the mapping from original (not sorted) f[i] -> list of indexes.
为了查找结果,您只需要另一个哈希表来存储来自原始(未排序)f[i] ->索引列表的映射。
Cheers.
欢呼。
#4
0
Here is way you can get O(nlogn) if there are only positive numbers :-
如果只有正数,你可以得到O(nlogn):-。
1. Evaluate cumulative sum of array
2. for i find total sum[j] in (sum[i]+low,sum[i]+high) using binary search
3. Total = Total + count
4. do 3 to 5 for all i
Time complexity:-
时间复杂度:-
Cumulative sum is O(N)
Finding sums in range is O(logN) using binary search
Total Time complexity is O(NlogN)
#5
0
If all integers are non-negative, then it can be done in O(max(size-of-input,size-of-output))
time. This is optimal.
如果所有的整数都是非负的,那么它可以在O(最大值,输入,大小,输出)时间内完成。这是最优的。
Here's the algorithm in C.
这是C中的算法。
void interview_question (int* a, int N, int lo, int hi)
{
int sum_bottom_low = 0, sum_bottom_high = 0,
bottom_low = 0, bottom_high = 0,
top = 0;
int i;
if (lo == 0) printf ("[0 0) ");
while (top < N)
{
sum_bottom_low += a[top];
sum_bottom_high += a[top];
top++;
while (sum_bottom_high >= lo && bottom_high <= top)
{
sum_bottom_high -= a[bottom_high++];
}
while (sum_bottom_low > hi && bottom_low <= bottom_high)
{
sum_bottom_low -= a[bottom_low++];
}
// print output
for (i = bottom_low; i < bottom_high; ++i)
printf ("[%d %d) ", i, top);
}
printf("\n");
}
Except for the last loop marked "print output", each operation is executed O(N) times; the last loop is executed once for each interval printed. If we only need to count the intervals and not print them, the entire algorithm becomes O(N)
.
除最后一个标记为“打印输出”的循环外,每个操作执行O(N)次;最后一个循环在每次打印时执行一次。如果我们只需要计算间隔而不是打印它们,那么整个算法就变成O(N)了。
If negative numbers are allowed, then O(N^2)
is hard to beat (might be impossible).
如果允许负数,那么O(N ^ 2)很难击败(几乎不可能)。
#6
0
yes in my opinion it can be in O(n)
struct subsequence
{
int first,last,sum;
}s;
function(array,low,high)
{
int till_max=0;
s.first=0;s.last=0;s.sum=0;
for(i=low;i<high;i++)
{
if(till_max+array[i]>array[i])
{
s.first=s.first;
s.last=i;
till_max+=array[i];
}
else
{
s.first=i;
s.last=i;
till_max=array[i];
}
if(till_max in range)
{
s.sum=till_max;
printf("print values between first=%d and last=%d and sum=%d",s.first,s.last,s.sum);
}
}
}
#1
15
O(n) time solution:
O(n)时间的解决方案:
You can extend the 'two pointer' idea for the 'exact' version of the problem. We will maintain variables a
and b
such that all intervals on the form xs[i,a), xs[i,a+1), ..., xs[i,b-1)
have a sum in the sought after range [lo, hi]
.
您可以将“两个指针”的概念扩展到问题的“确切”版本。我们将保持变量a和b,这样所有的区间都在xs (i,a), xs[i,a+1)中,…,xs[i,b-1)在寻找范围内有一个和[lo, hi]。
a, b = 0, 0
for i in range(n):
while a != (n+1) and sum(xs[i:a]) < lo:
a += 1
while b != (n+1) and sum(xs[i:b]) <= hi:
b += 1
for j in range(a, b):
print(xs[i:j])
This is actually O(n^2)
because of the sum
, but we can easily fix that by first calculating the prefix sums ps
such that ps[i] = sum(xs[:i])
. Then sum(xs[i:j])
is simply ps[j]-ps[i]
.
这实际上是O(n 2)因为它的和,但是我们可以很容易地解决这个问题,首先计算前缀和ps (i) = sum(xs[:i])。然后求和(xs[i:j])就是ps[j]-ps[i]。
Here is an example of running the above code on [2, 5, 1, 1, 2, 2, 3, 4, 8, 2]
with [lo, hi] = [3, 6]
:
这里有一个运行上述代码的例子,在[2、5、1、1、2、2、3、4、8、2]和[lo, hi] = [3, 6]:
[5]
[5, 1]
[1, 1, 2]
[1, 1, 2, 2]
[1, 2]
[1, 2, 2]
[2, 2]
[2, 3]
[3]
[4]
This runs in time O(n + t)
, where t
is the size of the output. As some have noticed, the output can be as large as t = n^2
, namely if all contiguous subsequences are matched.
这是时间O(n + t), t是输出的大小。正如一些人注意到,输出可以一样大t = n ^ 2,即如果所有连续的子序列匹配。
If we allow writing the output in a compressed format (output pairs a,b
of which all subsequences are contiguous) we can get a pure O(n)
time algorithm.
如果我们允许以压缩格式(输出对a,其中所有子序列都是连续的)来编写输出,我们就可以得到一个纯O(n)时间算法。
#2
8
Starting from this problem: find all contiguous sub-sequences that sum to x. What we need is something similar.
从这个问题开始:找到所有与x相加的连续子序列,我们需要的是类似的东西。
For every index i, we can calculate the sum of the segment from 0 to i, which is x. So, the problem now is we need to find from 0 to i - 1, how many segments have sum from (x - low) to (x - high), and it should be faster than O(n). So there are several data structures help you to do that in O(logn), which are Fenwick tree and Interval tree.
我的每个索引,我们可以计算的总和段从0到我,这是x。所以,现在的问题是我们需要找到我从0到- 1,有多少段从(x -低)到(x -高),它应该比O(n)。有几个数据结构可以帮助你在O(logn)中实现,这是Fenwick树和区间树。
So what we need to do is:
所以我们需要做的是:
-
Iterating through all index from 0 to n (n is the size of the array).
遍历所有索引从0到n (n是数组的大小)。
-
At index ith, calculate, starting from 0 to ith index, the sum x, query the tree to get the total occurrences of numbers fall in the range (x - high, x - low).
在索引ith中,计算,从0到第i个索引,和x,查询树,以得到在范围(x - high, x - low)的总次数。
-
Add x to the tree.
把x加到树里。
So the time complexity will be O(n log n)
所以时间复杂度是O(n log n)
#3
5
You should use a simple dynamic programming and binary search. To find the count:
您应该使用简单的动态编程和二分查找。找到数:
from bisect import bisect_left, bisect_right
def solve(A, start, end):
"""
O(n lg n) Binary Search
Bound:
f[i] - f[j] = start
f[i] - f[j'] = end
start < end
f[j] > f[j']
:param A: an integer array
:param start: lower bound
:param end: upper bound
:return:
"""
n = len(A)
cnt = 0
f = [0 for _ in xrange(n+1)]
for i in xrange(1, n+1):
f[i] = f[i-1]+A[i-1] # sum from left
f.sort()
for i in xrange(n+1):
lo = bisect_left(f, f[i]-end, 0, i)
hi = bisect_right(f, f[i]-start, 0, i)
cnt += hi-lo
return cnt
https://github.com/algorhythms/LintCode/blob/master/Subarray%20Sum%20II.py
https://github.com/algorhythms/LintCode/blob/master/Subarray%20Sum%20II.py
To find the results rather the count, you just need another hash table to store the mapping from original (not sorted) f[i] -> list of indexes.
为了查找结果,您只需要另一个哈希表来存储来自原始(未排序)f[i] ->索引列表的映射。
Cheers.
欢呼。
#4
0
Here is way you can get O(nlogn) if there are only positive numbers :-
如果只有正数,你可以得到O(nlogn):-。
1. Evaluate cumulative sum of array
2. for i find total sum[j] in (sum[i]+low,sum[i]+high) using binary search
3. Total = Total + count
4. do 3 to 5 for all i
Time complexity:-
时间复杂度:-
Cumulative sum is O(N)
Finding sums in range is O(logN) using binary search
Total Time complexity is O(NlogN)
#5
0
If all integers are non-negative, then it can be done in O(max(size-of-input,size-of-output))
time. This is optimal.
如果所有的整数都是非负的,那么它可以在O(最大值,输入,大小,输出)时间内完成。这是最优的。
Here's the algorithm in C.
这是C中的算法。
void interview_question (int* a, int N, int lo, int hi)
{
int sum_bottom_low = 0, sum_bottom_high = 0,
bottom_low = 0, bottom_high = 0,
top = 0;
int i;
if (lo == 0) printf ("[0 0) ");
while (top < N)
{
sum_bottom_low += a[top];
sum_bottom_high += a[top];
top++;
while (sum_bottom_high >= lo && bottom_high <= top)
{
sum_bottom_high -= a[bottom_high++];
}
while (sum_bottom_low > hi && bottom_low <= bottom_high)
{
sum_bottom_low -= a[bottom_low++];
}
// print output
for (i = bottom_low; i < bottom_high; ++i)
printf ("[%d %d) ", i, top);
}
printf("\n");
}
Except for the last loop marked "print output", each operation is executed O(N) times; the last loop is executed once for each interval printed. If we only need to count the intervals and not print them, the entire algorithm becomes O(N)
.
除最后一个标记为“打印输出”的循环外,每个操作执行O(N)次;最后一个循环在每次打印时执行一次。如果我们只需要计算间隔而不是打印它们,那么整个算法就变成O(N)了。
If negative numbers are allowed, then O(N^2)
is hard to beat (might be impossible).
如果允许负数,那么O(N ^ 2)很难击败(几乎不可能)。
#6
0
yes in my opinion it can be in O(n)
struct subsequence
{
int first,last,sum;
}s;
function(array,low,high)
{
int till_max=0;
s.first=0;s.last=0;s.sum=0;
for(i=low;i<high;i++)
{
if(till_max+array[i]>array[i])
{
s.first=s.first;
s.last=i;
till_max+=array[i];
}
else
{
s.first=i;
s.last=i;
till_max=array[i];
}
if(till_max in range)
{
s.sum=till_max;
printf("print values between first=%d and last=%d and sum=%d",s.first,s.last,s.sum);
}
}
}