This question already has an answer here:
这个问题已经有了答案:
- How do I search for a number in a 2d array sorted left to right and top to bottom? 19 answers
- 我如何在二维数组中搜索一个数字,从左到右,从上到下?19日答案
Say I have a matrix (MxN
) which has its rows and columns sorted.
假设我有一个矩阵(MxN),它的行和列都是有序的。
- All the elements in each row are arranged in increasing order
- 每一行的所有元素都按递增顺序排列。
- All the elements in each column are arranged in increasing order
- 每一列的所有元素都按递增顺序排列。
- All the elements are integers
- 所有的元素都是整数。
-
No other assumptions can be made
没有其他的假设。
Example:
例子:
[1 5 8 20]
[1 5 8 20]
[2 9 19 21]
(2 9 19 21)
[12 15 25 30]
(12 15 25 30)
I have to find if a given number is present in the matrix or not (Basic search). I have an algorithm which runs O(n)
我必须找到一个给定的数字是否存在于矩阵中(基本搜索)。我有一个运行O(n)的算法
int row = 0;
int col = N-1;
while (row < M && col >= 0) {
if (mat[row][col] == elem) {
return true;
} else if (mat[row][col] > elem) {
col--;
} else {
row++;
}
}
But I was asked an O(log (MxN)) == O(Log(n))
solution. Any ideas??
但是我被问了一个O(log (MxN)) == O(log (n))解。有什么想法?
5 个解决方案
#1
72
O(log (M * N)) solution is not possible for this task.
O(log (M * N))解决方案是不可能的。
Let's look at a simplified task: in "sorted" square matrix assume all elements above secondary diagonal (green) less than given number, all elements below secondary diagonal (red) greater than given number, and no additional assumptions for elements on secondary diagonal (yellow).
让我们来看看一个简单的任务:在“有序”的方阵中,假设所有的元素都在次级对角线以上(绿色)小于给定的数,所有元素都在二级对角线以下(红色)大于给定的数字,而对于次要对角线(黄色)的元素没有额外的假设。
Neither original assumptions of this task, nor these additional assumptions tell us how elements on secondary diagonal are related to each other. Which means we just have an unsorted array of N integers. We cannot find given number in the unsorted array faster than O(N). So for original (more complicated) problem with square matrix we cannot get a solution better than O(N).
这个任务的原始假设,以及这些额外的假设都没有告诉我们次级对角线上的元素是如何相互联系的。这意味着我们有一个未排序的N个整数数组。在未排序的数组中,我们不能比O(N)更快地找到给定的数字。所以对于原(更复杂)的方阵问题,我们不能得到比O(N)更好的解。
For a rectangular matrix, stretch the square picture and set the additional assumptions accordingly. Here we have min(N,M) sorted sub-arrays of size max(N,M)/min(N,M) each. The best way to search here is to use linear search to find one or several sub-arrays that may contain given value, then to use binary search inside these sub-arrays. In the worst case it is necessary to binary-search in each sub-array. Complexity is O(min(N,M) * (1 + log(max(N,M) / min(N,M)))). So for original (more complicated) problem with rectangular matrix we cannot get a solution better than O(min(N,M) * ( 1 + log(max(N,M)) - log(min(N,M)))).
对于一个矩形矩阵,拉伸正方形图像并相应地设置额外的假设。在这里,我们有min(N,M)排序的子数组,大小为max(N,M)/min(N,M)。在这里搜索的最好方法是使用线性搜索找到一个或几个可能包含给定值的子数组,然后在这些子数组中使用二进制搜索。在最坏的情况下,需要在每个子数组中进行二进制搜索。复杂度是O(min(N,M)*(1 +日志(max(N,M)/分钟(N,M))))。因此,对于原(更复杂)的矩形矩阵问题,我们不能得到比O(min(N,M) * (1 + log(max(N,M)) - log(min(N,M))的解。
#2
6
It is not possible to do better than O(n). Some guys (there are at least three of them on this page) think they can do better but that's because their algorithms are wrong or because they don't know how to compute the complexity of their algorithm so they try to guess it. This blog post is very good and will explain you the errors of these guys.
要比O(n)做得更好是不可能的。有些人(至少有三个人在这个页面上)认为他们可以做得更好,但那是因为他们的算法是错误的,或者他们不知道如何计算算法的复杂性,所以他们试着去猜测。这篇博文很好,可以解释这些人的错误。
Draft of a proof that O(n) is optimal: consider the following matrix:
证明O(n)是最优的证明:考虑以下矩阵:
1 2 3 4 5 6 … (n-2) (n-1) (n+1)
2 3 4 5 6 7 … (n-1) (n+1) (n+2)
3 4 5 6 7 8 … (n+1) (n+2) (n+3)
… … … … … … … … … …
(n-2) (n-1) … … … … … … … (2n-1)
(n-1) (n+1) … … … … … … … 2n
(n+1) (n+2) … … … … … (2n-1) 2n (2n+1)
If you are looking for n
in this matrix you must check at least once for each row if n
is in the row because n
could be in any row. (The proof is not complete but here is the idea)
如果你在这个矩阵中寻找n,你必须至少检查一次,如果n在行中,因为n可以在任何一行。(证据不完整,但这里有一个想法)
#3
4
You have to use recursion to solve this problem. Given a matrix X and number y, you can do binary search for y on the middle row of X and divide the matrix into four parts such that:
你必须使用递归来解决这个问题。给定一个矩阵X和y,你可以在中间的X行上对y进行二分查找,然后将矩阵分成四部分:
A|B
---
C|D
all elements in A are less than y, all elements in D are greater than y, and y can be in B and C. Iteratively find y in B and C.
A中的所有元素都小于y, D中的所有元素都大于y, y可以在B和C中,在B和C中迭代地找到y。
Since height(A)=height(B)\approx= height(C)=height(D), size(X)>= 2*(size(B)+size(C)) . So the resulting complexity if O(logn).
由于高度(A)=height(B)\approx= height(C)=height(D), size(X)>= 2*(size(B)+size(C))。所以如果O(logn)就会产生复杂度。
def find(X,y):
a,b = X.shape
i = a /2
j = binsearch(X[i,:], y)
if X[i,j]==y:
return True
else:
return find( X[ (i+1):a, 0:(j-1)], y ) or find( X[ 0:i, j:b], y )
#4
2
Since both rows and columns are sorted, if we look at the first element of each row we can find which one contains the number we're looking for. Then, again, we can exploit the fact that the elements in each row are sorted and find that number.
The fastest search algorithm I know is Binary Search, which has a complexity of O(log n), so the total complexity will be O(log m + log n).
Here's an example, suppose we're looking for 28:
因为行和列都是排序的,如果我们查看每一行的第一个元素,我们就可以找到哪个包含我们要查找的数字。然后,再一次,我们可以利用这一事实,每一行中的元素都被排序并找到那个数字。我所知道的最快的搜索算法是二分搜索,它的复杂度是O(log n),所以总复杂度是O(log m + log n)。
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
- We do a binary search over the elements of the first column (1, 11, 21, 31, 41) and find that the row is the third, because its first element is smaller than our number but the next row's first element is larger. Number of steps: 2 (21, 31, found)
- 我们对第一列的元素进行二分查找(1,11,21,31,41),发现这一行是第三列,因为它的第一个元素比我们的数字小,但是下一行的第一个元素是更大的。步骤数:2(21,31,发现)
- We do a binary search again over the third row (21, 22, 23, 24, 25, 26, 27, 28, 29, 30) and find our number. Number of steps: 2 - 3 (25, 27 or 28, found)
- 我们在第三行(21、22、23、24、25、26、27、28、29、30)再次进行二分查找,找到我们的号码。步骤数:2 - 3(25,27或28,发现)
#5
0
I think this can be done in O(log(n*n)*log(n)) time, where n is the no. of the rows of a square matrix.
我认为这可以在O(log(n*n)*log(n))时间内完成,其中n是no。一个方阵的行。
By the properties of Matrix, the principal diagonal of the matrix is a sorted array. So, we can search an element or its lower bound in O(log(n)). Now, using this element as pivot, we have 4 sub-matrix. and we can say that all the elements in sub-matrix(top-left) are smaller, all the elements in sub-matrix (lower-right) are bigger. So, we can remove that from the search space.
根据矩阵的性质,矩阵的主对角线是一个有序数组。因此,我们可以在O(log(n))中搜索一个元素或它的下界。现在,用这个元素作为主元,我们有4个子矩阵。我们可以说,子矩阵(左上角)的所有元素都更小,子矩阵(右下)的所有元素都更大。因此,我们可以从搜索空间中删除它。
Now, recursively search in sub-matrix (top-right) and in sub-matrix(lower-left).
现在,递归搜索子矩阵(右上)和子矩阵(左下)。
Since, at each step, we perform a log(n) search (along principal diagonal) ad there can be atmost log(n*n) steps (as we reduce the search space by half in each step).
因为,在每个步骤中,我们执行一个log(n)搜索(沿着主对角线),可以有最多的log(n*n)步骤(因为我们在每一步中减少了一半的搜索空间)。
So, Time complexity = O(log(n)log(nn)).
因此,时间复杂度= O(log(n)log(nn))
Please correct, if anything is wrong.
如果有问题,请纠正。
Refrences - [Book]Cracking the coding interview (Question 11.6)
(书)破解编码面试(问题11.6)
#1
72
O(log (M * N)) solution is not possible for this task.
O(log (M * N))解决方案是不可能的。
Let's look at a simplified task: in "sorted" square matrix assume all elements above secondary diagonal (green) less than given number, all elements below secondary diagonal (red) greater than given number, and no additional assumptions for elements on secondary diagonal (yellow).
让我们来看看一个简单的任务:在“有序”的方阵中,假设所有的元素都在次级对角线以上(绿色)小于给定的数,所有元素都在二级对角线以下(红色)大于给定的数字,而对于次要对角线(黄色)的元素没有额外的假设。
Neither original assumptions of this task, nor these additional assumptions tell us how elements on secondary diagonal are related to each other. Which means we just have an unsorted array of N integers. We cannot find given number in the unsorted array faster than O(N). So for original (more complicated) problem with square matrix we cannot get a solution better than O(N).
这个任务的原始假设,以及这些额外的假设都没有告诉我们次级对角线上的元素是如何相互联系的。这意味着我们有一个未排序的N个整数数组。在未排序的数组中,我们不能比O(N)更快地找到给定的数字。所以对于原(更复杂)的方阵问题,我们不能得到比O(N)更好的解。
For a rectangular matrix, stretch the square picture and set the additional assumptions accordingly. Here we have min(N,M) sorted sub-arrays of size max(N,M)/min(N,M) each. The best way to search here is to use linear search to find one or several sub-arrays that may contain given value, then to use binary search inside these sub-arrays. In the worst case it is necessary to binary-search in each sub-array. Complexity is O(min(N,M) * (1 + log(max(N,M) / min(N,M)))). So for original (more complicated) problem with rectangular matrix we cannot get a solution better than O(min(N,M) * ( 1 + log(max(N,M)) - log(min(N,M)))).
对于一个矩形矩阵,拉伸正方形图像并相应地设置额外的假设。在这里,我们有min(N,M)排序的子数组,大小为max(N,M)/min(N,M)。在这里搜索的最好方法是使用线性搜索找到一个或几个可能包含给定值的子数组,然后在这些子数组中使用二进制搜索。在最坏的情况下,需要在每个子数组中进行二进制搜索。复杂度是O(min(N,M)*(1 +日志(max(N,M)/分钟(N,M))))。因此,对于原(更复杂)的矩形矩阵问题,我们不能得到比O(min(N,M) * (1 + log(max(N,M)) - log(min(N,M))的解。
#2
6
It is not possible to do better than O(n). Some guys (there are at least three of them on this page) think they can do better but that's because their algorithms are wrong or because they don't know how to compute the complexity of their algorithm so they try to guess it. This blog post is very good and will explain you the errors of these guys.
要比O(n)做得更好是不可能的。有些人(至少有三个人在这个页面上)认为他们可以做得更好,但那是因为他们的算法是错误的,或者他们不知道如何计算算法的复杂性,所以他们试着去猜测。这篇博文很好,可以解释这些人的错误。
Draft of a proof that O(n) is optimal: consider the following matrix:
证明O(n)是最优的证明:考虑以下矩阵:
1 2 3 4 5 6 … (n-2) (n-1) (n+1)
2 3 4 5 6 7 … (n-1) (n+1) (n+2)
3 4 5 6 7 8 … (n+1) (n+2) (n+3)
… … … … … … … … … …
(n-2) (n-1) … … … … … … … (2n-1)
(n-1) (n+1) … … … … … … … 2n
(n+1) (n+2) … … … … … (2n-1) 2n (2n+1)
If you are looking for n
in this matrix you must check at least once for each row if n
is in the row because n
could be in any row. (The proof is not complete but here is the idea)
如果你在这个矩阵中寻找n,你必须至少检查一次,如果n在行中,因为n可以在任何一行。(证据不完整,但这里有一个想法)
#3
4
You have to use recursion to solve this problem. Given a matrix X and number y, you can do binary search for y on the middle row of X and divide the matrix into four parts such that:
你必须使用递归来解决这个问题。给定一个矩阵X和y,你可以在中间的X行上对y进行二分查找,然后将矩阵分成四部分:
A|B
---
C|D
all elements in A are less than y, all elements in D are greater than y, and y can be in B and C. Iteratively find y in B and C.
A中的所有元素都小于y, D中的所有元素都大于y, y可以在B和C中,在B和C中迭代地找到y。
Since height(A)=height(B)\approx= height(C)=height(D), size(X)>= 2*(size(B)+size(C)) . So the resulting complexity if O(logn).
由于高度(A)=height(B)\approx= height(C)=height(D), size(X)>= 2*(size(B)+size(C))。所以如果O(logn)就会产生复杂度。
def find(X,y):
a,b = X.shape
i = a /2
j = binsearch(X[i,:], y)
if X[i,j]==y:
return True
else:
return find( X[ (i+1):a, 0:(j-1)], y ) or find( X[ 0:i, j:b], y )
#4
2
Since both rows and columns are sorted, if we look at the first element of each row we can find which one contains the number we're looking for. Then, again, we can exploit the fact that the elements in each row are sorted and find that number.
The fastest search algorithm I know is Binary Search, which has a complexity of O(log n), so the total complexity will be O(log m + log n).
Here's an example, suppose we're looking for 28:
因为行和列都是排序的,如果我们查看每一行的第一个元素,我们就可以找到哪个包含我们要查找的数字。然后,再一次,我们可以利用这一事实,每一行中的元素都被排序并找到那个数字。我所知道的最快的搜索算法是二分搜索,它的复杂度是O(log n),所以总复杂度是O(log m + log n)。
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
- We do a binary search over the elements of the first column (1, 11, 21, 31, 41) and find that the row is the third, because its first element is smaller than our number but the next row's first element is larger. Number of steps: 2 (21, 31, found)
- 我们对第一列的元素进行二分查找(1,11,21,31,41),发现这一行是第三列,因为它的第一个元素比我们的数字小,但是下一行的第一个元素是更大的。步骤数:2(21,31,发现)
- We do a binary search again over the third row (21, 22, 23, 24, 25, 26, 27, 28, 29, 30) and find our number. Number of steps: 2 - 3 (25, 27 or 28, found)
- 我们在第三行(21、22、23、24、25、26、27、28、29、30)再次进行二分查找,找到我们的号码。步骤数:2 - 3(25,27或28,发现)
#5
0
I think this can be done in O(log(n*n)*log(n)) time, where n is the no. of the rows of a square matrix.
我认为这可以在O(log(n*n)*log(n))时间内完成,其中n是no。一个方阵的行。
By the properties of Matrix, the principal diagonal of the matrix is a sorted array. So, we can search an element or its lower bound in O(log(n)). Now, using this element as pivot, we have 4 sub-matrix. and we can say that all the elements in sub-matrix(top-left) are smaller, all the elements in sub-matrix (lower-right) are bigger. So, we can remove that from the search space.
根据矩阵的性质,矩阵的主对角线是一个有序数组。因此,我们可以在O(log(n))中搜索一个元素或它的下界。现在,用这个元素作为主元,我们有4个子矩阵。我们可以说,子矩阵(左上角)的所有元素都更小,子矩阵(右下)的所有元素都更大。因此,我们可以从搜索空间中删除它。
Now, recursively search in sub-matrix (top-right) and in sub-matrix(lower-left).
现在,递归搜索子矩阵(右上)和子矩阵(左下)。
Since, at each step, we perform a log(n) search (along principal diagonal) ad there can be atmost log(n*n) steps (as we reduce the search space by half in each step).
因为,在每个步骤中,我们执行一个log(n)搜索(沿着主对角线),可以有最多的log(n*n)步骤(因为我们在每一步中减少了一半的搜索空间)。
So, Time complexity = O(log(n)log(nn)).
因此,时间复杂度= O(log(n)log(nn))
Please correct, if anything is wrong.
如果有问题,请纠正。
Refrences - [Book]Cracking the coding interview (Question 11.6)
(书)破解编码面试(问题11.6)