I am working with a 2D NumPy array. I would like to get the (column, row) index, or (x, y) coordinate if you prefer thinking of it that way, of every element in my 2D array that meets a boolean condition.
The best way I can explain what I am trying to do is with a trivial example:
>>> a = np.arange(9).reshape(3, 3)
>>> b = a > 4
>>> b
array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True]], dtype=bool)
At this point I have a boolean array indicating where a > 4.
My goal now is to grab the indexes of the boolean array where the value is True. For example, the indexes (1, 2), (2, 0), (2, 1), and (2, 2) all have a value of True.
My end goal is a list of indexes:
>>> indexes = [(1, 2), (2, 0), (2, 1), (2, 2)]
Again, I stress that the code above is a trivial example. In my real application, the positions where a > 4 could be arbitrary and would not come from something based on arange and reshape.
2 answers
#1
15
Use numpy.where with numpy.column_stack:
>>> np.column_stack(np.where(b))
array([[1, 2],
       [2, 0],
       [2, 1],
       [2, 2]])
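If a plain Python list of tuples is needed, as the question asks, the stacked array can be converted afterwards. The following is only a minimal sketch that reuses the question's `a` and `b`; the names `pairs`, `indexes`, and `xy` are illustrative and not part of the original answer.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = a > 4

# Each row of `pairs` is one (row, column) index where b is True.
pairs = np.column_stack(np.where(b))

# tolist() yields plain Python ints; tuple() turns each pair into a tuple.
indexes = [tuple(p) for p in pairs.tolist()]
print(indexes)  # [(1, 2), (2, 0), (2, 1), (2, 2)]

# For (x, y), i.e. (column, row) order, reverse the columns of `pairs`.
xy = [tuple(p) for p in pairs[:, ::-1].tolist()]
print(xy)  # [(2, 1), (0, 2), (1, 2), (2, 2)]
```

Note that NumPy reports indexes in (row, column) order, so the column reversal is only needed if (x, y) order is really what you want.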
#2
5
An alternative to @Ashwini Chaudhary's answer is numpy.nonzero:
>>> a = np.arange(9).reshape(3,3)
>>> b = a > 4
>>> np.nonzero(b)
(array([1, 2, 2, 2]), array([2, 0, 1, 2]))
>>> np.transpose(np.nonzero(b))
array([[1, 2],
       [2, 0],
       [2, 1],
       [2, 2]])
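The untransposed tuple returned by nonzero is useful in its own right, because it holds one index array per dimension. As a rough sketch (the names `rows`, `cols`, `values`, and `indexes` are illustrative, not from the answer), it can index the original array directly or be paired up into a list:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = a > 4

# nonzero returns one index array per dimension: rows first, then columns.
rows, cols = np.nonzero(b)

# The two arrays can be used together to pull out the matching values.
values = a[rows, cols]
print(values.tolist())  # [5, 6, 7, 8]

# Zipping them gives the same (row, column) pairs as the transpose above.
indexes = list(zip(rows.tolist(), cols.tolist()))
print(indexes)  # [(1, 2), (2, 0), (2, 1), (2, 2)]
```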
EDIT: Which is faster? nonzero and where are essentially equivalent, but transpose turns out to be the wrong choice here (even though it's mentioned in the docs); in the timings below it is roughly four times slower than column_stack:
In [15]: N = 5000
In [16]: a = np.random.random((N, N))
In [17]: %timeit np.nonzero(a > 0.5)
1 loops, best of 3: 470 ms per loop
In [18]: %timeit np.transpose(np.nonzero(a > 0.5)) # ooops
1 loops, best of 3: 2.56 s per loop
In [19]: %timeit np.where(a > 0.5)
1 loops, best of 3: 467 ms per loop
In [20]: %timeit np.column_stack(np.where(a > 0.5))
1 loops, best of 3: 653 ms per loop
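The numbers above come from one machine and one NumPy version, so they may not carry over to your setup. A minimal sketch for repeating the comparison outside IPython with the standard timeit module follows; the smaller `N`, the fixed seed, and the helper names are assumptions made so that it runs quickly, not part of the original measurement.

```python
import timeit

import numpy as np

N = 1000  # assumed: smaller than the 5000 used above, to keep the run short
rng = np.random.default_rng(0)  # fixed seed, purely for repeatability
a = rng.random((N, N))
mask = a > 0.5

# Both approaches should produce the same (k, 2) array of (row, column) pairs.
stacked = np.column_stack(np.where(mask))
transposed = np.transpose(np.nonzero(mask))
same = np.array_equal(stacked, transposed)
print("identical results:", same)

candidates = {
    "nonzero": lambda: np.nonzero(mask),
    "transpose(nonzero)": lambda: np.transpose(np.nonzero(mask)),
    "where": lambda: np.where(mask),
    "column_stack(where)": lambda: np.column_stack(np.where(mask)),
}

for label, fn in candidates.items():
    # Best of 3 repeats, 3 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(fn, number=3, repeat=3)) / 3
    print(f"{label}: {best * 1e3:.1f} ms")
```

Unlike the session above, this sketch builds the mask once outside the timed calls, so it measures only the index extraction and not the comparison a > 0.5.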