Getting (column, row) indexes from a NumPy array that meets a boolean condition

Time: 2021-11-03 21:21:56

I am working with a 2D NumPy array. I would like to get the (column, row) indexes, or (x, y) coordinates if you prefer thinking that way, of the elements in my 2D array that meet a boolean condition.

The best way I can explain what I am trying to do is via a trivial example:

>>> import numpy as np
>>> a = np.arange(9).reshape(3, 3)
>>> b = a > 4
>>> b
array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True]], dtype=bool)

At this point I have a boolean array indicating where a > 4.

My goal now is to grab the indexes of the boolean array where the value is True. For example, the indexes (1, 2), (2, 0), (2, 1), and (2, 2) all have a value of True.

My end goal is a list of indexes:

>>> indexes = [(1, 2), (2, 0), (2, 1), (2, 2)]

Again, the code above is only a trivial example. In my real application the indexes where a > 4 could be arbitrary, not something derived from arange and reshape.

2 solutions

#1


15  

Use numpy.where with numpy.column_stack:

>>> np.column_stack(np.where(b))
array([[1, 2],
       [2, 0],
       [2, 1],
       [2, 2]])
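
Since the question asks for a list of index tuples rather than a 2D array, the stacked result can be converted with tolist(). This is only a minimal sketch reusing a and b from the question; indexes is the name used in the question, while pairs is an illustrative name introduced here:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = a > 4

# Stack the per-axis index arrays into an (n, 2) array of (row, column) pairs
pairs = np.column_stack(np.where(b))

# tolist() yields plain Python ints; tuple() turns each pair into a tuple
indexes = [tuple(pair) for pair in pairs.tolist()]
print(indexes)  # [(1, 2), (2, 0), (2, 1), (2, 2)]
```

Going through tolist() first keeps the tuple elements as ordinary Python ints instead of NumPy scalar types.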

#2


5  

An alternative to @Ashwini Chaudhary's answer is numpy.nonzero:

>>> a = np.arange(9).reshape(3,3)
>>> b = a > 4
>>> np.nonzero(b)
(array([1, 2, 2, 2]), array([2, 0, 1, 2]))

>>> np.transpose(np.nonzero(b))
array([[1, 2],
       [2, 0],
       [2, 1],
       [2, 2]])
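
NumPy also provides numpy.argwhere, which its documentation describes as almost the same as np.transpose(np.nonzero(a)). Note that all of these calls return (row, column) pairs. If (column, row) order, i.e. (x, y), is wanted as in the question title, the last axis can be reversed. A sketch under those assumptions (rc and xy are illustrative names, not from the answers):

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
b = a > 4

# One (row, column) pair per True element, in row-major order
rc = np.argwhere(b)

# Reverse the last axis to get (column, row), i.e. (x, y), pairs
xy = rc[:, ::-1]
print(xy.tolist())  # [[2, 1], [0, 2], [1, 2], [2, 2]]
```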

EDIT: Which is faster? nonzero and where are essentially equivalent, but transpose turns out to be the wrong choice here performance-wise (even though it's mentioned in the docs):

In [15]: N = 5000

In [16]: a = np.random.random((N, N))

In [17]: %timeit np.nonzero(a > 0.5)
1 loops, best of 3: 470 ms per loop

In [18]: %timeit np.transpose(np.nonzero(a > 0.5))     # ooops
1 loops, best of 3: 2.56 s per loop

In [19]: %timeit np.where(a > 0.5)
1 loops, best of 3: 467 ms per loop

In [20]: %timeit np.column_stack(np.where(a > 0.5))
1 loops, best of 3: 653 ms per loop
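
The timings above come from one machine and an older NumPy/IPython, so the gap may look different elsewhere. To re-check it outside IPython, something like the following sketch with the standard timeit module could be used. The smaller N, the seeded generator, and the names candidates, same, and timings are assumptions made here for illustration:

```python
import timeit

import numpy as np

N = 1000  # smaller than the N = 5000 above so the script finishes quickly
rng = np.random.default_rng(0)
a = rng.random((N, N))

# The two ways of building an (n, 2) index array shown in the answers
candidates = {
    "transpose(nonzero)": lambda: np.transpose(np.nonzero(a > 0.5)),
    "column_stack(where)": lambda: np.column_stack(np.where(a > 0.5)),
}

# Sanity check: both approaches should produce the same index pairs
results = [func() for func in candidates.values()]
same = np.array_equal(results[0], results[1])
print("identical results:", same)

# Best of 3 single runs per candidate, mirroring %timeit's "best of 3"
timings = {
    name: min(timeit.repeat(func, number=1, repeat=3))
    for name, func in candidates.items()
}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.1f} ms")
```

No expected timings are given because they depend on the hardware and the installed NumPy version.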
