Find where a numpy array is equal to any value in a list of values

Time: 2021-08-29 13:13:26

I have an array of integers and want to find where it equals any value in a list of several values. This is easy to do by testing each value individually, or by chaining multiple "or" expressions in a loop, but I feel there must be a better/faster way. I'm actually dealing with arrays of size 4000x2000, but here is a simplified version of the problem:

import numpy as np

fake = np.arange(9).reshape((3, 3))
# array([[0, 1, 2],
#        [3, 4, 5],
#        [6, 7, 8]])
want = (fake == 0) | (fake == 2) | (fake == 6) | (fake == 8)
print(want)
# [[ True False  True]
#  [False False False]
#  [ True False  True]]

What I would like is a way to get want from a single command involving fake and the list of values [0,2,6,8]. I could write the function myself, but I assume some package already includes this and would be significantly faster than a function I write with a loop in Python.
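
For reference, the hand-written loop mentioned above might look like the sketch below. The name equals_any is hypothetical and only used for illustration. Note that the loop runs once per value in the list rather than once per array element, so each pass is still a vectorized numpy comparison.

```python
import numpy as np

def equals_any(arr, values):
    """Return a boolean mask that is True where arr equals any entry of values."""
    mask = np.zeros(arr.shape, dtype=bool)
    for v in values:
        mask |= (arr == v)  # one vectorized comparison per value
    return mask

fake = np.arange(9).reshape((3, 3))
want = equals_any(fake, [0, 2, 6, 8])
```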

Thanks, -Adam

2 answers

#1


14  

The function numpy.in1d seems to do what you want. The only problem is that it only works on 1d arrays, so you should use it like this:

In [9]: np.in1d(fake, [0,2,6,8]).reshape(fake.shape)
Out[9]: 
array([[ True, False,  True],
       [False, False, False],
       [ True, False,  True]], dtype=bool)

I have no clue why this is limited to 1d arrays. Looking at its source code, it seems to first flatten the two arrays and then do some clever sorting tricks. But nothing would stop it from unflattening the result again at the end, like I had to do by hand here.
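
For readers on a newer NumPy: as far as I know, version 1.13 added np.isin, which performs the same membership test but keeps the shape of the input, and np.in1d is deprecated as of NumPy 2.0. A minimal sketch, assuming NumPy 1.13 or later (the names values, mask and positions are just for illustration):

```python
import numpy as np

fake = np.arange(9).reshape((3, 3))
values = [0, 2, 6, 8]

# np.isin tests each element of `fake` for membership in `values`
# and returns a boolean array with the same shape as `fake`.
mask = np.isin(fake, values)

# np.argwhere lists the (row, column) index of every True entry.
positions = np.argwhere(mask)
```

The boolean mask matches the want array from the question, and positions gives the actual locations if indices are needed rather than a mask.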

#2


5  

@Bas's answer is the one you're probably looking for. But here's another way to do it, using numpy's vectorize trick:

import numpy as np
S = {0, 2, 6, 8}  # a set, so each membership test is a hash lookup

@np.vectorize
def contained(x):
    return x in S

contained(fake)
# => array([[ True, False,  True],
#           [False, False, False],
#           [ True, False,  True]], dtype=bool)

The downside of this solution is that contained() is called for each element (i.e. in Python space), which makes it much slower than a pure-numpy solution.
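
To get a rough feel for the difference, here is a timing sketch. It is only an illustration under a few assumptions: it uses np.isin (NumPy 1.13 or later) as the pure-numpy variant, the array big is made-up random test data, and it is deliberately smaller than the 4000x2000 case so that the slow variant finishes quickly. Actual timings depend on the machine, so none are quoted here.

```python
import timeit
import numpy as np

S = {0, 2, 6, 8}
contained = np.vectorize(lambda x: x in S)

# Hypothetical test data, smaller than the 4000x2000 array in the question.
big = np.random.default_rng(0).integers(0, 10, size=(400, 200))

# Time three runs of each approach on the same input.
t_vectorize = timeit.timeit(lambda: contained(big), number=3)
t_isin = timeit.timeit(lambda: np.isin(big, list(S)), number=3)
print(f"np.vectorize: {t_vectorize:.4f} s, np.isin: {t_isin:.4f} s")

# Both approaches should produce the same boolean mask.
same = np.array_equal(contained(big), np.isin(big, list(S)))
```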
