NumPy array: how to select indices satisfying multiple conditions?

Suppose I have two NumPy arrays, x = [5, 2, 3, 1, 4, 5] and y = ['f', 'o', 'o', 'b', 'a', 'r']. I want to select the elements in y corresponding to the elements in x that are greater than 1 and less than 5.

I tried

x = array([5, 2, 3, 1, 4, 5])
y = array(['f','o','o','b','a','r'])
output = y[x > 1 & x < 5] # desired output is ['o','o','a']

but this doesn't work. How would I do this?

5 Solutions

#1

138

Your expression works if you add parentheses:

>>> y[(1 < x) & (x < 5)]
array(['o', 'o', 'a'], 
      dtype='|S1')
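
For anyone wondering why the parentheses matter, here is a minimal sketch (the variable names `failed`, `mask` and `selected` are only for illustration). In Python, `&` binds more tightly than the comparison operators, so the unparenthesized version is parsed as a chained comparison and NumPy refuses to reduce an array to a single truth value:

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

# Without parentheses, `x > 1 & x < 5` is parsed as `x > (1 & x) < 5`,
# a chained comparison that needs the truth value of a whole array.
try:
    y[x > 1 & x < 5]
    failed = False
except ValueError:
    failed = True

# With parentheses, each comparison produces a boolean mask first,
# and & then combines the two masks element by element.
mask = (x > 1) & (x < 5)
selected = y[mask]

print(failed)             # True
print(selected.tolist())  # ['o', 'o', 'a']
```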

#2

IMO the OP does not really want np.bitwise_and() (aka &) but np.logical_and(), because they are comparing logical values such as True and False - see this SO post on logical vs. bitwise to see the difference.

>>> x = array([5, 2, 3, 1, 4, 5])
>>> y = array(['f','o','o','b','a','r'])
>>> output = y[np.logical_and(x > 1, x < 5)] # desired output is ['o','o','a']
>>> output
array(['o', 'o', 'a'],
      dtype='|S1')
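
To make the logical vs. bitwise distinction concrete, here is a small sketch with made-up integer arrays `a` and `b` (they are not from the question). On boolean masks the two functions agree, but on plain integers they can give different answers:

```python
import numpy as np

# Hypothetical integer inputs, chosen only to show the difference.
a = np.array([1, 2, 3])
b = np.array([2, 2, 4])

bitwise = np.bitwise_and(a, b)  # combines the bits of each pair of integers
logical = np.logical_and(a, b)  # treats every nonzero value as True

print(bitwise.tolist())  # [0, 2, 0]
print(logical.tolist())  # [True, True, True]

# On boolean masks, as in the question, both give the same result.
m1, m2 = a > 1, b < 4
same_on_masks = np.array_equal(m1 & m2, np.logical_and(m1, m2))
print(same_on_masks)     # True
```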

An equivalent way to do this is with np.all(), setting the axis argument appropriately.

>>> output = y[np.all([x > 1, x < 5], axis=0)] # desired output is ['o','o','a']
>>> output
array(['o', 'o', 'a'],
      dtype='|S1')

By the numbers:

>>> %timeit (a < b) & (b < c)
The slowest run took 32.97 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.15 µs per loop

>>> %timeit np.logical_and(a < b, b < c)
The slowest run took 32.59 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.17 µs per loop

>>> %timeit np.all([a < b, b < c], 0)
The slowest run took 67.47 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 5.06 µs per loop

So np.all() is slower, while & and np.logical_and() are about the same.
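
The timings above do not show how a, b and c were defined, so here is a rough sketch for reproducing the comparison outside IPython. The random 1000-element arrays are my assumption, and the numbers will differ by machine and array size:

```python
import timeit
import numpy as np

# Hypothetical inputs: the original timings do not show a, b and c.
rng = np.random.default_rng(0)
a, b, c = rng.random((3, 1000))

candidates = {
    "&": lambda: (a < b) & (b < c),
    "logical_and": lambda: np.logical_and(a < b, b < c),
    "all": lambda: np.all([a < b, b < c], axis=0),
}

reference = candidates["&"]()
for name, fn in candidates.items():
    # All three approaches should produce the same boolean mask.
    assert np.array_equal(fn(), reference)
    seconds = timeit.timeit(fn, number=1000)
    print(f"{name}: {seconds / 1000 * 1e6:.2f} µs per call")
```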

#3

To add one detail to @J.F. Sebastian's and @Mark Mikofski's answers: if you want the corresponding indices (rather than the actual values of the array), the following code will do:

For satisfying multiple (all) conditions:

select_indices = np.where( np.logical_and( x > 1, x < 5) ) #   1 < x <5

For satisfying multiple (or) conditions:

select_indices = np.where( np.logical_or( x < 1, x > 5 ) ) # x <1 or x >5
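
A quick sketch of what those calls return, reusing the x and y from the question. Note that np.where with a single argument returns a tuple with one index array per dimension, so for a 1-D array the indices are in element 0. np.flatnonzero is shown as an alternative I would consider, not something the answers above used:

```python
import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

select_indices = np.where(np.logical_and(x > 1, x < 5))
idx = select_indices[0]  # unpack the 1-tuple to get the index array

print(idx.tolist())      # [1, 2, 4]
print(y[idx].tolist())   # ['o', 'o', 'a']

# np.flatnonzero returns the index array directly, without the tuple.
flat = np.flatnonzero((x > 1) & (x < 5))
print(flat.tolist())     # [1, 2, 4]
```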

#4

Actually I would do it this way:

L1 is the index list of elements satisfying condition 1 (for a NumPy array, np.where(condition1)[0] should give you L1).

Similarly, you get L2, the index list of elements satisfying condition 2.

Then you find the intersection of L1 and L2, for example with np.intersect1d(L1, L2).

You can also intersect more than two lists if you have more conditions to satisfy.

Then you can apply the resulting indices to any other array, for example x.
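
The steps above could look something like this. It is only a sketch: I am assuming np.intersect1d as the intersection function, and the third condition (even values, giving the index list L3) is made up to show the multi-list case:

```python
from functools import reduce

import numpy as np

x = np.array([5, 2, 3, 1, 4, 5])
y = np.array(['f', 'o', 'o', 'b', 'a', 'r'])

L1 = np.where(x > 1)[0]  # indices satisfying condition 1
L2 = np.where(x < 5)[0]  # indices satisfying condition 2

common = np.intersect1d(L1, L2)
print(common.tolist())     # [1, 2, 4]
print(y[common].tolist())  # ['o', 'o', 'a']

# More than two conditions: fold the intersection over all index lists.
L3 = np.where(x % 2 == 0)[0]  # hypothetical extra condition
common_all = reduce(np.intersect1d, [L1, L2, L3])
print(common_all.tolist())  # [1, 4]
```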

#5

I like to use np.vectorize for such tasks. Consider the following:

>>> import numpy as np
>>> # Arrays
>>> x = np.array([5, 2, 3, 1, 4, 5])
>>> y = np.array(['f','o','o','b','a','r'])

>>> # Function containing the constraints
>>> func = np.vectorize(lambda t: t>1 and t<5)

>>> # Call function on x
>>> y[func(x)]
array(['o', 'o', 'a'], dtype='<U1')

The advantage is you can add many more types of constraints in the vectorized function.

Hope it helps.
