NumPy int array: find the indices of multiple target ints

Date: 2021-08-19 21:28:37

I have a large numpy array (dtype=int) and a set of numbers which I'd like to find in that array, e.g.,

import numpy as np
values = np.array([1, 2, 3, 1, 2, 4, 5, 6, 3, 2, 1])
searchvals = [3, 1]
# result = [0, 2, 3, 8, 10]

The result array doesn't have to be sorted.

Speed is an issue, and since both values and searchvals can be large,

for searchval in searchvals:
    np.where(values == searchval)[0]

doesn't cut it.

Any hints?

3 solutions

#1


5  

Is this fast enough?

>>> np.where(np.in1d(values, searchvals))
(array([ 0,  2,  3,  8, 10]),)
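
For reference, here is a self-contained version of the same idea. It is only a sketch and assumes a NumPy version that provides np.isin (available since 1.13 and, as far as I know, the preferred spelling of np.in1d in recent releases). np.flatnonzero is used just to get a plain index array instead of the one-element tuple that np.where returns.

```python
import numpy as np

values = np.array([1, 2, 3, 1, 2, 4, 5, 6, 3, 2, 1])
searchvals = [3, 1]

# Boolean mask: True wherever an element of values occurs in searchvals
mask = np.isin(values, searchvals)

# Indices of the True entries, as a 1-D integer array
result = np.flatnonzero(mask)
print(result)  # [ 0  2  3  8 10]
```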

#2


1  

I would say np.in1d is the intuitive solution for a case like this. That said, here is an alternative built on np.searchsorted:

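# Sort order of searchvals, so searchsorted can treat it as sorted
sidx = np.argsort(searchvals)
# Leftmost and rightmost insertion points of each value in sorted searchvals
left_idx = np.searchsorted(searchvals, values, sorter=sidx, side='left')
right_idx = np.searchsorted(searchvals, values, sorter=sidx, side='right')
# The two differ exactly where the value is present in searchvals
out = np.where(left_idx != right_idx)[0]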
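
To convince yourself the searchsorted approach gives the same indices as the membership test, something like the following should do. The function name find_indices_searchsorted and the random test data are made up for illustration, and it assumes np.random.default_rng and np.isin are available in your NumPy version.

```python
import numpy as np

def find_indices_searchsorted(values, searchvals):
    """Return indices i where values[i] occurs in searchvals (hypothetical helper)."""
    searchvals = np.asarray(searchvals)
    sidx = np.argsort(searchvals)
    left = np.searchsorted(searchvals, values, sorter=sidx, side='left')
    right = np.searchsorted(searchvals, values, sorter=sidx, side='right')
    # left and right differ only for values that are present in searchvals
    return np.flatnonzero(left != right)

# Arbitrary test data, chosen only for this sketch
rng = np.random.default_rng(0)
values = rng.integers(0, 1000, size=100_000)
searchvals = rng.integers(0, 1000, size=50)

expected = np.flatnonzero(np.isin(values, searchvals))
assert np.array_equal(find_indices_searchsorted(values, searchvals), expected)
```

Which of the two is faster likely depends on the sizes of values and searchvals, so it is worth timing both on your own data.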

#3


0  

Can you avoid numpy altogether? For small inputs, plain list concatenation can be competitive with numpy's methods, although it scans values once per search value, so it scales poorly when both are large. This still works even if values has to be a numpy array.

result = []
for sv in searchvals:
    # Collect every index whose element equals the current search value
    result += [i for i, v in enumerate(values) if v == sv]
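
If pure Python really is a requirement, a variant that walks values only once is possible. This is not what the answer above does. It is a sketch that swaps the per-value loop for a set lookup, and the name wanted is just illustrative.

```python
values = [1, 2, 3, 1, 2, 4, 5, 6, 3, 2, 1]
searchvals = [3, 1]

# Set membership is O(1) on average, so values is scanned a single time
wanted = set(searchvals)
result = [i for i, v in enumerate(values) if v in wanted]
print(result)  # [0, 2, 3, 8, 10]
```

Unlike the loop over searchvals, the indices come out in ascending order here.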
