I have a numpy array of roughly 3,125,000 entries. The data is structured using the following dtype:
dt = np.dtype([('startPoint', '<u8' ), ('endPoint', '<u8')])
The data comes from a file that was sorted by endPoint before being read into the array.
I now need to search the array to check whether it contains a particular endPoint. I'm currently doing this with a binary search, using the following code:
def binarySearch(array, index):
    # Return the position of the row whose endPoint equals index, or -1 if absent.
    lowPoint = 0
    highPoint = len(array) - 1
    while lowPoint <= highPoint:
        midPoint = (lowPoint + highPoint) // 2  # integer division, no float round trip
        if index == array[midPoint]['endPoint']:
            return midPoint
        elif index < array[midPoint]['endPoint']:
            highPoint = midPoint - 1
        else:
            lowPoint = midPoint + 1
    return -1
My question is: is there a faster way to search for an entry in this array? In particular, is there a built-in NumPy search that may be faster than my binary search?
1 solution
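Try numpy.searchsorted. You can also use memory mapping (numpy.memmap) if the array is too large to fit in memory. searchsorted is implemented as a binary search, so the algorithm is the same as yours, but the loop runs in compiled code rather than in the Python interpreter.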
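A minimal sketch of how the lookup might be written with numpy.searchsorted, assuming the array is sorted by endPoint as described in the question. The function name find_end_point and the small sample array are made up for illustration. Note that searchsorted returns an insertion position rather than a found/not-found result, so the value at that position still has to be compared with the target.

```python
import numpy as np

# dtype taken from the question
dt = np.dtype([('startPoint', '<u8'), ('endPoint', '<u8')])


def find_end_point(array, target):
    """Return the index of the first row whose endPoint equals target, or -1.

    Assumes `array` is already sorted by its 'endPoint' field.
    (find_end_point is a hypothetical helper name, not a NumPy function.)
    """
    end_points = array['endPoint']  # view of the key column, no copy
    # Cast the target to the column's dtype so both sides are uint64.
    key = np.uint64(target)
    # side='left' gives the first position where key could be inserted
    # while keeping the column sorted.
    pos = int(np.searchsorted(end_points, key, side='left'))
    # An insertion point is not a match: check bounds and equality.
    if pos < len(end_points) and end_points[pos] == key:
        return pos
    return -1


# Tiny illustrative data set, sorted by endPoint.
data = np.array([(1, 10), (2, 20), (3, 20), (4, 35)], dtype=dt)

print(find_end_point(data, 20))  # first row with endPoint == 20
print(find_end_point(data, 15))  # value not present
```

searchsorted also accepts an array of targets, so if many endpoints need to be checked, passing them in one call is likely to be cheaper than looping in Python. If the file is a raw binary dump of this dtype, an array opened with np.memmap(path, dtype=dt, mode='r') should be usable in place of data here; that depends on the file layout, which the question does not specify.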