
Time: 2022-01-10 18:06:35

I have a numpy array of roughly 3,125,000 entries. The data is structured using the following dtype:


dt = np.dtype([('startPoint', '<u8' ), ('endPoint', '<u8')])

The data is from a file that has been previously sorted by endPoint before it is read into the array.


I now need to search the array to check whether it contains a particular endPoint. I'm doing this with a binary search, using the following code:


def binarySearch(array, index):
    lowPoint = 0
    highpoint = len(array) - 1

    while lowPoint <= highpoint:
        # Integer division avoids the float round trip of int((a + b) / 2)
        midPoint = (lowPoint + highpoint) // 2

        if index == array[midPoint]['endPoint']:
            return midPoint

        elif index < array[midPoint]['endPoint']:
            highpoint = midPoint - 1

        else:
            lowPoint = midPoint + 1

    # Not found
    return -1

My question: is there a faster way to search for an entry in this array? In particular, is there a built-in NumPy search that may be faster than my binary search?


1 solution



Try numpy.searchsorted. You can also use memory mapping if the array is too large. searchsorted is implemented as a binary search, but it runs in compiled code rather than in a Python loop.
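
As a rough sketch of how that might look with the dtype from the question: the helper name find_end_point and the four-row sample array are made up for illustration, and the array is assumed to be sorted by endPoint already. The search value is cast to np.uint64 so that it matches the field's type.

```python
import numpy as np

dt = np.dtype([('startPoint', '<u8'), ('endPoint', '<u8')])

def find_end_point(array, value):
    """Return an index whose endPoint equals value, or -1 if absent.

    Hypothetical helper; assumes `array` is sorted by 'endPoint'.
    """
    end_points = array['endPoint']  # view on the field, not a copy
    # Leftmost position where value could be inserted keeping the order.
    i = np.searchsorted(end_points, np.uint64(value))
    if i < len(end_points) and end_points[i] == value:
        return int(i)
    return -1

# Small made-up sample, already sorted by endPoint.
data = np.array([(1, 10), (2, 20), (3, 30), (4, 40)], dtype=dt)

print(find_end_point(data, 20))
print(find_end_point(data, 25))
```

searchsorted only returns an insertion position, so the equality check afterwards is what turns it into a membership test. If the file is too large to load, an np.memmap opened with the same dtype could probably be passed in place of the in-memory array, assuming the file holds raw records in this layout.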
