Fastest way to compare two numpy arrays

Time: 2022-12-06 21:27:18

I have two arrays:

>>> import numpy as np
>>> a=np.array([2, 1, 3, 3, 3])
>>> b=np.array([1, 2, 3, 3, 3])

What is the fastest way of comparing these two arrays for equality of elements, regardless of the order?

EDIT: I measured the execution times of the following functions:

import numpy as np

def compare1():        # works only for arrays without duplicate elements
    a = np.array([1, 2, 3, 5, 4])
    b = np.array([2, 1, 3, 4, 5])
    temp = 0
    for i in a:
        # count how many entries of b equal the current element of a
        temp += len(np.where(b == i)[0])
    val = temp == len(a)
    return val

def compare2():
    a = np.array([1, 2, 3, 3, 3])
    b = np.array([2, 1, 3, 3, 3])
    # sort both arrays, then compare position by position
    val = np.all(np.sort(a) == np.sort(b))
    return val

def compare3():                        # thx to ODiogoSilva
    a = np.array([1, 2, 3, 3, 3])
    b = np.array([2, 1, 3, 3, 3])
    # compares distinct values only; duplicates are ignored
    val = set(a) == set(b)
    return val

def compare4():                        # thx to tom10
    a = np.array([1, 2, 3, 3, 3])
    b = np.array([2, 1, 3, 3, 3])
    # True when no element of a is missing from b
    val = len(np.setdiff1d(a, b)) == 0
    return val

The results are:

>>> import timeit
>>> timeit.timeit(compare1,number=1000)
0.0166780948638916
>>> timeit.timeit(compare2,number=1000)
0.016178131103515625
>>> timeit.timeit(compare3,number=1000)
0.008063077926635742
>>> timeit.timeit(compare4,number=1000)
0.03257489204406738

It seems that the set-based method suggested by ODiogoSilva is the fastest.

Do you know other methods that I can test as well?

EDIT2: The runtimes above were not a fair measure for comparing the methods, as user2357112 explained in a comment. Revised test script:

# test.py
import numpy as np

# without duplicates
N = 10000
a = np.arange(N, 0, step=-2)
b = np.arange(N, 0, step=-2)

def compare1():
    temp = 0
    for i in a:
        # count how many entries of b equal the current element of a
        temp += len(np.where(b == i)[0])
    val = temp == len(a)
    return val

def compare2():
    # sort both arrays, then compare position by position
    val = np.all(np.sort(a) == np.sort(b))
    return val

def compare3():
    # compares distinct values only; duplicates are ignored
    val = set(a) == set(b)
    return val

def compare4():
    # True when no element of a is missing from b
    val = len(np.setdiff1d(a, b)) == 0
    return val

The output is:

>>> from test import *
>>> import timeit
>>> timeit.timeit(compare1,number=1000)
101.16708397865295
>>> timeit.timeit(compare2,number=1000)
0.09285593032836914
>>> timeit.timeit(compare3,number=1000)
1.425955057144165
>>> timeit.timeit(compare4,number=1000)
0.44780397415161133

Now compare2 is the fastest. Is there still a method that could outgun this?
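
For anyone who wants to repeat the measurement, here is a minimal sketch of a timing harness. It assumes the same arrays as in test.py; the names `compare_sort`, `compare_setdiff` and `best_time` are made up for this illustration. Taking the minimum over several repeats is a common way to reduce noise from other processes, and the absolute numbers will depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

# Arrays are built once, outside the timed functions,
# so only the comparison itself is measured.
N = 10000
a = np.arange(N, 0, step=-2)
b = np.arange(N, 0, step=-2)

def compare_sort():
    # sort-based comparison (compare2 above)
    return bool(np.all(np.sort(a) == np.sort(b)))

def compare_setdiff():
    # set-difference comparison (compare4 above)
    return len(np.setdiff1d(a, b)) == 0

def best_time(func, number=100, repeat=3):
    """Smallest total time in seconds over `repeat` runs of `number` calls."""
    return min(timeit.repeat(func, number=number, repeat=repeat))

for func in (compare_sort, compare_setdiff):
    print(func.__name__, best_time(func))
```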

2 Answers

#1


4  

Numpy has a collection of set operations.

import numpy as np

a = np.array([2, 1, 3, 3, 3])
b = np.array([1, 2, 3, 3, 3])

# values of a that do not occur in b; an empty result means none are missing
print(np.setdiff1d(a, b))
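
One caveat worth spelling out: `setdiff1d(a, b)` only reports values of `a` that are absent from `b`, so an empty result does not rule out extra values in `b`. A rough sketch of a two-sided check using `np.setxor1d` follows; the helper name `same_unique_elements` and the array `c` are invented for this example and are not part of the original answer.

```python
import numpy as np

def same_unique_elements(a, b):
    """Return True if a and b contain the same set of distinct values."""
    # np.setxor1d returns the values found in exactly one of the inputs,
    # so an empty result means neither array has a value the other lacks.
    return np.setxor1d(a, b).size == 0

a = np.array([2, 1, 3, 3, 3])
b = np.array([1, 2, 3, 3, 3])
c = np.array([1, 2, 3, 4])

print(np.setdiff1d(a, c))          # [] : every value of a occurs in c
print(np.setdiff1d(c, a))          # [4] : but c has a value that a lacks
print(same_unique_elements(a, b))  # True
print(same_unique_elements(a, c))  # False
```

Like the other set routines, this ignores how often each value occurs.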

#2


1  

To check whether both arrays contain the same distinct elements (here [1, 2, 3]), you could do:

import numpy as np
a=np.array([2, 1, 3, 3, 3])
b=np.array([1, 2, 3, 3, 3])

set(a) == set(b)
# True
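
Note that a set comparison ignores how many times each value appears. If the counts matter too, a sort-based check along the lines of compare2 from the question can be used instead. The sketch below is only an illustration; the function name `same_elements_with_counts` and the arrays `x` and `y` are assumptions made for this example.

```python
import numpy as np

def same_elements_with_counts(a, b):
    """True if a and b hold the same values the same number of times."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Arrays with a different number of elements cannot be permutations
    # of each other.
    if a.size != b.size:
        return False
    # axis=None flattens before sorting; permutations of one another
    # become identical once sorted.
    return bool(np.array_equal(np.sort(a, axis=None), np.sort(b, axis=None)))

x = np.array([1, 2, 3, 3, 3])
y = np.array([1, 1, 2, 2, 3])

print(set(x) == set(y))                 # True: same distinct values
print(same_elements_with_counts(x, y))  # False: the counts differ
```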
