How to test if all rows are distinct in numpy [duplicate]

Time: 2021-12-26 15:57:31

This question already has an answer here:

In numpy, is there a nice idiomatic way of testing if all rows are distinct in a 2d array?

I thought I could do

len(np.unique(arr)) == len(arr)

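but this doesn't work, because np.unique flattens the array by default and returns the unique elements rather than the unique rows. For example,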

arr = np.array([[1,2,3],[1,2,4]])
np.unique(arr)
Out[4]: array([1, 2, 3, 4])

1 answer

#1


You can calculate the correlation matrix and check whether only the diagonal elements are 1:

(np.corrcoef(M)==1).sum()==M.shape[0]


In [66]:

M = np.random.random((5,8))
In [72]:

(np.corrcoef(M)==1).sum()==M.shape[0]
Out[72]:
True
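
One caveat is worth checking before relying on this test: correlation measures a linear relationship, not equality, so rows that differ but are proportional also correlate at 1, and an exact `== 1` comparison can be affected by floating-point rounding. A minimal sketch, using a hypothetical array `A` that is not part of the original answer:

```python
import numpy as np

# Hypothetical example: the rows are distinct, but the second is twice the first.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

corr = np.corrcoef(A)

# The off-diagonal entry is numerically 1 even though the rows differ,
# so the correlation test would count these rows as duplicates.
off_diagonal_is_one = bool(np.isclose(corr[0, 1], 1.0))

# A direct comparison of the rows shows that they are in fact distinct.
rows_distinct = len(set(map(tuple, A))) == len(A)

print(off_diagonal_is_one, rows_distinct)
```

If exact row equality is what matters, a direct comparison such as the tuple-based one is probably the safer choice.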

Use this if you want to do the same for the columns:

(np.corrcoef(M, rowvar=0)==1).sum()==M.shape[1]

or without numpy at all:

len(set(map(tuple,M)))==len(M)
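
As a rough sketch of how that one-liner behaves on the array from the question, it can be wrapped in a small helper. The name `all_rows_distinct` and the array `dup` are hypothetical and only serve as illustration:

```python
import numpy as np

def all_rows_distinct(arr):
    """Return True if no two rows of a 2D array are equal (hypothetical helper)."""
    # Each row becomes a hashable tuple; the set keeps one copy of each distinct row.
    return len(set(map(tuple, arr))) == len(arr)

arr = np.array([[1, 2, 3], [1, 2, 4]])  # example from the question
dup = np.array([[1, 2, 3], [1, 2, 3]])  # hypothetical array with a repeated row

print(all_rows_distinct(arr))
print(all_rows_distinct(dup))
```

Because rows are compared as tuples of their elements, this checks exact equality rather than correlation.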

Filtering out the unique rows and then testing whether the result has the same shape as M is overkill:

In [99]:

%%timeit

b = np.ascontiguousarray(M).view(np.dtype((np.void, M.dtype.itemsize * M.shape[1])))
_, idx = np.unique(b, return_index=True)

unique_M = M[idx]

unique_M.shape==M.shape
10000 loops, best of 3: 54.6 µs per loop
In [100]:

%timeit len(set(map(tuple,M)))==len(M)
10000 loops, best of 3: 24.9 µs per loop
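
For comparison, here is a sketch of the same row-uniqueness check without the void-view trick. It assumes a NumPy version in which `np.unique` accepts an `axis` argument (1.13 or later, as far as I know); the helper name `rows_distinct_unique` and the array `dup` are hypothetical, and no timing is claimed for it:

```python
import numpy as np

def rows_distinct_unique(arr):
    """Hypothetical helper: True if all rows of a 2D array are distinct."""
    # axis=0 makes np.unique compare whole rows instead of flattening the array.
    return len(np.unique(arr, axis=0)) == len(arr)

arr = np.array([[1, 2, 3], [1, 2, 4]])  # example from the question
dup = np.array([[1, 2, 3], [1, 2, 3]])  # hypothetical array with a repeated row

print(rows_distinct_unique(arr))
print(rows_distinct_unique(dup))
```

This is close to what the question originally attempted, with the flattening turned off.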
