This question already has an answer here:
- Find unique rows in numpy.array 20 answers
In numpy, is there a nice idiomatic way of testing if all rows are distinct in a 2d array?
I thought I could do
len(np.unique(arr)) == len(arr)
but this doesn't work at all. For example,
arr = np.array([[1,2,3],[1,2,4]])
np.unique(arr)
Out[4]: array([1, 2, 3, 4])
1 answer
#1
You can calculate the correlation matrix and check whether only the diagonal elements are 1:
(np.corrcoef(M)==1).sum()==M.shape[0]
In [66]:
M = np.random.random((5,8))
In [72]:
(np.corrcoef(M)==1).sum()==M.shape[0]
Out[72]:
True
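One caveat worth checking before relying on this: a correlation of 1 means two rows are perfectly linearly related, not necessarily equal. The sketch below uses a made-up matrix whose second row is twice the first, and compares with `np.isclose` rather than `== 1` to tolerate floating-point rounding (that substitution is my assumption, not part of the answer):

```python
import numpy as np

# Hypothetical example data: two different rows, the second is 2x the first,
# so their correlation coefficient is 1 (up to rounding).
M = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

corr = np.corrcoef(M)

# Correlation-based test: counts entries close to 1 and compares with the row count.
rows_distinct_by_corr = np.isclose(corr, 1).sum() == M.shape[0]

# Exact test: compares the rows themselves as tuples.
rows_distinct_exact = len(set(map(tuple, M))) == len(M)

print(rows_distinct_by_corr, rows_distinct_exact)
```

Here the two tests disagree: the correlation test treats the proportional rows as duplicates, while the tuple comparison sees them as distinct.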
To do the same thing for the columns:
(np.corrcoef(M, rowvar=0)==1).sum()==M.shape[1]
or without numpy at all:
len(set(map(tuple,M)))==len(M)
Filtering out the unique rows and then testing whether the result has the same shape as M is overkill:
In [99]:
%%timeit
b = np.ascontiguousarray(M).view(np.dtype((np.void, M.dtype.itemsize * M.shape[1])))
_, idx = np.unique(b, return_index=True)
unique_M = M[idx]
unique_M.shape==M.shape
10000 loops, best of 3: 54.6 µs per loop
In [100]:
%timeit len(set(map(tuple,M)))==len(M)
10000 loops, best of 3: 24.9 µs per loop
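For reference, NumPy 1.13 and later let `np.unique` take an `axis` argument, so the row test from the question can be written directly. A minimal sketch, where `rows_are_distinct` is a hypothetical helper name and not something from the answer above:

```python
import numpy as np

def rows_are_distinct(arr):
    """Return True if no two rows of a 2-D array are equal.

    Assumes NumPy >= 1.13, where np.unique accepts axis=0.
    """
    arr = np.asarray(arr)
    # With axis=0, np.unique deduplicates whole rows instead of flattening.
    return np.unique(arr, axis=0).shape[0] == arr.shape[0]

arr = np.array([[1, 2, 3], [1, 2, 4]])   # the array from the question
dup = np.array([[1, 2, 3], [1, 2, 3]])   # made-up array with a repeated row

print(rows_are_distinct(arr), rows_are_distinct(dup))
```

I have not timed this against the `set(map(tuple, M))` version, so treat it as a readability option rather than a speed claim.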