I have two lists of lists of numpy arrays called A and B and I want to check that for each list inside A, there exists a list in B that is the same (contains the same arrays).
Here's an example.
A = [[np.array([5,2]), np.array([6,7,8])], [np.array([1,2,3])]]
B = [[np.array([1,2,3])], [np.array([6,7,8]), np.array([5,2])]]
Basically, I was wondering if there is a pythonic/elegant way to write a function f such that f(A, B) == True.
Why should it be True?
A[0] = [np.array([5,2]), np.array([6,7,8])]. There is a matching list in B.
B[1] = [np.array([6,7,8]), np.array([5,2])]
A[0] and B[1] both contain exactly the same set of vectors: np.array([6,7,8]), np.array([5,2]).
A[1] = [np.array([1,2,3])]. There is a matching list in B.
B[0] = [np.array([1,2,3])].
Therefore, return True.
Some context:
- A and B are two clusterings of the same data.
- A and B have the same number of clusters so A and B are the same length.
- A[0] is a list of arrays representing all the vectors that belong to the 0th cluster in the A clustering.
Basically, I want to check whether A and B clustered the data into the same clusters. I'm not sure whether I can simply compare A[i] and B[i].
4 solutions
#1
2
Try using numpy.array_equal; you can use code like this:
>>> import numpy as np
>>> np.array_equal(np.array([[1,2],[2,1]]), np.array([[1,2],[2,1]]))
True
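As a minimal sketch of how np.array_equal could be applied to the question: unlike `==`, it returns a single boolean and gives False for arrays of different shapes, so it can be combined with any() to test whether a vector belongs to a cluster. The helper name `contains_array` is hypothetical and not part of the answer above.

```python
import numpy as np

def contains_array(cluster, vec):
    """Return True if some array in `cluster` equals `vec` element-wise."""
    # np.array_equal yields one boolean and is False when the shapes differ.
    return any(np.array_equal(vec, other) for other in cluster)

cluster = [np.array([6, 7, 8]), np.array([5, 2])]
print(contains_array(cluster, np.array([5, 2])))  # True
print(contains_array(cluster, np.array([5, 3])))  # False
```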
#2
0
Try the following:
all(A[i] == B[i] if len(B) == len(A) else False for i in range(len(B)))
>>> A = [[5, 2], [6, 7, 8]]
>>> B = [[5, 2], [6, 7, 8]]
>>> all(A[i] == B[i] if len(B) == len(A) else False for i in range(len(B)))
True
>>> A.append([56, 2])
>>> all(A[i] == B[i] if len(B) == len(A) else False for i in range(len(B)))
False
>>>
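To illustrate what this expression checks, here is a small sketch with plain Python lists (the function name `same_at_each_index` is made up for illustration). It compares A[i] with B[i] position by position, so two clusterings that hold the same clusters in a different order compare as unequal.

```python
def same_at_each_index(A, B):
    """Position-wise comparison, equivalent in spirit to the one-liner above."""
    return len(A) == len(B) and all(a == b for a, b in zip(A, B))

A = [[5, 2], [6, 7, 8]]
B_same_order = [[5, 2], [6, 7, 8]]
B_swapped = [[6, 7, 8], [5, 2]]

print(same_at_each_index(A, B_same_order))  # True
print(same_at_each_index(A, B_swapped))     # False: same clusters, different order
```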
#3
0
From your latest modification, it is evident that your list elements are neither hashable nor ordered. A much simpler solution is to convert each numpy.ndarray to a list; you can then sort the two lists for easier comparison. For A and B, this means:
In [141]: A_sorted_list = sorted([sorted([list(j) for j in i]) for i in A])
In [142]: B_sorted_list = sorted([sorted([list(j) for j in i]) for i in B])
Then compare the two lists:
In [143]: all(i == j for i, j in zip(A_sorted_list, B_sorted_list))
Out[143]: True
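The sort-and-compare steps above can be wrapped into one function. This is a minimal sketch, assuming every array is one-dimensional; the names `canonical` and `same_clustering` are hypothetical, and `tolist()` is used instead of `list()` so that the elements become plain Python numbers.

```python
import numpy as np

def canonical(clustering):
    """Turn arrays into lists, then sort vectors within clusters and the clusters themselves."""
    return sorted(sorted(vec.tolist() for vec in cluster) for cluster in clustering)

def same_clustering(A, B):
    """True if A and B contain the same clusters, ignoring order at both levels."""
    return canonical(A) == canonical(B)

A = [[np.array([5, 2]), np.array([6, 7, 8])], [np.array([1, 2, 3])]]
B = [[np.array([1, 2, 3])], [np.array([6, 7, 8]), np.array([5, 2])]]
C = [[np.array([1, 2, 3]), np.array([5, 2])], [np.array([6, 7, 8])]]

print(same_clustering(A, B))  # True
print(same_clustering(A, C))  # False
```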
If converting the arrays into lists is a problem, you can use a helper function for comparing clusters:
def compare_clusters(cluster_A, cluster_B):
    for aj in cluster_A:
        aj_included = any([all(bj==aj) if len(bj)==len(aj) else False for bj in cluster_B])
        if not aj_included:
            return False
    return True
Then you can compare A and B as:
In [149]: all([any([compare_clusters(ai, bi) for ai in A]) for bi in B])
Out[149]: True
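Note that compare_clusters, as written, only tests that every vector of cluster_A appears in cluster_B, not the reverse. A hedged sketch of a symmetric variant follows. The length checks and the names `clusters_match` and `same_clustering` are additions for illustration, and the sketch assumes that no vector appears twice within a cluster and no cluster appears twice within a clustering.

```python
import numpy as np

def clusters_match(cluster_a, cluster_b):
    """True if both clusters hold the same vectors, in any order."""
    if len(cluster_a) != len(cluster_b):
        return False
    # With equal sizes and no duplicates, one-way inclusion implies equality.
    return all(
        any(np.array_equal(a, b) for b in cluster_b) for a in cluster_a
    )

def same_clustering(A, B):
    """True if every cluster of A has a match in B and the counts agree."""
    return len(A) == len(B) and all(
        any(clusters_match(a, b) for b in B) for a in A
    )

A = [[np.array([5, 2]), np.array([6, 7, 8])], [np.array([1, 2, 3])]]
B = [[np.array([1, 2, 3])], [np.array([6, 7, 8]), np.array([5, 2])]]

print(same_clustering(A, B))  # True
# A cluster is not matched by a strictly larger cluster that contains it.
print(clusters_match([np.array([5, 2])], B[1]))  # False
```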
#4
0
Originally, I wanted to know if there was an elegant, Pythonic way to check whether two lists A and B grouped the numpy arrays into the same lists, regardless of order. I wanted to avoid converting the numpy arrays to lists just to make the comparison. However, based on the responses I received, it seems that converting the arrays to lists is the most elegant way. Here is my code after converting the arrays to lists using array.tolist():
def f(A, B):
    # A and B hold clusters whose arrays were already converted with array.tolist()
    for cluster in A:
        if cluster not in B:
            return False
    return True
If anyone has improvements or criticism please comment.
Also, what is the overhead of converting an array to list using array.tolist()?
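One possible refinement, sketched under assumptions: `cluster not in B` compares lists with `==`, which is sensitive to the order of the vectors inside a cluster. Converting each vector to a tuple and each cluster to a frozenset makes the comparison order-independent and hashable. The name `as_sets` is hypothetical, and the sketch assumes one-dimensional arrays with no repeated vector inside a cluster, since a set collapses duplicates.

```python
import numpy as np

def as_sets(clustering):
    """Represent a clustering as a frozenset of frozensets of tuples."""
    return frozenset(
        frozenset(tuple(vec.tolist()) for vec in cluster)
        for cluster in clustering
    )

def f(A, B):
    """True if A and B group the vectors into the same clusters, in any order."""
    return as_sets(A) == as_sets(B)

A = [[np.array([5, 2]), np.array([6, 7, 8])], [np.array([1, 2, 3])]]
B = [[np.array([1, 2, 3])], [np.array([6, 7, 8]), np.array([5, 2])]]
C = [[np.array([1, 2, 3]), np.array([5, 2])], [np.array([6, 7, 8])]]

print(f(A, B))  # True
print(f(A, C))  # False
```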