Get unique tuples from a Python set

Time: 2021-06-16 00:30:24

I currently have a set like the following:

{(a,b), (b,a), (c,b), (b,c)}

What I would like to have is:

{(a,b), (c,b)}

As you may notice, the duplicates have been removed so that no two tuples contain the same elements, regardless of order.

How can I tell the set to disregard the order of the elements in the tuple and just check the values between the tuples?

3 solutions

#1


Okay, so you've got a set {c1, c2, c3, ...}, where each cN is itself a collection of some sort.

If you don't care about the order of the elements in cN, but do care that each cN is unique (disregarding order), then cN should be a frozenset [1] rather than a tuple:

>>> orig = {("a", "b"), ("b", "a"), ("c", "b"), ("b", "c")}
>>> uniq = {frozenset(c) for c in orig}
>>> uniq
{frozenset({'a', 'b'}), frozenset({'b', 'c'})}

As a general rule, choosing an appropriate data type from those provided by Python is going to be more straightforward than defining and maintaining custom classes.
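
One caveat worth sketching: a frozenset also collapses repeated elements inside a single tuple, so ("a", "a", "b") and ("a", "b") would count as the same. If that matters and the elements can be ordered, a sorted tuple is one possible alternative canonical form. This is an illustration under that assumption, not part of the answer above.

```python
orig = {("a", "b"), ("b", "a"), ("c", "b"), ("b", "c")}

# Sorting gives every ordering of the same elements one canonical tuple.
# Assumes the elements are mutually comparable (e.g. all strings).
uniq = {tuple(sorted(c)) for c in orig}
print(sorted(uniq))  # [('a', 'b'), ('b', 'c')]

# frozenset ignores multiplicity; a sorted tuple keeps it.
print(frozenset(("a", "a", "b")) == frozenset(("a", "b")))          # True
print(tuple(sorted(("a", "a", "b"))) == tuple(sorted(("a", "b"))))  # False
```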


[1] It can't be a plain set, because as a member of a larger set it needs to be hashable.
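
A quick way to see the hashability point for yourself (a minimal sketch; the variable names are only for illustration):

```python
raised = False
try:
    # Inner sets are mutable and therefore unhashable.
    nested = {{"a", "b"}, {"b", "a"}}
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# frozenset is immutable and hashable, so it can be a set member.
nested = {frozenset({"a", "b"}), frozenset({"b", "a"})}
print(len(nested))  # 1
```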

#2


A rather ugly but straightforward solution: implement equality so that (2, 3) and (3, 2) are treated as equal objects, and implement __hash__ so that equal objects hash identically and a set cannot hold both. Members are accessed as in the assertions below.

I'm unhappy with how the hashing function looks, but it's just a proof of concept. Hopefully you'll find a more elegant way to calculate it without collisions.

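class WhateverItIs(object):
    """A pair that compares equal regardless of the order of a and b."""
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def __eq__(self, other):
        if not isinstance(other, WhateverItIs):
            return NotImplemented
        return ((self.a == other.a and self.b == other.b) or
                (self.a == other.b and self.b == other.a))
    def __hash__(self):
        # Sort first so (2, 3) and (3, 2) hash identically;
        # this requires a and b to be mutually orderable.
        return hash(tuple(sorted((self.a, self.b))))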

o1 = WhateverItIs(2, 3)
o2 = WhateverItIs(3, 2)
o3 = WhateverItIs(4, 3)

assert {o1, o2, o3} in [{o1, o3}, {o2, o3}]
assert o1 == o2
assert o1.a == 2
assert o1.b == 3
assert o2.a == 3
assert o2.b == 2
assert o3.a == 4
assert o3.b == 3
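
On the wish for a nicer hash: one possible option, sketched here as an assumption and not as the answerer's method, is to hash a frozenset of the two members. That avoids sorting, so it should also work when a and b cannot be compared with each other (an int and a str, say). The class name UnorderedPair is hypothetical.

```python
class UnorderedPair:
    """Hypothetical variant of the class above with an order-free hash."""

    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __eq__(self, other):
        if not isinstance(other, UnorderedPair):
            return NotImplemented
        return ((self.a == other.a and self.b == other.b) or
                (self.a == other.b and self.b == other.a))

    def __hash__(self):
        # A frozenset hashes the same whichever order its elements arrive in,
        # and only needs them to be hashable, not sortable.
        return hash(frozenset((self.a, self.b)))


p1 = UnorderedPair(1, "x")
p2 = UnorderedPair("x", 1)
print(p1 == p2, hash(p1) == hash(p2))      # True True
print(len({p1, p2, UnorderedPair(4, 3)}))  # 2
```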

#3


>>> aa = [('a', 'b'), ('c', 'd'), ('b', 'a')]
>>> seen = set()
>>> for x, y in aa:
...     if (x, y) not in seen and (y, x) not in seen:
...         seen.add((x, y))
...
>>> sorted(seen)
[('a', 'b'), ('c', 'd')]
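
The same "seen" idea can be sketched for tuples of any length by using a frozenset as the lookup key while keeping the first tuple encountered. The function name unique_unordered is made up for this illustration, and the input is a list so that "first seen" is well defined (iteration order over a set is arbitrary).

```python
def unique_unordered(tuples):
    """Keep the first tuple seen for each unordered combination of elements."""
    seen = set()
    result = []
    for t in tuples:
        key = frozenset(t)  # order-insensitive, hashable key
        if key not in seen:
            seen.add(key)
            result.append(t)
    return result


pairs = [("a", "b"), ("b", "a"), ("c", "b"), ("b", "c")]
print(unique_unordered(pairs))  # [('a', 'b'), ('c', 'b')]
```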
