What is the most Pythonic way to ensure that all elements in a list are different?

Time: 2022-05-10 22:23:25

I have a list in Python that I generate as part of the program. I have a strong assumption that these are all different, and I check this with an assertion.

This is the way I do it now:

If there are two elements:

try:
    assert x[0] != x[1]
except AssertionError:
    print(debug_info)
    raise Exception("throw to caller")

If there are three:

try:
    assert x[0] != x[1]
    assert x[0] != x[2]
    assert x[1] != x[2]
except AssertionError:
    print(debug_info)
    raise Exception("throw to caller")

And if I ever have to do this with four elements I'll go crazy.

Is there a better way to ensure that all the elements of the list are unique?

7 answers

#1


26  

Maybe something like this:

if len(x) == len(set(x)):
    print("all elements are unique")
else:
    print("elements are not unique")
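
The length comparison above always builds the whole set before it can answer. As a minimal sketch (the function name all_unique is made up for illustration, and the items are assumed to be hashable), the same idea can stop at the first repeated item:

```python
def all_unique(items):
    """Return True if no item occurs twice; stop at the first repeat.

    Assumes every item is hashable, just like the len(set(x)) check.
    """
    seen = set()
    for item in items:
        if item in seen:
            return False
        seen.add(item)
    return True


print(all_unique([1, 2, 3]))     # True
print(all_unique([1, 2, 1, 4]))  # False
```

This also works on generators, which have no len(), at the cost of being a little longer than the one-liner.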

#2


18  

The most popular answers are O(N) (good!-) but, as @Paul and @Mark point out, they require the list's items to be hashable. Both @Paul and @Mark's proposed approaches for unhashable items are general but take O(N squared) -- i.e., a lot.

If your list's items are not hashable but are comparable, you can do better... here's an approach that always works as fast as feasible given the nature of the list's items.

import itertools

def allunique(L):
  # first try sets -- fastest, if all items are hashable
  try:
    return len(L) == len(set(L))
  except TypeError:
    pass
  # next, try sort -- second fastest, if items are comparable
  try:
    L1 = sorted(L)
  except TypeError:
    pass
  else:
    return all(len(list(g))==1 for k, g in itertools.groupby(L1))
  # fall back to the slowest but most general approach
  return all(v not in L[i+1:] for i, v in enumerate(L))

This is O(N) where feasible (all items hashable), O(N log N) as the most frequent fallback (some items unhashable, but all comparable), O(N squared) where inevitable (some items unhashable, e.g. dicts, and some non-comparable, e.g. complex numbers).
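
To see which kinds of items land in which tier, here is a small probe. It is only a sketch: fastest_strategy is a hypothetical helper, not part of the answer's code, and it assumes Python 3, where dicts do not support ordering comparisons.

```python
def fastest_strategy(items):
    """Report which branch a set / sort / pairwise uniqueness check would take."""
    try:
        set(items)
        return "set"        # every item is hashable: O(N)
    except TypeError:
        pass
    try:
        sorted(items)
        return "sort"       # items can be ordered: O(N log N)
    except TypeError:
        return "pairwise"   # only == is available: O(N squared)


print(fastest_strategy([3, 1, 2]))                # set
print(fastest_strategy([[1, 2], [3, 4]]))         # sort
print(fastest_strategy([{"a": 1}, {"b": 2}]))     # pairwise
```

Note that complex numbers are hashable, so a list containing only complex numbers takes the set branch; they only force the pairwise branch when mixed with unhashable items.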

Inspiration for this code comes from an old recipe by the great Tim Peters, which differed by actually producing a list of unique items (and was written so long ago that set was not around yet -- it had to use a dict...!-), but basically faced identical issues.

#3


7  

How about this:

if len(x) != len(set(x)):
    raise Exception("throw to caller")

This assumes that elements in x are hashable.
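
Since the question also wants debug information, a variation can name the offending items in the exception. This is a sketch under the same hashability assumption; ensure_unique is an illustrative name, and ValueError is my choice rather than anything the question prescribes.

```python
from collections import Counter


def ensure_unique(items):
    """Raise ValueError naming every item that occurs more than once."""
    duplicates = [item for item, n in Counter(items).items() if n > 1]
    if duplicates:
        raise ValueError(f"duplicate items: {duplicates!r}")


ensure_unique(["a", "b", "c"])  # passes silently

try:
    ensure_unique(["a", "b", "a"])
except ValueError as exc:
    print(exc)  # duplicate items: ['a']
```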

#4


2  

Hopefully all the items in your sequence are immutable -- if not, you will not be able to call set on the sequence.

>>> set( ([1,2], [3,4]) )
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'

If you do have mutable items, you can't hash the items and you will pretty much have to repeatedly check through the list:

def isUnique(lst):
    for i,v in enumerate(lst):
        if v in lst[i+1:]:
            return False
    return True

>>> isUnique( ([1,2], [3,4]) )
True
>>> isUnique( ([1,2], [3,4], [1,2]) )
False
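
The same pairwise check can be written with itertools.combinations, which generalizes the hand-written assertions from the question to any length. A minimal sketch (all_distinct_pairwise is a made-up name); it is still O(N squared) and relies only on != between items:

```python
from itertools import combinations


def all_distinct_pairwise(items):
    """Compare every pair once; works for unhashable items such as lists."""
    items = list(items)  # allow iterators; combinations needs a finite input
    return all(a != b for a, b in combinations(items, 2))


print(all_distinct_pairwise([[1, 2], [3, 4]]))          # True
print(all_distinct_pairwise([[1, 2], [3, 4], [1, 2]]))  # False
```

Unlike the slicing version, this does not copy the tail of the list on every iteration.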

#5


1  

As you build the list you can check to see if the value already exists, e.g.:

if x in y:
     raise Exception("Value %s already in y" % x)
else:
     y.append(x)

The benefit of this is that the clashing value will be reported.
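
Because `x in y` scans the whole list, building a list of N items this way costs O(N squared) overall. If the values are hashable, one possible refinement is to keep a companion set next to the list. In this sketch append_unique and seen are illustrative names, not part of the answer:

```python
def append_unique(items, seen, value):
    """Append value to items, or raise if it was already added.

    seen is a set mirroring items, so the membership test is O(1) on average.
    """
    if value in seen:
        raise ValueError(f"Value {value!r} already in list")
    seen.add(value)
    items.append(value)


y, seen = [], set()
for x in ["a", "b", "c"]:
    append_unique(y, seen, x)
print(y)  # ['a', 'b', 'c']
```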

#6


0  

You could process the list to create a known-to-be-unique copy:

def make_unique(seq): 
    t = type(seq) 
    seen = set()
    return t(c for c in seq if not (c in seen or seen.add(c)))

Or if the seq elements are not hashable:

def unique1(seq):
    t = type(seq) 
    seen = [] 
    return t(c for c in seq if not (c in seen or seen.append(c)))

And this will keep the items in order (omitting duplicates, of course).
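
For hashable items on Python 3.7 or later, where dicts are guaranteed to preserve insertion order, a shorter order-preserving alternative is dict.fromkeys. This is a sketch, not a replacement for the functions above: unique_in_order is a made-up name and it always returns a list rather than the input's type.

```python
def unique_in_order(seq):
    """Drop later duplicates while keeping first occurrences in order."""
    return list(dict.fromkeys(seq))


print(unique_in_order([3, 1, 3, 2, 1]))  # [3, 1, 2]
```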

#7


0  

I would use this:

mylist = [1,2,3,4]
is_unique = all(mylist.count(x) == 1 for x in mylist)
