Python: How to remove all duplicates from a list

Time: 2021-07-31 08:03:41

How would I use Python to check a list and delete all duplicates? I don't want to have to specify what the duplicate item is. I want the code to figure out whether there are any and remove them if so, keeping only one instance of each. It must also work if several different items are duplicated in the list.

For example, in my code below, the list lseparatedOrbList has 12 items - one is repeated six times, one is repeated five times, and there is only one instance of one. I want it to change the list so there are only three items - one of each, and in the same order they appeared before. I tried this:

for i in lseparatedOrbList:
   for j in lseparatedOrblist:
        if lseparatedOrbList[i] == lseparatedOrbList[j]:
            lseparatedOrbList.remove(lseparatedOrbList[j])

But I get the error:

Traceback (most recent call last):
  File "qchemOutputSearch.py", line 123, in <module>
    for j in lseparatedOrblist:
NameError: name 'lseparatedOrblist' is not defined

I'm guessing it's because I'm trying to loop through lseparatedOrbList while I'm already looping through it, but I can't think of another way to do it.

13 answers

#1


55  

Just make a new list to populate: if an item from your list is not yet in the new list, append it; otherwise just move on to the next item in your original list.

newlist = []  # must exist before the loop
for i in mylist:
  if i not in newlist:
    newlist.append(i)

I think this is the correct syntax, but my python is a bit shaky, I hope you at least get the idea.
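
For what it's worth, the loop above can be wrapped up as a small self-contained function. This is only a sketch: the function name `dedupe_keep_order` and the sample labels are made up for illustration, and the `in` test on a list makes it O(n^2), which should be fine for short lists.

```python
def dedupe_keep_order(items):
    """Return a new list with duplicates removed, keeping first occurrences."""
    newlist = []
    for item in items:
        # `in` on a list is a linear scan, but it works for any comparable items.
        if item not in newlist:
            newlist.append(item)
    return newlist


# Hypothetical data shaped like the question: 12 items, 3 distinct values.
orbitals = ["2s"] * 6 + ["2p"] * 5 + ["3d"]
print(dedupe_keep_order(orbitals))  # ['2s', '2p', '3d']
```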

#2


57  

Use set():

woduplicates = set(lseparatedOrbList)

This returns a set without duplicates. If, for some reason, you need a list back:

woduplicates = list(set(lseparatedOrbList))

#3


25  

You can do it like this:

x = list(set(x))

Example: if you do something like this:

x = [1,2,3,4,5,6,7,8,9,10,2,1,6,31,20]
x = list(set(x))
x

you will see the following result:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 31]

There is one thing to keep in mind: the resulting list is not guaranteed to keep the order of the original one, because sets are unordered and the order is lost in the conversion.
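
If the original order matters, one possible workaround (a sketch, not the only option) is to sort the set by each value's first position in the original list. `x.index` is a linear search, so this is only reasonable for small lists.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 2, 1, 6, 31, 20]

# Sort the unique values by where each one first appears in x.
ordered = sorted(set(x), key=x.index)
print(ordered)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 31, 20]
```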

#4


15  

This should be faster and will preserve the original order:

seen = {}
new_list = [seen.setdefault(x, x) for x in my_list if x not in seen]

If you don't care about order, you can just:

new_list = list(set(my_list))
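
To make the order-preserving one-liner above a little less cryptic: `seen.setdefault(x, x)` stores `x` in the dict the first time it is seen and returns it, so the comprehension only emits first occurrences. A rough equivalent written as an explicit generator (the name `unique_in_order` and the sample data are just for illustration):

```python
def unique_in_order(iterable):
    """Yield each hashable item the first time it appears."""
    seen = set()
    for x in iterable:
        if x not in seen:
            seen.add(x)
            yield x


my_list = ["b", "a", "b", "c", "a"]
new_list = list(unique_in_order(my_list))
print(new_list)  # ['b', 'a', 'c']
```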

#5


6  

This should do it for you:

new_list = list(set(old_list))

set will automatically remove duplicates. list will cast it back to a list.

#6


5  

No, it's simply a typo: the "List" at the end of the name must be capitalized. You can nest loops over the same variable just fine (although there's rarely a good reason to).

However, there are other problems with the code. For starters, you're iterating over the lists themselves, so i and j will be items, not indices. Furthermore, you shouldn't change a collection while iterating over it (well, you "can" in that it runs, but madness lies that way; for instance, you'll probably skip over items). And then there's the complexity problem: your code is O(n^2). Either convert the list into a set and back into a list (simple, but shuffles the remaining list items) or do something like this:

seen = set()
new_xs = []
for x in xs:
    if x in seen:
        continue  # already emitted once, skip it
    seen.add(x)
    new_xs.append(x)

Both solutions require the items to be hashable. If that's not possible, you'll probably have to stick with your current approach sans the mentioned problems.
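
To see why changing a list while looping over it is risky, here is a tiny demonstration (the function name and sample data are made up). Removing an item shifts the remaining items left, so the loop's internal index steps over the next one.

```python
def remove_while_iterating(items, target):
    """Deliberately buggy: mutates the list it is iterating over."""
    for item in items:
        if item == target:
            items.remove(item)  # shifts later items left; the next one is skipped
    return items


result = remove_while_iterating(["a", "a", "b"], "a")
print(result)  # ['a', 'b'] -- one "a" survives because it was skipped
```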

#7


3  

It's because you are missing a capital letter, actually.

Purposely dedented:

for i in lseparatedOrbList:   # capital 'L'
for j in lseparatedOrblist:   # lowercase 'l'

Though the more efficient way to do it would be to insert the contents into a set.

If maintaining the list order matters (i.e., the result must be "stable"), check out the answers on this question.

#8


2  

Use set

return list(set(result))

Use dict

return list(dict.fromkeys(result))  # in Python 3, .keys() alone gives a view, not a list

#9


2  

The easiest way is to use the set() function:

new_list = list(set(your_list))

#10


1  

For lists of unhashable items. It is faster because it does not iterate over entries that have already been checked.

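def purge_dublicates(X):
    # Keeps the LAST occurrence of each row: a row is skipped
    # whenever an equal row appears later in X.
    unique_X = []
    for i, row in enumerate(X):
        if row not in X[i + 1:]:
            unique_X.append(row)
    return unique_X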
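
Note that the function above keeps the last occurrence of each row, so `[[1], [2], [1]]` becomes `[[2], [1]]`. If first occurrences should win instead, a variant along these lines might do (a sketch; `purge_duplicates_keep_first` is a made-up name):

```python
def purge_duplicates_keep_first(rows):
    """Remove duplicates from a list of unhashable items, keeping first occurrences."""
    unique_rows = []
    for row in rows:
        # Compare against what was already kept; works for lists, dicts, etc.
        if row not in unique_rows:
            unique_rows.append(row)
    return unique_rows


print(purge_duplicates_keep_first([[1], [2], [1]]))  # [[1], [2]]
```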

#11


0  

The modern way to do it that maintains the order is:

>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(lseparatedOrbList))

as discussed by Raymond Hettinger (Python core developer) in this answer. However, the keys must be hashable (which I think is the case for your list).
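
On Python 3.7 and later, a plain `dict` is guaranteed to keep insertion order, so the same idea should work without the import. A sketch with made-up data standing in for `lseparatedOrbList`:

```python
# Hypothetical stand-in for the question's list: 12 items, 3 distinct values.
lseparatedOrbList = ["s"] * 6 + ["p"] * 5 + ["d"]

# dict keys are unique and (since Python 3.7) keep insertion order.
unique = list(dict.fromkeys(lseparatedOrbList))
print(unique)  # ['s', 'p', 'd']
```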

#12


-1  

There is a faster way to fix this:

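my_list = [1, 1.0, 1.41, 1.73, 2, 2, 2.0, 2.24, 3, 3, 4, 4, 4, 5, 6, 6, 8, 8, 9, 10]  # renamed from `list` to avoid shadowing the builtin
list2 = []

for value in my_list:
    try:
        list2.index(value)   # raises ValueError if value is not in list2 yet
    except ValueError:
        list2.append(value)
my_list.clear()
for value in list2:
    my_list.append(value)
list2.clear()
print(my_list)
print(list2)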
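
The clear-and-refill steps can probably be collapsed with slice assignment, which replaces the contents of the existing list object in place. This is a sketch assuming hashable items (`values` and `alias` are illustrative names). Note that `1` and `1.0` compare equal, so only the first of them is kept, just as with the `index` approach.

```python
values = [1, 1.0, 1.41, 1.73, 2, 2, 2.0, 2.24, 3, 3, 4, 4, 4, 5, 6, 6, 8, 8, 9, 10]
alias = values  # a second reference, to show the same object is updated

# Slice assignment swaps the contents without rebinding the name.
values[:] = dict.fromkeys(values)
print(values)  # [1, 1.41, 1.73, 2, 2.24, 3, 4, 5, 6, 8, 9, 10]
```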

#13


-1  

This way you can delete a particular item that occurs multiple times in a list. Try deleting every 5:

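list1 = [1, 2, 3, 4, 5, 6, 5, 3, 5, 7, 11, 5, 9, 8, 121, 98, 67, 34, 5, 21]
print(list1)
n = int(input("item to be deleted : "))  # input() returns text in Python 3, so convert it
while n in list1:       # repeat until no occurrence is left
    list1.remove(n)
print(list1)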
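
A list comprehension is another way to drop every occurrence of one value without mutating the list during the loop. A small sketch, with the value to delete hard-coded instead of read from input:

```python
list1 = [1, 2, 3, 4, 5, 6, 5, 3, 5, 7, 11, 5, 9, 8, 121, 98, 67, 34, 5, 21]
n = 5  # the item to delete

# Keep everything that is not equal to n.
filtered = [value for value in list1 if value != n]
print(filtered)
```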
