How do I remove duplicates from a list?

Time: 2022-01-11 12:54:22

I want to remove duplicates from a list but what I am doing is not working:

List<Customer> listCustomer = new ArrayList<Customer>();
for (Customer customer : tmpListCustomer) {
    if (!listCustomer.contains(customer)) {
        listCustomer.add(customer);
    }
}

16 solutions

#1


45  

If that code doesn't work, you probably have not implemented equals(Object) on the Customer class appropriately.

Presumably there is some key (let us call it customerId) that uniquely identifies a customer; e.g.

class Customer {
    private String customerId;
    ...

An appropriate definition of equals(Object) would look like this:

    public boolean equals(Object obj) {
        if (obj == this) {
            return true;
        }
        if (!(obj instanceof Customer)) {
            return false;
        }
        Customer other = (Customer) obj;
        return this.customerId.equals(other.customerId);
    }

For completeness, you should also implement hashCode so that two Customer objects that are equal will return the same hash value. A matching hashCode for the above definition of equals would be:

    public int hashCode() {
        return customerId.hashCode();
    }
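
For reference, here is a minimal, self-contained sketch that combines the two methods and runs the loop from the question against them. The constructor, toString(), the CustomerDemo wrapper class and the dedupe helper are assumptions added for illustration (the question does not show the real Customer class), and customerId is assumed to be non-null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CustomerDemo {

    // Hypothetical Customer: only the customerId field comes from the answer.
    static final class Customer {
        private final String customerId;

        Customer(String customerId) {
            this.customerId = customerId; // assumed non-null
        }

        @Override
        public boolean equals(Object obj) {
            if (obj == this) {
                return true;
            }
            if (!(obj instanceof Customer)) {
                return false;
            }
            Customer other = (Customer) obj;
            return this.customerId.equals(other.customerId);
        }

        @Override
        public int hashCode() {
            return customerId.hashCode();
        }

        @Override
        public String toString() {
            return "Customer(" + customerId + ")";
        }
    }

    // Same logic as the loop in the question, wrapped in a method.
    static List<Customer> dedupe(List<Customer> tmpListCustomer) {
        List<Customer> listCustomer = new ArrayList<>();
        for (Customer customer : tmpListCustomer) {
            if (!listCustomer.contains(customer)) {
                listCustomer.add(customer);
            }
        }
        return listCustomer;
    }

    public static void main(String[] args) {
        List<Customer> input = Arrays.asList(
                new Customer("c1"), new Customer("c2"), new Customer("c1"));
        System.out.println(dedupe(input)); // [Customer(c1), Customer(c2)]
    }
}
```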

It is also worth noting that this is not an efficient way to remove duplicates if the list is large. (For a list with N customers, you will need to perform N*(N-1)/2 comparisons in the worst case; i.e. when there are no duplicates.) For a more efficient solution you should use something like a HashSet to do the duplicate checking.
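
One way to do the duplicate check with a HashSet, while still keeping the first occurrence of each element in its original position, could look like this. It is only a sketch: SeenSetDedupe and dedupe are hypothetical names, and it assumes the element type implements equals() and hashCode() consistently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SeenSetDedupe {

    // Keeps the first occurrence of each element, in the original order.
    static <T> List<T> dedupe(List<T> input) {
        Set<T> seen = new HashSet<>();
        List<T> result = new ArrayList<>();
        for (T item : input) {
            // Set.add returns false when the element is already present.
            if (seen.add(item)) {
                result.add(item);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("c1", "c2", "c1", "c3", "c2");
        System.out.println(dedupe(ids)); // [c1, c2, c3]
    }
}
```

Each Set.add call takes roughly constant time on average, so the whole pass needs about N set operations instead of up to N*(N-1)/2 comparisons.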

#2


85  

Assuming you want to keep the current order and don't want a Set, perhaps the easiest is:

List<Customer> dedupeCustomers =
    new ArrayList<>(new LinkedHashSet<>(customers));

If you want to change the original list:

Set<Customer> dedupeCustomers = new LinkedHashSet<>(customers);
customers.clear();
customers.addAll(dedupeCustomers);

#3


19  

Java 8 update: you can use a stream over an array, as below:

Arrays.stream(yourArray).distinct()
                    .collect(Collectors.toList());
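
The question starts from a List rather than an array, so the same idea can be applied to the list's own stream. A small sketch, with StreamDistinct and dedupe as made-up names; like the other approaches, it assumes that equals() and hashCode() are implemented on the element type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class StreamDistinct {

    static <T> List<T> dedupe(List<T> input) {
        // distinct() compares elements with equals(); for an ordered stream,
        // such as one obtained from a List, the first occurrence is kept.
        return input.stream().distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("one", "one", "two");
        System.out.println(dedupe(names)); // [one, two]
    }
}
```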

#4


13  

Does Customer implement the equals() contract?

If it doesn't implement equals() and hashCode(), then listCustomer.contains(customer) will check whether the exact same instance already exists in the list (by instance I mean the exact same object: memory address, etc.). If what you are looking for is to test whether the same Customer is already in the list (perhaps it's the same customer if they have the same customer name or customer number), then you need to override equals() so that it checks whether the relevant fields (e.g. customer names) match.
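
To see the default behaviour in isolation, here is a small sketch. PlainCustomer is a hypothetical class that deliberately overrides neither equals() nor hashCode(); it stands in for the Customer class, which the question does not show.

```java
import java.util.ArrayList;
import java.util.List;

final class IdentityEqualsDemo {

    // Hypothetical class that does NOT override equals() or hashCode().
    static final class PlainCustomer {
        final String name;

        PlainCustomer(String name) {
            this.name = name;
        }
    }

    // Looks up a different object that has the same state.
    static boolean containsEqualState() {
        List<PlainCustomer> list = new ArrayList<>();
        list.add(new PlainCustomer("Alice"));
        return list.contains(new PlainCustomer("Alice"));
    }

    // Looks up the very same object that was added.
    static boolean containsSameInstance() {
        PlainCustomer alice = new PlainCustomer("Alice");
        List<PlainCustomer> list = new ArrayList<>();
        list.add(alice);
        return list.contains(alice);
    }

    public static void main(String[] args) {
        System.out.println(containsEqualState());   // false
        System.out.println(containsSameInstance()); // true
    }
}
```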

Note: Don't forget to override hashCode() if you are going to override equals()! Otherwise, you might run into trouble with your HashMaps and other hash-based data structures. For good coverage of why this is and what pitfalls to avoid, consider having a look at Josh Bloch's Effective Java chapters on equals() and hashCode() (the link only contains information about why you must implement hashCode() when you implement equals(), but there is good coverage of how to override equals() too).
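
As an illustration of the kind of trouble meant here, the sketch below puts two equal objects into a HashSet, once with a class that overrides only equals() and once with a class that overrides both methods. Both classes and all names are invented for the example.

```java
import java.util.HashSet;
import java.util.Set;

final class HashCodeContractDemo {

    // Overrides equals() only, which breaks the equals/hashCode contract.
    static final class EqualsOnly {
        final String id;

        EqualsOnly(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsOnly && id.equals(((EqualsOnly) o).id);
        }
        // hashCode() is deliberately left as inherited from Object.
    }

    // Overrides both methods consistently.
    static final class EqualsAndHash {
        final String id;

        EqualsAndHash(String id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof EqualsAndHash && id.equals(((EqualsAndHash) o).id);
        }

        @Override
        public int hashCode() {
            return id.hashCode();
        }
    }

    static int sizeWithEqualsOnly() {
        Set<EqualsOnly> set = new HashSet<>();
        set.add(new EqualsOnly("c1"));
        set.add(new EqualsOnly("c1"));
        return set.size();
    }

    static int sizeWithBoth() {
        Set<EqualsAndHash> set = new HashSet<>();
        set.add(new EqualsAndHash("c1"));
        set.add(new EqualsAndHash("c1"));
        return set.size();
    }

    public static void main(String[] args) {
        // Usually 2: the equal objects almost always get different hash codes,
        // so the set never gets as far as comparing them with equals().
        System.out.println(sizeWithEqualsOnly());
        System.out.println(sizeWithBoth()); // 1
    }
}
```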

By the way, is there an ordering restriction on your set? If there isn't, a slightly easier way to solve this problem is to use a Set<Customer> like so:

Set<Customer> noDups = new HashSet<Customer>();
noDups.addAll(tmpListCustomer);
return new ArrayList<Customer>(noDups);

This will nicely remove duplicates for you, since Sets don't allow duplicates. However, it will lose any ordering that was applied to tmpListCustomer, since HashSet has no defined ordering (you can get around that by using a TreeSet, but that's not exactly related to your question). This can simplify your code a little bit.
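
Note that a TreeSet gives you a sorted result rather than the original list order. A minimal sketch using Strings (which already implement Comparable) instead of the Customer class, with TreeSetDedupe and dedupeSorted as hypothetical names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

final class TreeSetDedupe {

    static List<String> dedupeSorted(List<String> input) {
        // TreeSet orders elements by their natural ordering (compareTo) and
        // drops any element that compares as equal to one already present.
        return new ArrayList<>(new TreeSet<>(input));
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("c3", "c1", "c3", "c2", "c1");
        System.out.println(dedupeSorted(ids)); // [c1, c2, c3]
    }
}
```

For a class like Customer you would have to make it Comparable or pass a Comparator to the TreeSet constructor; elements that compare as equal are the ones treated as duplicates.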

#5


12  

List → Set → List (distinct)

Just add all your elements to a Set: it does not allow its elements to be repeated. If you need a list afterwards, use the new ArrayList(theSet) constructor (where theSet is your resulting set).

#6


8  

I suspect you might not have Customer.equals() implemented properly (or at all).

List.contains() uses equals() to verify whether any of its elements is equal to the object passed as a parameter. However, the default implementation of equals tests for physical identity, not value identity. So if you have not overridden it in Customer, it will return false for two distinct Customer objects having identical state.

Here are the nitty-gritty details of how to implement equals (and hashCode, which is its pair - you must practically always implement both if you need to implement either of them). Since you haven't shown us the Customer class, it is difficult to give more concrete advice.

As others have noted, you are better off using a Set rather than doing the job by hand, but even for that, you still need to implement those methods.

#7


5  

The "contains" method searches the list for an entry for which Customer.equals(Object o) returns true. If you have not overridden equals(Object) in Customer or one of its parents, then it will only find an existing occurrence of the same object. Maybe that is what you wanted, in which case your code should work. But if you want to avoid having two objects that both represent the same customer, then you need to override equals(Object) to return true when that is the case.

It is also true that using one of the implementations of Set instead of List would give you duplicate removal automatically, and faster (for anything other than very small Lists). You will still need to provide code for equals.

You should also override hashCode() when you override equals().

#8


5  

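// Removes an element whenever it still occurs more than once, so the last
// occurrence of each customer is the one that survives. Relies on
// Customer.equals(); each frequency() call scans the whole list.
private void removeTheDuplicates(List<Customer> myList) {
    for (ListIterator<Customer> iterator = myList.listIterator(); iterator.hasNext();) {
        Customer customer = iterator.next();
        if (Collections.frequency(myList, customer) > 1) {
            iterator.remove();
        }
    }
    System.out.println(myList);
}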

#9


3  

Two suggestions:

  • Use a HashSet instead of an ArrayList. This will speed up the contains() checks considerably if you have a long list.

  • Make sure Customer.equals() and Customer.hashCode() are implemented properly, i.e. they should be based on the combined values of the underlying fields in the customer object.
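
As a sketch of what basing both methods on the combined field values can look like, assume a hypothetical Customer identified by a name and an email address (neither field appears in the question):

```java
import java.util.Objects;

final class MultiFieldEquality {

    // Hypothetical Customer with two identifying fields.
    static final class Customer {
        private final String name;
        private final String email;

        Customer(String name, String email) {
            this.name = name;
            this.email = email;
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) {
                return true;
            }
            if (!(obj instanceof Customer)) {
                return false;
            }
            Customer other = (Customer) obj;
            // Objects.equals is null-safe.
            return Objects.equals(name, other.name)
                    && Objects.equals(email, other.email);
        }

        @Override
        public int hashCode() {
            // Combine exactly the fields that equals() compares.
            return Objects.hash(name, email);
        }
    }

    public static void main(String[] args) {
        Customer a = new Customer("Alice", "alice@example.com");
        Customer b = new Customer("Alice", "alice@example.com");
        Customer c = new Customer("Alice", "other@example.com");
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(c));                  // false
    }
}
```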

#10


3  

Nearly all of the above answers are right, but to gain performance I suggest using a Map or Set while creating the list, not afterwards, because converting a list to a Set or Map and then converting it back to a List is extra work that can be avoided.

Sample Code:

// A LinkedHashSet preserves the insertion order of the elements.
Set<String> stringsSet = new LinkedHashSet<String>();
for (String string : stringsList) {
    stringsSet.add(string);
}
return new ArrayList<String>(stringsSet);

#11


1  

As others have mentioned, you are probably not implementing equals() correctly.

However, you should also note that this code is quite inefficient, since its running time can grow with the square of the number of elements.

You might want to consider using a Set instead of a List, or building a Set first and then turning it into a List.

#12


1  

The cleanest way is:

List<XXX> lstConsultada = dao.findByPropertyList(YYY);
List<XXX> lstFinal = new ArrayList<XXX>(new LinkedHashSet<XXX>(lstConsultada));

and override hashCode and equals based on the ID properties of each entity.

#13


1  

IMHO, the best way to do it these days:

Suppose you have a Collection "dups" and you want to create another Collection containing the same elements but with all duplicates eliminated. The following one-liner does the trick.

Collection<collectionType> noDups = new HashSet<collectionType>(dups);

It works by creating a Set which, by definition, cannot contain duplicates.

Based on the Oracle documentation.

#14


1  

Using the Java 8 Stream API:

    List<String> list = new ArrayList<>();
    list.add("one");
    list.add("one");
    list.add("two");
    System.out.println("Before values : " + list);
    Collection<String> c = list.stream().collect(Collectors.toSet());
    System.out.println("After Values : " + c);

Output:

Before values : [one, one, two]

After Values : [one, two]

#15


0  

The correct answer for Java is to use a Set. If you already have a List<Customer> and want to deduplicate it:

Set<Customer> s = new HashSet<Customer>(listCustomer);

Otherwise, just use a Set implementation such as HashSet or TreeSet directly and skip the List construction phase.

You will also need to override hashCode() and equals() on the domain classes that are put in the Set, to make sure that the behavior you want is actually what you get. equals() can be as simple as comparing the unique IDs of the objects or as complex as comparing every field. hashCode() can be as simple as returning the hashCode() of the unique ID or of its String representation.

#16


-2  

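import java.util.Arrays;

class RemoveDuplicates
{
    public static void main(String[] args)
    {
        int[] values = {1, 2, 3, 1, 1, 1, 2, 2, 2};
        int[] unique = new int[values.length];
        int count = 0;
        for (int i = 0; i < values.length; i++)
        {
            // Check whether values[i] has already been kept.
            boolean seen = false;
            for (int j = 0; j < count; j++)
            {
                if (unique[j] == values[i])
                {
                    seen = true;
                    break;
                }
            }
            if (!seen)
            {
                unique[count++] = values[i];
            }
        }
        System.out.println(Arrays.toString(Arrays.copyOf(unique, count)));
    }
}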
