Is it possible to create something like Comparator in Java, but for implementing custom equals() and hashCode()?

Time: 2020-12-10 16:47:39

I have an array of objects and I want to concatenate it with another array of objects, skipping objects that have the same id. These objects are used in many places in the system and don't have hashCode() or equals() implemented. I don't want to implement hashCode() and equals() on them, because I'm afraid of breaking something elsewhere in the system where these objects are used in ways I don't know about.

I want to put all these objects in a set, but somehow make the set use a custom hashCode() and equals(). Something like a custom Comparator, but for equality.

7 answers

#1


26  

Yes, it is possible to do such a thing. But it won't allow you to put your objects into a HashMap, HashSet, etc. That's because the standard collection classes expect the key objects themselves to provide the equals and hashCode methods. (That's the way they are designed to work ...)

Alternatives:

  1. Implement a wrapper class that holds an instance of the real class, and provides its own implementation of equals and hashCode.

  2. Implement your own hashtable-based classes which can use a "hashable" object to provide equals and hashCode functionality.

  3. Bite the bullet and implement equals and hashCode overrides on the relevant classes.

In fact, the 3rd option is probably the best, because your codebase most likely needs to use a consistent notion of what it means for these objects to be equal. Other things also suggest that your code needs an overhaul: for instance, it currently uses an array of objects instead of a Set implementation to represent what is apparently supposed to be a set.

On the other hand, maybe there was/is some real (or imagined) performance reason for the current implementation; e.g. reduction of memory usage. In that case, you should probably write a bunch of helper methods for doing operations like concatenating 2 sets represented as arrays.
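
A helper of that kind might look like the sketch below. It is only an illustration: `Item`, `getId()` and `concatDistinctById` are hypothetical names standing in for the real class and its id accessor, and it assumes the id is an int. It tracks the ids already seen instead of relying on equals()/hashCode() of the objects.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the real class: it has an id but no equals()/hashCode().
final class Item {
    private final int id;
    private final String name;

    Item(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

final class ItemArrays {
    private ItemArrays() {}

    // Returns the elements of first followed by those of second,
    // keeping only the first element seen for each id.
    static Item[] concatDistinctById(Item[] first, Item[] second) {
        Set<Integer> seenIds = new HashSet<>();
        List<Item> result = new ArrayList<>(first.length + second.length);
        for (Item[] source : new Item[][] {first, second}) {
            for (Item item : source) {
                // Set.add returns false when the id was already present.
                if (seenIds.add(item.getId())) {
                    result.add(item);
                }
            }
        }
        return result.toArray(new Item[0]);
    }
}
```

Only the Integer ids go into the HashSet, so the objects themselves never need their own equals() or hashCode().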

#2


13  

90% of the time when a user wants an equivalence relation there is already a more straightforward solution. You want to de-duplicate a bunch of things based on ids only? Can you just put them all into a Map with the ids as keys, then get the values() collection of that?
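
As a rough sketch of that idea (the `Product` class and the `mergeById` helper are made up for illustration, and an int id is assumed), a LinkedHashMap keyed by id keeps one object per id and preserves the order in which ids were first seen:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the real class: no equals()/hashCode() overrides.
final class Product {
    private final int id;
    private final String label;

    Product(int id, String label) {
        this.id = id;
        this.label = label;
    }

    int getId() { return id; }
    String getLabel() { return label; }
}

final class MapDedup {
    private MapDedup() {}

    // Merges both arrays, keeping the first object seen for each id.
    static List<Product> mergeById(Product[] first, Product[] second) {
        // LinkedHashMap iterates in the order the keys were first inserted.
        Map<Integer, Product> byId = new LinkedHashMap<>();
        for (Product p : first) {
            byId.putIfAbsent(p.getId(), p);
        }
        for (Product p : second) {
            byId.putIfAbsent(p.getId(), p);
        }
        return new ArrayList<>(byId.values());
    }
}
```

Using put() instead of putIfAbsent() would make the last object with a given id win rather than the first.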

#3


10  

HashingStrategy is the concept you're looking for. It's a strategy interface that allows you to define custom implementations of equals and hashcode.

public interface HashingStrategy<E>
{
    int computeHashCode(E object);
    boolean equals(E object1, E object2);
}

As others have pointed out, you can't use a HashingStrategy with the built-in HashSet or HashMap. Eclipse Collections includes a set called UnifiedSetWithHashingStrategy and a map called UnifiedMapWithHashingStrategy.

Let's look at an example. Here's a simple Data class we can use in a UnifiedSetWithHashingStrategy.

public class Data
{
    private final int id;

    public Data(int id)
    {
        this.id = id;
    }

    public int getId()
    {
        return id;
    }

    // No equals or hashcode
}

Here's how you might set up a UnifiedSetWithHashingStrategy and use it.

java.util.Set<Data> set =
  new UnifiedSetWithHashingStrategy<>(HashingStrategies.fromFunction(Data::getId));
Assert.assertTrue(set.add(new Data(1)));

// contains returns true even without hashcode and equals
Assert.assertTrue(set.contains(new Data(1)));

// Second call to add() doesn't do anything and returns false
Assert.assertFalse(set.add(new Data(1)));

Why not just use a Map? UnifiedSetWithHashingStrategy uses half the memory of a UnifiedMap, and one quarter the memory of a HashMap. And sometimes you don't have a convenient key and have to create a synthetic one, like a tuple. That can waste more memory.

How do we perform lookups? Remember that Sets have contains(), but not get(). UnifiedSetWithHashingStrategy implements Pool in addition to MutableSet, so it also implements a form of get().

Note: I am a committer for Eclipse Collections.

#4


4  

Of course you can create some external object that provides an equality comparison and a hash code. But Java's built-in collections do not use such an object for their comparisons and lookups.

I once created an interface like this in my collections package (freshly translated to English):

public interface EquivalenceRelation {

    /**
     * Returns true if two objects are considered equal.
     *
     * This should form an equivalence relation, meaning it
     * should fulfill these properties:
     *  <ul>
     *    <li>Reflexivity:  {@code areEqual(o, o)}
     *            should always return true.</li>
     *    <li>Symmetry: {@code areEqual(o1,o2) == areEqual(o2,o1)}
     *            for all objects o1 and o2</li>
     *    <li>Transitivity: If {@code areEqual(o1, o2)} and {@code areEqual(o2,o3)},
     *            then {@code areEqual(o1,o3)} should hold too.</li>
     *  </ul>
     * Additionally, the relation should be temporally consistent, i.e. the
     * result of this method for the same two objects should not change as
     * long as the objects do not change significantly (the precise meaning of
     * <em>change significantly</em> is dependent on the implementation).
     *
     * Also, if {@code areEqual(o1, o2)} holds true, then {@code hashCode(o1) == hashCode(o2)}
     * must be true too.
     */
    public boolean areEqual(Object o1, Object o2);

    /**
     * Returns a hashCode for an arbitrary object.
     *
     * This should be temporally consistent, i.e. the result for the same
     * objects should not change as long as the object does not change significantly
     * (with change significantly having the same meaning as for {@link #areEqual}).
     *
     * Also, if {@code areEqual(o1, o2)} holds true, then {@code hashCode(o1) == hashCode(o2)}
     * must be true too.
     */
    public int hashCode(Object o);

}

Then I had a group of interfaces (CustomCollection, CustomSet, CustomList, CustomMap, etc.) defined like the interfaces in java.util, but using such an equivalence relation for all the methods instead of the built-in relation given by Object.equals. I had some default implementations, too:

/**
 * The equivalence relation induced by Object#equals.
 */
public final static EquivalenceRelation DEFAULT =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2)
        {
            return
                o1 == o2 ||
                o1 != null &&
                o1.equals(o2);
        }
        public int hashCode(Object ob)
        {
            return
                ob == null?
                0 :
                ob.hashCode();
        }
        public String toString() { return "<DEFAULT>"; }
    };

/**
 * The equivalence relation induced by {@code ==}.
 * (The hashCode used is {@link System#identityHashCode}.)
 */
public final static EquivalenceRelation IDENTITY =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2) { return o1 == o2; }
        public int hashCode(Object ob) { return System.identityHashCode(ob); }
        public String toString() { return "<IDENTITY>"; }
    };

/**
 * The all-relation: every object is equivalent to every other one.
 */
public final static EquivalenceRelation ALL =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2) { return true; }
        public int hashCode(Object ob) { return 0; }
        public String toString() { return "<ALL>"; }
    };

/**
 * An equivalence relation partitioning the references
 * in two groups: the null reference and any other reference.
 */
public final static EquivalenceRelation NULL_OR_NOT_NULL =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2)
        {
            return (o1 == null && o2 == null) ||
                (o1 != null && o2 != null);
        }
        public int hashCode(Object o) { return o == null ? 0 : 1; }
        public String toString() { return "<NULL_OR_NOT_NULL>"; }
    };

/**
 * Two objects are equivalent if they are of the same (actual) class.
 */
public final static EquivalenceRelation SAME_CLASS =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2)
        {
            return o1 == o2 || o1 != null && o2 != null &&
                o1.getClass() == o2.getClass();
        }
        public int hashCode(Object o) { return o == null ? 0 : o.getClass().hashCode(); }
        public String toString() { return "<SAME_CLASS>"; }
    };


/**
 * Compares strings ignoring case.
 * Other objects give a {@link ClassCastException}.
 */
public final static EquivalenceRelation STRINGS_IGNORE_CASE =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2)
        {
            return o1 == null ?
                o2 == null :
                ((String)o1).equalsIgnoreCase((String)o2);
        }
        public int hashCode(Object o)
        {
            return o == null ? -12345 : ((String)o).toUpperCase().hashCode();
        }
        public String toString() { return "<STRINGS_IGNORE_CASE>"; }
    };


/**
 * Compares {@link CharSequence} implementations by content.
 * Other objects give a {@link ClassCastException}.
 */
public final static EquivalenceRelation CHAR_SEQUENCE_CONTENT =
    new EquivalenceRelation() {
        public boolean areEqual(Object o1, Object o2) 
        {
            CharSequence seq1 = (CharSequence)o1;
            CharSequence seq2 = (CharSequence)o2;
            if (seq1 == null ^ seq2 == null) // only one of the two is null
                return false;
            if (seq1 == seq2)   // also covers the case null == null
                return true;
            int size = seq1.length();
            if (seq2.length() != size)
                return false;
            for (int i = 0; i < size; i++)
                {
                    if (seq1.charAt(i) != seq2.charAt(i))
                        return false;
                }
            return true;
        }
        /**
         * Corresponds to String.hashCode
         */
        public int hashCode(Object o)
        {
            CharSequence sequence = (CharSequence)o;
            if (sequence == null)
                return 0;
            int hash = 0;
            int size = sequence.length();
            for (int i = 0; i < size; i++)
                {
                    hash = hash * 31 + sequence.charAt(i);
                }
            return hash;
        }
    };
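
The answer does not show the collection classes themselves. Purely as an illustration of how a set might consult such a relation, here is a minimal sketch. `Equivalence<E>` is a hypothetical generic variant of the interface above and `EquivalenceSet` is an invented name; neither is the author's CustomSet. Elements are grouped into buckets by the relation's hash code, and areEqual decides membership inside a bucket.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generic variant of the equivalence-relation interface.
interface Equivalence<E> {
    boolean areEqual(E a, E b);
    int hashCode(E e);
}

// A deliberately small set: it supports only add, contains and size.
final class EquivalenceSet<E> {
    private final Equivalence<? super E> relation;
    // Buckets keyed by the relation's hash code, not by Object.hashCode().
    private final Map<Integer, List<E>> buckets = new HashMap<>();
    private int size;

    EquivalenceSet(Equivalence<? super E> relation) {
        this.relation = relation;
    }

    // Adds the element unless an equivalent one is already present.
    boolean add(E element) {
        List<E> bucket =
            buckets.computeIfAbsent(relation.hashCode(element), k -> new ArrayList<>());
        for (E existing : bucket) {
            if (relation.areEqual(existing, element)) {
                return false;
            }
        }
        bucket.add(element);
        size++;
        return true;
    }

    boolean contains(E element) {
        List<E> bucket = buckets.get(relation.hashCode(element));
        if (bucket == null) {
            return false;
        }
        for (E existing : bucket) {
            if (relation.areEqual(existing, element)) {
                return true;
            }
        }
        return false;
    }

    int size() {
        return size;
    }
}
```

With a case-insensitive relation on strings, for example, adding "Hello" and then "HELLO" would leave a single element in the set, because the second add finds an equivalent element in the same bucket.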

#5


1  

Would using a TreeSet help here? A TreeSet actually performs ordering and Set based behavior using compare/compareTo and allows you to define a custom Comparator for use in one of the constructors.
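
A sketch of what this could look like, assuming the objects expose an int id (the `Ticket` class and `mergeById` helper are hypothetical names used only for this example):

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the real class: no equals()/hashCode() overrides.
final class Ticket {
    private final int id;
    private final String title;

    Ticket(int id, String title) {
        this.id = id;
        this.title = title;
    }

    int getId() { return id; }
    String getTitle() { return title; }
}

final class TreeSetDedup {
    private TreeSetDedup() {}

    static Set<Ticket> mergeById(Ticket[] first, Ticket[] second) {
        // TreeSet treats two elements as duplicates when the comparator returns 0,
        // so equals() and hashCode() of Ticket are never consulted.
        Comparator<Ticket> byId = Comparator.comparingInt(Ticket::getId);
        Set<Ticket> merged = new TreeSet<>(byId);
        merged.addAll(Arrays.asList(first));
        merged.addAll(Arrays.asList(second));
        return merged;
    }
}
```

Two trade-offs are worth keeping in mind: the result is ordered by id rather than by insertion order, and the comparator has to define a real total order, which is easy for ids but not for an arbitrary notion of "same object".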

#6


1  

Just had this problem and worked up a simple solution. Not sure how memory-intensive it is; I'm sure people can refine it down the line.

When the Comparator returns 0, the elements match.

public static <E> Set<E> filterSet(Set<E> set, Comparator<E> comparator){
    Set<E> output = new HashSet<E>();
    for(E eIn : set){
        boolean add = true;
        for(E eOut : output){
            if(comparator.compare(eIn, eOut) == 0){
                add = false;
                break;
            }
        }
        if(add) output.add(eIn);
    }
    return output;
}

My use case was that I needed to filter out duplicate URLs, as in URLs that point to the same document. The URL object has a sameFile() method that returns true if everything except the fragment is the same.

filtered = Misc.filterSet(filtered, (a, b) -> a.sameFile(b) ? 0 : 1);

#7


0  

You will not succeed doing your de-duplicating concatenation with a Comparator. Presumably you are looking to do something like this:

List<Object> list = new ArrayList<Object>();
list.addAll( a );
list.addAll( b );
Collections.sort( list, new MyCustomComparator() );

The problem is that a Comparator needs to compare not just for equals/not-equals, but also for relative order. Given objects x and y that are not equal, you have to answer which one is greater than the other. You will not be able to do that, since you aren't actually trying to order the objects. If you don't give a consistent answer, the result of the sort is undefined: you may get an arbitrary order, or the sort may throw an IllegalArgumentException ("Comparison method violates its general contract!").

I do have a solution for you. Java has a class called LinkedHashSet, whose virtue is that it doesn't allow duplicates to be inserted, but maintains insertion order. Rather than implementing a comparator, implement a wrapper class to hold the actual object and implement hashCode/equals.
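
A minimal sketch of that approach, assuming an int id; `Customer`, `CustomerById` and `mergeById` are hypothetical names chosen for the example:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the real class: no equals()/hashCode() overrides.
final class Customer {
    private final int id;
    private final String name;

    Customer(int id, String name) {
        this.id = id;
        this.name = name;
    }

    int getId() { return id; }
    String getName() { return name; }
}

// Wrapper that supplies id-based equals()/hashCode() without touching Customer.
final class CustomerById {
    private final Customer customer;

    CustomerById(Customer customer) {
        this.customer = customer;
    }

    Customer unwrap() { return customer; }

    @Override
    public boolean equals(Object other) {
        return other instanceof CustomerById
            && ((CustomerById) other).customer.getId() == customer.getId();
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(customer.getId());
    }
}

final class WrapperDedup {
    private WrapperDedup() {}

    // Concatenates both arrays in order, dropping later objects with an id already seen.
    static List<Customer> mergeById(Customer[] first, Customer[] second) {
        Set<CustomerById> seen = new LinkedHashSet<>();
        for (Customer c : first) {
            seen.add(new CustomerById(c));
        }
        for (Customer c : second) {
            seen.add(new CustomerById(c));
        }
        List<Customer> result = new ArrayList<>(seen.size());
        for (CustomerById wrapped : seen) {
            result.add(wrapped.unwrap());
        }
        return result;
    }
}
```

The original class stays untouched, so code elsewhere that depends on its identity-based equals() keeps behaving as before.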
