重写equals为何总要重写hashCode

时间:2022-06-13 16:02:46

  • Object中的定义
/**

     * Returns a hash code value for the object. This method is

     * supported for the benefit of hash tables such as those provided by

     * {@link java.util.HashMap}.

     * <p>

     * The general contract of {@code hashCode} is:

     *

     * 每当在Java应用程序的执行过程中多次调用同一个对象时,

     * @code hashCode方法必须始终返回相同的整数,

     * 前提是在@code中使用的信息没有被修改。

     * 这个整数不需要保持一致,从一个应用程序的执行到同一个应用程序的另一个执行。  

     * <li>Whenever it is invoked on the same object more than once during

     *     an execution of a Java application, the {@code hashCode} method

     *     must consistently return the same integer, provided no information

     *     used in {@code equals} comparisons on the object is modified.

     *     This integer need not remain consistent from one execution of an

     *     application to another execution of the same application.

     *

     * 如果根据@code equals(Object)方法,两个对象是相等的,

     * 那么在这两个对象上调用@code hashCode方法必须产生相同的整数结果。

     * <li>If two objects are equal according to the {@code equals(Object)}

     *     method, then calling the {@code hashCode} method on each of

     *     the two objects must produce the same integer result.

     *

     * 如果两个对象是不相等的,这是不需要的

     * 根据@link java.lang.object equals(java.lang.object)方法,

     * 然后在每一个上调用@code hashCode方法,两个对象必须产生不同的整数结果。

     * 然而,程序员应该意识到产生不同的整数结果对于不相等的对象,可以提高散列表的性能。

     * <li>It is <em>not</em> required that if two objects are unequal

     *     according to the {@link java.lang.Object#equals(java.lang.Object)}

     *     method, then calling the {@code hashCode} method on each of the

     *     two objects must produce distinct integer results.  However, the

     *     programmer should be aware that producing distinct integer results

     *     for unequal objects may improve the performance of hash tables.

     * </ul>

     * <p>

     * As much as is reasonably practical, the hashCode method defined by

     * class {@code Object} does return distinct integers for distinct

     * objects. (This is typically implemented by converting the internal

     * address of the object into an integer, but this implementation

     * technique is not required by the

     * Java™ programming language.)

     *

     * @return  a hash code value for this object.

     * @see     java.lang.Object#equals(java.lang.Object)

     * @see     java.lang.System#identityHashCode

     */

    public native int hashCode();

 

    /**

     * Indicates whether some other object is "equal to" this one.

     * <p>

     * The {@code equals} method implements an equivalence relation

     * on non-null object references:

     * <ul>

     * <li>It is <i>reflexive</i>: for any non-null reference value

     *     {@code x}, {@code x.equals(x)} should return

     *     {@code true}.

     * <li>It is <i>symmetric</i>: for any non-null reference values

     *     {@code x} and {@code y}, {@code x.equals(y)}

     *     should return {@code true} if and only if

     *     {@code y.equals(x)} returns {@code true}.

     * <li>It is <i>transitive</i>: for any non-null reference values

     *     {@code x}, {@code y}, and {@code z}, if

     *     {@code x.equals(y)} returns {@code true} and

     *     {@code y.equals(z)} returns {@code true}, then

     *     {@code x.equals(z)} should return {@code true}.

     * <li>It is <i>consistent</i>: for any non-null reference values

     *     {@code x} and {@code y}, multiple invocations of

     *     {@code x.equals(y)} consistently return {@code true}

     *     or consistently return {@code false}, provided no

     *     information used in {@code equals} comparisons on the

     *     objects is modified.

     * <li>For any non-null reference value {@code x},

     *     {@code x.equals(null)} should return {@code false}.

     * </ul>

     * <p>

     * The {@code equals} method for class {@code Object} implements

     * the most discriminating possible equivalence relation on objects;

     * that is, for any non-null reference values {@code x} and

     * {@code y}, this method returns {@code true} if and only

     * if {@code x} and {@code y} refer to the same object

     * ({@code x == y} has the value {@code true}).

     *

     * 请注意,每当这个方法被覆盖时,通常都需要覆盖@code hashCode方法,

     * 以便维护@code hashCode方法的一般契约,

     * 该方法规定相等对象必须具有相等的散列码。

     * Note that it is generally necessary to override the {@code hashCode}

     * method whenever this method is overridden, so as to maintain the

     * general contract for the {@code hashCode} method, which states

     * that equal objects must have equal hash codes.

     *

     * @param   obj   the reference object with which to compare.

     * @return  {@code true} if this object is the same as the obj

     *          argument; {@code false} otherwise.

     * @see     #hashCode()

     * @see     java.util.HashMap

     */

    public boolean equals(Object obj) {

        return (this == obj);

}

 

  • 重写equals一定要重写hashCode

Object。hashCode的通用约定,我们可以直观地看出:覆盖equals必须覆盖hashCode。

如果没有覆盖hashCode会违反Oject.hashCode通用约定的第二条:相等的对象必须具有相等的散列码,导致该类无法结合所有基于散列的集合(HashMap、HashSet和Hashtable)一起正常运作。同时,不相等的对象产生截然不同的hash结果,有利于提高散列表的性能。

equals与hashCode都用鉴别是否是同一对象,前者比较行为比较为全面复杂(意味着效率低),后者只要用哈希值比较就行(效率高)。但是利用hashCode比较不是绝对可靠的。所以为提高效率: 一般先利用hashCode去比较,如果不一样,那么它们就不相等;否则,再用equals比较。这样会大大提高比较的效率。

注:不同对象的hashCode一定不同吗?不一定(这可能是因为:生成哈希值的算法可能存在问题或哈希值的长度限制)

 

举个例子,

public final class PhoneNumber {
	private final short areaCode;
	private final short prefix;
	private final short lineNumber;
	
	public PhoneNumber(int areaCode, int prefix, int lineNumber) {
		rangeCheck(areaCode,999,"area code");
		rangeCheck(prefix,999,"prefix");
		rangeCheck(lineNumber,9999,"line number");
		
		this.areaCode = (short)areaCode;
		this.prefix = (short)prefix;
		this.lineNumber = (short)lineNumber;
	}
	
	private static void rangeCheck(int arg,int max,String name){
		if(arg<0||arg>max)
		{
			throw new IllegalArgumentException(name+":"+arg);
		}
	}
	
	/**
	 * 这个hashCode是合法的,它确保了相等的对象具有同样的散列码
	 * 同时,这种定义极为恶劣,因为它使每个对象都具有相同的散列码。
	 * 这样会使每个对象都被映射到同一个散列桶中,使散列表退化为链表(本该线性时间运行变为了平方时间运行)
	 * 注:此处这样写仅为方便地演示具体事例。
	 */
//	@Override
//	public int hashCode()
//	{
//		return 42;
//	}
	
	@Override
	public boolean equals(Object obj) {
		if (this == obj)
			return true;
		if(!(obj instanceof PhoneNumber))
			return false;
		PhoneNumber pn=(PhoneNumber)obj;
		return pn.lineNumber==lineNumber 
				&&pn.prefix==prefix
				&&pn.areaCode==areaCode;		
	}
	
}

import java.util.HashMap;
import java.util.Map;

public class Test {
	public static void main(String[] args) {
		Map<PhoneNumber, String> m=new HashMap<PhoneNumber, String>();
		
		m.put(new PhoneNumber(707, 867,5309),"Jenny");
		System.out.println(m.get(new PhoneNumber(707, 867,5309)));
	}

}

 

当我们期待通过m.get(new PhoneNumber(707, 867,5309))返回Jenny的时候,实际输出结果为null。为什么?因为没有重写hashCode方法,导致相等的实例的散列码不同。put方法将电话号码对象存放在一个散列桶中,get方法却在另一个散列桶中查找这个电话号码。(即使这两个正好放在一个桶中,get方法也会返回null。因为HashMap有一项优化,可以将每个项相关联的散列码缓存起来,如果散列码不匹配,也不必检验对象的等同性)

如何修正这个问题?只需在PhoneNumber 类中重写一个合适的HashCode。

我们将PhoneNumber 中的HashCode取消注释,再运行效果就出来了。


参考文献《Effective Java》