Java String中的hashCode函数

时间:2021-12-26 16:22:54

String 类中的hash函数如下:

    /**
     * Returns a hash code for this string. The hash code for a
     * <code>String</code> object is computed as
     * <blockquote><pre>
     * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
     * </pre></blockquote>
     * using <code>int</code> arithmetic, where <code>s[i]</code> is the
     * <i>i</i>th character of the string, <code>n</code> is the length of
     * the string, and <code>^</code> indicates exponentiation.
     * (The hash value of the empty string is zero.)
     *
     * @return  a hash code value for this object.
     */
    public int hashCode() {
	int h = hash;
	if (h == 0) {
	    int off = offset;
	    char val[] = value;
	    int len = count;

            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;
        }
        return h;
    }

 

上面的代码可见简洁和高效。JDK的源代码。

 

这个hash算法的名称叫:BKDR Hash Function

 

This hash function comes from Brian Kernighan and Dennis Ritchie's book "The C Programming Language". It is a simple hash function using a strange set of possible seeds which all constitute a pattern of 31....31...31 etc, it seems to be very similar to the DJB hash function.

 

算法来自于Brian Kernighan and Dennis Ritchie合著的C语言编程一书。函数用一个特殊的种子集合构建哈希函数

 

该算法代码:

   public long BKDRHash(String str)
   {
      long seed = 131; // 31 131 1313 13131 131313 etc..
      long hash = 0;

      for(int i = 0; i < str.length(); i++)
      {
         hash = (hash * seed) + str.charAt(i);
      }

      return hash;
   }
   /* End Of BKDR Hash Function */

 

参考路径:http://www.partow.net/programming/hashfunctions/index.html


具体的计算公式是:s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1],注意^不是异或,是乘方(http://*.com/questions/822363/proof-why-does-java-lang-string-hashcodes-implementation-match-its-documenta)。

I'm not sure if you missed where it says "^ indicates exponentiation" (not xor) in that documentation.

Each time through the loop, the previous value of hash is multipled by 31 again before being added to the next element of value.

One could prove these things are equal by induction, but I think an example might be more clear:

Say we're dealing with a 4-char string. Let's unroll the loop:

hash = 0; hash = 31 * hash + value[0]; hash = 31 * hash + value[1]; hash = 31 * hash + value[2]; hash = 31 * hash + value[3];

Now combine these into one statement by substituting each value of hash into the following statement:

hash = 31 * (31 * (31 * (31 * 0 + value[0]) + value[1]) + value[2]) + value[3];

31 * 0 is 0, so simplify:

hash = 31 * (31 * (31 * value[0] + value[1]) + value[2]) + value[3];

Now multiply the two inner terms by that second 31:

hash = 31 * (31 * 31 * value[0] + 31 * value[1] + value[2]) + value[3];

Now multiply the three inner terms by that first 31:

hash = 31 * 31 * 31 * value[0] + 31 * 31 * value[1] + 31 * value[2] + value[3];

and convert to exponents (not really Java anymore):

hash = 31^3 * value[0] + 31^2 * value[1] + 31^1 * value[2] + value[3];