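Overview

HashMap is a hash table implemented with separate chaining. The class implements the Map interface, which provides all of the map operations such as get(), put(), and remove(), and it permits null keys and null values. HashMap is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls. HashMap makes no guarantees about the order of its mappings; in particular, it does not guarantee that the order will remain constant over time.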
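The difference in null handling can be checked directly. The following is a minimal sketch that uses only java.util classes; the class name NullKeyDemo and the helper acceptsNullKey are made up for this illustration.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class NullKeyDemo {
    // Returns true if the map accepts a null key, false if put throws.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // HashMap stores the null key; Hashtable rejects it.
        System.out.println(acceptsNullKey(new HashMap<String, String>()));
        System.out.println(acceptsNullKey(new Hashtable<String, String>()));
    }
}
```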

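Provided the hash function disperses the keys evenly among the buckets, the basic operations put and get run in O(1) time. Iterating over the map's collection views takes time proportional to the capacity of the hash table (the number of buckets) plus its size (the number of key-value mappings). So if iteration performance matters, do not set the initial capacity too high (or the load factor too low).

Two parameters have a major effect on HashMap performance: the initial capacity and the load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the table is created. The load factor measures how full the table is; as a ratio it equals size (the number of key-value mappings) / capacity. When this ratio exceeds the configured load factor, the table is rehashed, which doubles the number of buckets. In HashMap the capacity is always a power of two; a requested initial capacity is rounded up to the next one.

The default load factor is 0.75, which offers a good tradeoff between time and space costs. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries exceeds the product of the load factor and the current capacity, the table is rehashed (that is, its internal data structures are rebuilt) so that it has approximately twice the number of buckets.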
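To make the arithmetic concrete, here is a small sketch. The threshold formula is the one used by the constructor quoted later in this article; the class ThresholdDemo and the helper resizesNeeded are hypothetical and only count how many doublings would be needed before a given number of entries fits without a further rehash.

```java
// Hypothetical demo class, not part of the JDK.
class ThresholdDemo {
    // Same formula as the quoted constructor: (int)(capacity * loadFactor).
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Counts the doublings needed, starting from initialCapacity, until the
    // threshold is at least expectedEntries. Illustrative helper only; it
    // assumes the capacities involved stay far below 1 << 30.
    static int resizesNeeded(int initialCapacity, float loadFactor, int expectedEntries) {
        int capacity = initialCapacity;
        int resizes = 0;
        while (threshold(capacity, loadFactor) < expectedEntries) {
            capacity <<= 1;
            resizes++;
        }
        return resizes;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));             // 12
        System.out.println(resizesNeeded(16, 0.75f, 1000));   // 7
        System.out.println(resizesNeeded(2048, 0.75f, 1000)); // 0
    }
}
```

With the defaults, a map that will eventually hold 1000 entries is rehashed seven times on the way up, while asking for a capacity of 2048 up front avoids all of them at the cost of a larger array.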

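HashMap is not synchronized. If multiple threads access a map concurrently and at least one of them modifies it structurally, the map must be synchronized externally. A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that the map already contains is not a structural modification.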
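One common way to add that external synchronization is to wrap the map with Collections.synchronizedMap. The sketch below is only an illustration of this option; SyncMapDemo and concurrentInserts are hypothetical names, and the two writer threads insert disjoint key ranges.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class SyncMapDemo {
    // Two threads each insert perThread distinct keys into a synchronized
    // wrapper around a HashMap; returns the final size.
    static int concurrentInserts(int perThread) {
        Map<Integer, Integer> map = Collections.synchronizedMap(new HashMap<>());
        Thread a = new Thread(() -> {
            for (int i = 0; i < perThread; i++) map.put(i, i);
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < perThread; i++) map.put(perThread + i, i);
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(concurrentInserts(1000)); // 2000
    }
}
```

Every put on the wrapper holds the wrapper's lock, so both structural modifications and resizes are serialized.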

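HashMap's data structure

In Java, the two most basic building blocks are arrays and references (which play the role of pointers). Any data structure can be built from these two, and HashMap is no exception. HashMap is an array of linked lists, that is, a combination of an array and linked lists, because it resolves hash collisions by separate chaining.

As the constructors in section 2 show, the underlying structure of HashMap is an array, and each slot of the array is a linked list. Creating a new HashMap allocates this array.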

 static class Entry<K,V> implements Map.Entry<K,V> {

         final K key;

         V value;

         Entry<K,V> next;

         final int hash;

         ........

 }

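Entry is the element type of the array. Each Map.Entry is a key-value pair that also holds a reference to the next entry, which is what forms the linked list. The hash field is used to compute the bucket index, and each bucket is one linked list.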

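Reading the source

The source code of the HashMap class falls into the following parts:

1) HashMap's own member variables and methods;

2) The members and methods of Entry. Entry is a static nested class of HashMap; the hash table is an array of Entry, and each slot of the array represents one bucket.

1. Class member variables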

/**

* The default initial capacity of the hash table; must be a power of two.

*/

static final int DEFAULT_INITIAL_CAPACITY = 16;

/**

* The maximum capacity of the hash table.

*/

static final int MAXIMUM_CAPACITY = 1 << 30;

/**

* The default load factor, used when none is specified in the constructor.

*/

static final float DEFAULT_LOAD_FACTOR = 0.75f;

/**

* The table, resized as necessary; its length must always be a power of two.

*/

transient Entry[] table;

/**

* The number of key-value mappings contained in the hash table.

*/

transient int size;

/**

* The threshold: when the number of key-value mappings reaches this value, the table is resized. Equal to capacity * load factor.

* @serial

*/

int threshold;

/**

* The load factor of the hash table.

*

* @serial

*/

final float loadFactor;

/**

* The number of times the hash table has been structurally modified.

*/

transient volatile int modCount;

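The source above defines the key parameters of the hash table. HashMap implements the Serializable interface, so its instances can be serialized. Fields declared with the transient keyword are skipped by default serialization; for details see http://www.cnblogs.com/lanxuezaipiao/p/3369962.html.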
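Although table and size are transient, a serialized HashMap still carries its mappings, because the class supplies its own writeObject and readObject methods that write and read the entries one by one. The sketch below round-trips a map through an in-memory byte stream to show this; SerializeDemo and roundTrip are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;

// Hypothetical demo class, not part of the JDK.
class SerializeDemo {
    // Serializes the map to a byte array and reads it back as a new object.
    @SuppressWarnings("unchecked")
    static HashMap<String, Integer> roundTrip(HashMap<String, Integer> original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (HashMap<String, Integer>) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put(null, 2);
        HashMap<String, Integer> copy = roundTrip(map);
        System.out.println(copy.equals(map)); // true: same mappings, different object
    }
}
```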

2. Constructors

HashMap has four constructors:

 /**

      * Constructs an empty <tt>HashMap</tt> with the specified initial

      * capacity and load factor.

      *

      * @param  initialCapacity the initial capacity

      * @param  loadFactor      the load factor

      * @throws IllegalArgumentException if the initial capacity is negative

      *         or the load factor is nonpositive

      */

     public HashMap(int initialCapacity, float loadFactor) {

         if (initialCapacity < 0)

             throw new IllegalArgumentException("Illegal initial capacity: " +

                                                initialCapacity);

         if (initialCapacity > MAXIMUM_CAPACITY)

             initialCapacity = MAXIMUM_CAPACITY;

         if (loadFactor <= 0 || Float.isNaN(loadFactor))

             throw new IllegalArgumentException("Illegal load factor: " +

                                                loadFactor);

         // Find a power of 2 >= initialCapacity

         int capacity = 1;

         while (capacity < initialCapacity)

             capacity <<= 1;

         this.loadFactor = loadFactor;

         threshold = (int)(capacity * loadFactor);

         table = new Entry[capacity];

         init();

     }

     /**

      * Constructs an empty <tt>HashMap</tt> with the specified initial

      * capacity and the default load factor (0.75).

      *

      * @param  initialCapacity the initial capacity.

      * @throws IllegalArgumentException if the initial capacity is negative.

      */

     public HashMap(int initialCapacity) {

         this(initialCapacity, DEFAULT_LOAD_FACTOR);

     }

     /**

      * Constructs an empty <tt>HashMap</tt> with the default initial capacity

      * (16) and the default load factor (0.75).

      */

     public HashMap() {

         this.loadFactor = DEFAULT_LOAD_FACTOR;

         threshold = (int)(DEFAULT_INITIAL_CAPACITY * DEFAULT_LOAD_FACTOR);

         table = new Entry[DEFAULT_INITIAL_CAPACITY];

         init();

     }

     /**

      * Constructs a new <tt>HashMap</tt> with the same mappings as the

      * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with

      * default load factor (0.75) and an initial capacity sufficient to

      * hold the mappings in the specified <tt>Map</tt>.

      *

      * @param   m the map whose mappings are to be placed in this map

      * @throws  NullPointerException if the specified map is null

      */

     public HashMap(Map<? extends K, ? extends V> m) {

         this(Math.max((int) (m.size() / DEFAULT_LOAD_FACTOR) + 1,

                       DEFAULT_INITIAL_CAPACITY), DEFAULT_LOAD_FACTOR);

         putAllForCreate(m);

     }

The main job of the constructors is to initialize capacity, loadFactor, threshold, and table. Note that the fourth constructor creates a new HashMap from the mappings of an existing Map.
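The capacity arithmetic in these constructors can be tried in isolation. The sketch below copies the rounding loop and the sizing expression from the code quoted above into a hypothetical class CapacityDemo; the method names roundUp and copyConstructorCapacity are made up for the illustration.

```java
// Hypothetical demo class, not part of the JDK.
class CapacityDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same loop as the quoted constructor: the smallest power of two that is
    // greater than or equal to initialCapacity (capped at MAXIMUM_CAPACITY).
    static int roundUp(int initialCapacity) {
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Capacity the fourth constructor ends up with for a source map of the
    // given size, using the default load factor 0.75 and default capacity 16.
    static int copyConstructorCapacity(int sourceSize) {
        return roundUp(Math.max((int) (sourceSize / 0.75f) + 1, 16));
    }

    public static void main(String[] args) {
        System.out.println(roundUp(16));                  // 16
        System.out.println(roundUp(17));                  // 32
        System.out.println(copyConstructorCapacity(5));   // 16
        System.out.println(copyConstructorCapacity(100)); // 256
    }
}
```

For a source map of 100 entries the requested capacity is 134, which rounds up to 256, so the copy is filled without any intermediate resize.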

3. Basic hash operations

3.1 Computing the index

 /**

      * Applies a supplemental hash function to a given hashCode, which

      * defends against poor quality hash functions.  This is critical

      * because HashMap uses power-of-two length hash tables, that

      * otherwise encounter collisions for hashCodes that do not differ

      * in lower bits. Note: Null keys always map to hash 0, thus index 0.

      */

     static int hash(int h) {

         // This function ensures that hashCodes that differ only by

         // constant multiples at each bit position have a bounded

         // number of collisions (approximately 8 at default load factor).

         h ^= (h >>> 20) ^ (h >>> 12);

         return h ^ (h >>> 7) ^ (h >>> 4);

     }

     /**

      * Returns index for hash code h.

      */

     static int indexFor(int h, int length) {

         return h & (length-1);

     }

These two functions compute the index, that is, the bucket in which a key-value pair should be stored. The input to hash is the key's hashCode().
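The effect of the supplemental hash is easiest to see with hash codes that differ only in their high bits. The sketch below copies hash and indexFor from the code above into a hypothetical class IndexDemo and adds a made-up helper bucketOf that combines them the way put does.

```java
// Hypothetical demo class; hash and indexFor are copied from the code above.
class IndexDemo {
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Bucket index for a key in a table of the given power-of-two length.
    // Null keys go to bucket 0, as in putForNullKey.
    static int bucketOf(Object key, int tableLength) {
        return key == null ? 0 : indexFor(hash(key.hashCode()), tableLength);
    }

    public static void main(String[] args) {
        // Without the supplemental hash, both raw hash codes fall into bucket 0.
        System.out.println(indexFor(0x10000, 16)); // 0
        System.out.println(indexFor(0x20000, 16)); // 0
        // With it, the high bits influence the index.
        System.out.println(bucketOf(0x10000, 16)); // 1
        System.out.println(bucketOf(0x20000, 16)); // 2
    }
}
```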

3.2 Put

 /**

      * Associates the specified value with the specified key in this map.

      * If the map previously contained a mapping for the key, the old

      * value is replaced.

      *

      * @param key key with which the specified value is to be associated

      * @param value value to be associated with the specified key

      * @return the previous value associated with <tt>key</tt>, or

      *         <tt>null</tt> if there was no mapping for <tt>key</tt>.

      *         (A <tt>null</tt> return can also indicate that the map

      *         previously associated <tt>null</tt> with <tt>key</tt>.)

      */

     public V put(K key, V value) {

         if (key == null)

             return putForNullKey(value);

         int hash = hash(key.hashCode());

         int i = indexFor(hash, table.length);

         for (Entry<K,V> e = table[i]; e != null; e = e.next) {

             Object k;

             if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {

                 V oldValue = e.value;

                 e.value = value;

                 e.recordAccess(this);

                 return oldValue;

             }

         }

         modCount++;

         addEntry(hash, key, value, i);

         return null;

     }

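When an element is put into a HashMap, the key's hashCode is first fed to the hash function to compute a hash value, and that hash value yields the element's index in the array, that is, the bucket number. If the slot already holds other entries and none of them has an equal key, the new entry is placed at the head of the linked list. If the slot is empty, the new entry is stored in the slot directly.

There are three cases: 1) the key is already in the HashMap; 2) the key is not present; 3) the key is null. In the first case, put locates the key and replaces the old value with the new one. In the second case, it calls addEntry, defined as follows: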

  void addEntry(int hash, K key, V value, int bucketIndex) {

         Entry<K,V> e = table[bucketIndex];

         table[bucketIndex] = new Entry<K,V>(hash, key, value, e);

         if (size++ >= threshold)

             resize(2 * table.length);

     }

addEntry calls the Entry constructor, shown below:

 Entry(int h, K k, V v, Entry<K,V> n) {

             value = v;

             next = n;

             key = k;

             hash = h;

         }

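As the code shows, a new Entry is inserted at the front: it becomes the head of the bucket's linked list. When HashMap decides where to store a key-value pair, it ignores the value entirely; the storage location of each Entry is computed from the key alone. In the third case, when the key is null, put calls putForNullKey: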

  private V putForNullKey(V value) {

         for (Entry<K,V> e = table[0]; e != null; e = e.next) {

             if (e.key == null) {

                 V oldValue = e.value;

                 e.value = value;

                 e.recordAccess(this);

                 return oldValue;

             }

         }

         modCount++;

         addEntry(0, null, value, 0);

         return null;

     }

When the key is null, the index is 0.
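Both the regular path and the null-key path end in addEntry, which links the new entry in front of whatever the bucket already holds. The following sketch isolates that head insertion on a bare linked list; HeadInsertDemo, Node, and chainOrder are hypothetical names, and the chain stands in for a single bucket.

```java
// Hypothetical demo class, not part of the JDK.
class HeadInsertDemo {
    static final class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Inserts the keys in the given order using head insertion and returns
    // the resulting chain, head first.
    static String chainOrder(String... keys) {
        Node head = null;
        for (String k : keys) {
            // Same shape as: table[i] = new Entry(hash, key, value, table[i])
            head = new Node(k, head);
        }
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(chainOrder("a", "b", "c")); // c -> b -> a
    }
}
```

The most recently inserted key is found first when the bucket is searched, and insertion itself never has to walk the chain.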

  static int hash(int h) {

         // This function ensures that hashCodes that differ only by

         // constant multiples at each bit position have a bounded

         // number of collisions (approximately 8 at default load factor).

         h ^= (h >>> 20) ^ (h >>> 12);

         return h ^ (h >>> 7) ^ (h >>> 4);

     }

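The hash(int h) method rehashes the key's hashCode. It mixes the high-order bits into the low-order bits, which prevents collisions between hash codes that differ only in their high bits.

To find an element in a HashMap, its position in the array is derived from the key's hash value. Because HashMap combines an array with linked lists, we want the elements to be distributed as evenly as possible, ideally one element per slot, so that a lookup does not have to traverse a linked list. This greatly improves query efficiency.

For any given object, as long as hashCode() returns the same value, hash(int h) always produces the same hash. The first idea that comes to mind is to take that hash modulo the array length, which distributes the elements fairly evenly. However, the modulo operation is relatively expensive, so HashMap instead calls indexFor(int h, int length) to compute the index in the table array at which the object should be stored.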

  static int indexFor(int h, int length) {

         return h & (length-1);

     }

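This method is quite clever: it obtains the index as h & (table.length-1). Because the length of HashMap's underlying array is always a power of two, this works and is a speed optimization. The HashMap constructor contains the following code: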

  // Find a power of 2 >= initialCapacity

         int capacity = 1;

         while (capacity < initialCapacity)

             capacity <<= 1;

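This code guarantees that the capacity of a newly constructed HashMap, that is, the length of the underlying array, is always a power of two. When length is a power of two, h & (length-1) is equivalent to taking h modulo length (for non-negative h it equals h % length), but & is cheaper than %.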
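The equivalence can be checked mechanically. The sketch below compares the mask against Math.floorMod for a handful of sample hashes, including negative ones, where Java's % would return a negative remainder. MaskVsModDemo and maskEqualsFloorMod are hypothetical names, and the sample values are arbitrary.

```java
// Hypothetical demo class, not part of the JDK.
class MaskVsModDemo {
    // Returns true if h & (length - 1) equals the non-negative remainder
    // floorMod(h, length) for every sample hash.
    static boolean maskEqualsFloorMod(int length) {
        int[] samples = {0, 1, 15, 16, 17, 12345, -1, -17,
                         Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int h : samples) {
            if ((h & (length - 1)) != Math.floorMod(h, length))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(maskEqualsFloorMod(16)); // true: 16 is a power of two
        System.out.println(maskEqualsFloorMod(10)); // false: 10 is not
        System.out.println(-17 % 16);               // -1, not usable as an index
        System.out.println(-17 & 15);               // 15
    }
}
```

The mask therefore only replaces the modulo when the length is a power of two, which is why the constructor rounds the capacity up.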

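When the array length is 2ⁿ, the binary representation of 2ⁿ-1 has a 1 in each of its low n bits, so the & simply keeps the low n bits of the hash and every slot of the array can be reached. Combined with hash(int h), which mixes the high bits of the key's hashCode into those low bits, two keys land in the same slot only when the low n bits of their hashes agree. So with a power-of-two length, different keys are less likely to produce the same index, the data is spread more evenly over the array, collisions are rarer, and a lookup usually does not have to walk a long linked list, which makes queries more efficient.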

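3.3 Get

Look up the value for a given key. The key may be null, in which case its index is 0.

Look up the Entry (key-value pair) for a given key; if the key is not present, null is returned.

Internally, HashMap treats each key-value pair as a single unit, an Entry object. It stores all key-value pairs in an Entry array. To store an Entry, the hash algorithm determines its slot in the array, and equals then determines whether an entry with the same key already exists in that slot's linked list. To read an Entry, the hash algorithm again locates the slot, and equals picks the Entry out of that slot's linked list.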
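The store and read paths described above can be sketched with a toy chained map. This is not the JDK source: MiniChainedMap and its Node class are hypothetical, the table has a fixed length of 16 and never resizes, and only the hash mixing and the index mask are taken from the code quoted in this article.

```java
// Hypothetical toy map for illustration; not the JDK implementation.
class MiniChainedMap<K, V> {
    private static final class Node<K, V> {
        final int hash;
        final K key;
        V value;
        Node<K, V> next;

        Node(int hash, K key, V value, Node<K, V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // Null keys hash to 0; other keys use the mixing shown in section 3.1.
    private static int hash(Object key) {
        if (key == null) return 0;
        int h = key.hashCode();
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    private static boolean sameKey(Object a, Object b) {
        return a == b || (a != null && a.equals(b));
    }

    // Replaces the value if the key exists, otherwise inserts at the head.
    V put(K key, V value) {
        int hash = hash(key);
        int i = hash & (table.length - 1);
        for (Node<K, V> e = table[i]; e != null; e = e.next) {
            if (e.hash == hash && sameKey(key, e.key)) {
                V old = e.value;
                e.value = value;
                return old;
            }
        }
        table[i] = new Node<>(hash, key, value, table[i]);
        return null;
    }

    // Locates the bucket by hash, then walks the chain comparing with equals.
    V get(Object key) {
        int hash = hash(key);
        for (Node<K, V> e = table[hash & (table.length - 1)]; e != null; e = e.next) {
            if (e.hash == hash && sameKey(key, e.key))
                return e.value;
        }
        return null;
    }
}
```

Comparing the cached hash before calling equals lets most non-matching entries in a chain be skipped with a single int comparison.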

3.4 Rehash

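When the number of elements in the hash table reaches the threshold, the table is automatically rehashed and the capacity doubles. Note that when the capacity already equals MAXIMUM_CAPACITY, the table is not resized; instead the threshold is set to Integer.MAX_VALUE. Rehashing is a relatively expensive operation. It is implemented by the resize function: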

 void resize(int newCapacity) {

         Entry[] oldTable = table;

         int oldCapacity = oldTable.length;

         if (oldCapacity == MAXIMUM_CAPACITY) {

             threshold = Integer.MAX_VALUE;

             return;

         }

         Entry[] newTable = new Entry[newCapacity];

         transfer(newTable);

         table = newTable;

         threshold = (int)(newCapacity * loadFactor);

     }

resize calls transfer, which remaps the elements of the old table into the new table. The code of transfer is as follows:

  void transfer(Entry[] newTable) {

         Entry[] src = table;

         int newCapacity = newTable.length;

         for (int j = 0; j < src.length; j++) {

             Entry<K,V> e = src[j];

             if (e != null) {

                 src[j] = null;

                 do {

                     Entry<K,V> next = e.next;

                     int i = indexFor(e.hash, newCapacity);

                     e.next = newTable[i];

                     newTable[i] = e;

                     e = next;

                 } while (e != null);

             }

         }

     }
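Two consequences of this loop are worth seeing on a small example: entries that shared a bucket may be split across two buckets of the larger table, and those that stay together end up in reversed order because each one is linked in at the head. The sketch below mirrors the loop on a stripped-down node type; TransferDemo, Node, sample, and chain are hypothetical names, and the nodes carry only a hash.

```java
// Hypothetical demo class mirroring the transfer loop above.
class TransferDemo {
    static final class Node {
        final int hash;
        Node next;

        Node(int hash, Node next) {
            this.hash = hash;
            this.next = next;
        }
    }

    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // Moves every node of src into a new table of length newCapacity.
    static Node[] transfer(Node[] src, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (int j = 0; j < src.length; j++) {
            Node e = src[j];
            src[j] = null;
            while (e != null) {
                Node next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i]; // link in front of the new bucket
                newTable[i] = e;
                e = next;
            }
        }
        return newTable;
    }

    // A 16-slot table whose bucket 1 holds hashes 33, 17, 1 (head first).
    static Node[] sample() {
        Node[] table = new Node[16];
        for (int h : new int[] {1, 17, 33})
            table[1] = new Node(h, table[1]);
        return table;
    }

    static String chain(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append("->");
            sb.append(n.hash);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] grown = transfer(sample(), 32);
        System.out.println(chain(grown[1]));  // 1->33
        System.out.println(chain(grown[17])); // 17
    }
}
```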

3.5 Remove

Removing the Entry for a given key is done by remove, which delegates to removeEntryForKey:

  final Entry<K,V> removeEntryForKey(Object key) {

         int hash = (key == null) ? 0 : hash(key.hashCode());

         int i = indexFor(hash, table.length);

         Entry<K,V> prev = table[i];

         Entry<K,V> e = prev;

         while (e != null) {

             Entry<K,V> next = e.next;

             Object k;

             if (e.hash == hash &&

                 ((k = e.key) == key || (key != null && key.equals(k)))) {

                 modCount++;

                 size--;

                 if (prev == e)

                     table[i] = next;

                 else

                     prev.next = next;

                 e.recordRemoval(this);

                 return e;

             }

             prev = e;

             e = next;

         }

         return e;

     }

It removes and returns the Entry associated with key; if the hash table contains no mapping for key, it returns null.

The clear function empties the hash table:
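The prev/e bookkeeping in removeEntryForKey is the usual way to unlink a node from a singly linked list. A minimal sketch on a bare chain, with hypothetical names UnlinkDemo, Node, of, and show, and assuming non-null keys for brevity:

```java
// Hypothetical demo class, not part of the JDK.
class UnlinkDemo {
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Removes the first node whose key equals the given key and returns the
    // (possibly new) head of the chain.
    static Node remove(Node head, String key) {
        Node prev = head;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if (e.key.equals(key)) {
                if (prev == e)
                    head = next;      // removing the head: chain now starts at next
                else
                    prev.next = next; // otherwise bypass e
                return head;
            }
            prev = e;
            e = next;
        }
        return head; // key not found, chain unchanged
    }

    // Builds a chain in the given order, head first.
    static Node of(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--)
            head = new Node(keys[i], head);
        return head;
    }

    static String show(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(",");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(show(remove(of("a", "b", "c"), "a"))); // b,c
        System.out.println(show(remove(of("a", "b", "c"), "b"))); // a,c
    }
}
```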

  /**

      * Removes all of the mappings from this map.

      * The map will be empty after this call returns.

      */

     public void clear() {

         modCount++;

         Entry[] tab = table;

         for (int i = 0; i < tab.length; i++)

             tab[i] = null;

         size = 0;

     }

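As shown, the function walks the Entry array and sets every slot to null.

3.6 Containment

The following function checks whether a given value is stored in the hash table: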

 /**

      * Returns <tt>true</tt> if this map maps one or more keys to the

      * specified value.

      *

      * @param value value whose presence in this map is to be tested

      * @return <tt>true</tt> if this map maps one or more keys to the

      *         specified value

      */

     public boolean containsValue(Object value) {

         if (value == null)

             return containsNullValue();

         Entry[] tab = table;

         for (int i = 0; i < tab.length ; i++)

             for (Entry e = tab[i] ; e != null ; e = e.next)

                 if (value.equals(e.value))

                     return true;

         return false;

     }

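There is no clever trick here; it simply traverses the whole hash table. The value may also be null, which is handled by a dedicated function, containsNullValue: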

  /**

      * Special-case code for containsValue with null argument

      */

     private boolean containsNullValue() {

         Entry[] tab = table;

         for (int i = 0; i < tab.length ; i++)

             for (Entry e = tab[i] ; e != null ; e = e.next)

                 if (e.value == null)

                     return true;

         return false;

     }

This also traverses the whole hash table.
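From the caller's side, the null-value case matters because get cannot distinguish a key mapped to null from a missing key. A short usage sketch with java.util.HashMap; the class name ContainsDemo and the method sample are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of the JDK.
class ContainsDemo {
    // A map with one ordinary mapping and one key mapped to null.
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", null);
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = sample();
        System.out.println(map.containsValue(1));    // true, found by scanning
        System.out.println(map.containsValue(null)); // true, "b" maps to null
        System.out.println(map.get("b"));            // null, although "b" is present
        System.out.println(map.containsKey("b"));    // true
        System.out.println(map.containsKey("c"));    // false
    }
}
```

containsKey goes through the hash lookup and is typically constant time, while containsValue has to visit every bucket.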

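4. The Entry class

Each key-value pair is stored in an Entry. HashMap's storage structure is an Entry array whose indices are the bucket numbers. Each slot of the array corresponds to one bucket, and each bucket is a singly linked list of Entry objects. The code is as follows: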

  static class Entry<K,V> implements Map.Entry<K,V> {

         final K key;

         V value;

         Entry<K,V> next;

         final int hash;

         /**

          * Creates new entry.

          */

         Entry(int h, K k, V v, Entry<K,V> n) {

             value = v;

             next = n;

             key = k;

             hash = h;

         }

         public final K getKey() {

             return key;

         }

         public final V getValue() {

             return value;

         }

         public final V setValue(V newValue) {

             V oldValue = value;

             value = newValue;

             return oldValue;

         }

         public final boolean equals(Object o) {

             if (!(o instanceof Map.Entry))

                 return false;

             Map.Entry e = (Map.Entry)o;

             Object k1 = getKey();

             Object k2 = e.getKey();

             if (k1 == k2 || (k1 != null && k1.equals(k2))) {

                 Object v1 = getValue();

                 Object v2 = e.getValue();

                 if (v1 == v2 || (v1 != null && v1.equals(v2)))

                     return true;

             }

             return false;

         }

         public final int hashCode() {

             return (key==null   ? 0 : key.hashCode()) ^

                    (value==null ? 0 : value.hashCode());

         }

         public final String toString() {

             return getKey() + "=" + getValue();

         }

         /**

          * This method is invoked whenever the value in an entry is

          * overwritten by an invocation of put(k,v) for a key k that's already

          * in the HashMap.

          */

         void recordAccess(HashMap<K,V> m) {

         }

         /**

          * This method is invoked whenever the entry is

          * removed from the table.

          */

         void recordRemoval(HashMap<K,V> m) {

         }

     }

References

http://www.cnblogs.com/skywang12345/p/3310835.html

http://www.codeceo.com/article/java-hashmap-learn.html
