一、哈希表

哈希表是一种可以快速定位得数据结构。哈希表可以做到平均查找、插入、删除时间是O(1)，当然这是指不发生Hash碰撞得情况。而哈希表最大得缺陷就是哈希值得碰撞(collision)。

Hash碰撞：就是指hash桶有多个元素了。常见解决哈希碰撞得方法就是在hash桶后面加个链表

这里就引入第一个问题：为什么Map的底层设计要采用哈希表的这种数据结构？

HashMap设计时，要求其key不能重复。所以每次往HashMap设置值时，需要对HashMap现在容器所有key进行筛选，以保证不会对HashMap设置同样得key。

如果以普通if、else来判断得话，这样查找效率会非常得低。如果里面元素个数达到几百得话，程序得运行效率就会非常低。而这种情况，很符合哈希表得解决场景。

哈希表得一种实现方式就是数组+链表。这也是java7HashMap底层哈希表的实现方式。

二、`JDK1.7`的`HashMap`

1、相关概念

（1）初始容量

/**

 * The default initial capacity - MUST be a power of two.

 */

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

（2）负载因素

/**

 * The load factor used when none specified in constructor.

 */

static final float DEFAULT_LOAD_FACTOR = 0.75f;

（3）阙值

/**

 * The next size value at which to resize (capacity * load factor).

 * @serial

 */

// If table == EMPTY_TABLE then this is the initial capacity at which the

// table will be created when inflated.

int threshold;

（4）hash种子

/**

 * A randomizing value associated with this instance that is applied to

 * hash code of keys to make hash collisions harder to find. If 0 then

 * alternative hashing is disabled.

 */

// 标志位，是jdk为了解决hashMap安全隐患做的补丁

transient int hashSeed = 0;

2、源码分析

（1）put

public V put(K key, V value) {

	/* hashMap默认是延迟初始化，只有在用的时候，才会开辟空间 */

	// 如果hash表为空，则开始扩容hash表

	if (table == EMPTY_TABLE) {

		inflateTable(threshold);

	}

	// 对key为null的相关操作

	if (key == null)

		return putForNullKey(value);

	// 计算key的hash值

	int hash = hash(key);

	// 计算key的hash值对应哈希桶的下标

	int i = indexFor(hash, table.length);

	// 对哈希桶内的链表进行更新操作：可以看到如果是第一次插入，返回的就是null

	for (Entry<K,V> e = table[i]; e != null; e = e.next) {

		Object k;

		if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {

			V oldValue = e.value;

			e.value = value;

			e.recordAccess(this);

			return oldValue;

		}

	}

}

（2）inflateTable：扩充哈希表

private void inflateTable(int toSize) {

	// Find a power of 2 >= toSize

    /* 这里会发现即使你输入hashMap容量不是2的n次方，这里也会强制变成2的n次方 */

    // 计算哈希表桶的个数

	int capacity = roundUpToPowerOf2(toSize);

	threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);

	table = new Entry[capacity];

    // 按需初始化hash种子，也是解决hashMap安全隐患做的补丁

	initHashSeedAsNeeded(capacity);

}

（3）indexFor：计算key在哈希桶的下标

static int indexFor(int h, int length) {

	// assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";

	return h & (length-1);

}

这个方法就是为什么hashMap的初始值要设置2的n次方的原因。它可以快速得到key要放在hash桶的下标。

（4）扩容方法

当hashMap的size大于threshold时，就会扩容。而threshold = 容量 * 阈值

低效
线程不安全

void resize(int newCapacity) {

	Entry[] oldTable = table;

	int oldCapacity = oldTable.length;

	if (oldCapacity == MAXIMUM_CAPACITY) {

		threshold = Integer.MAX_VALUE;

		return;

	}

	Entry[] newTable = new Entry[newCapacity];

    // 将旧哈希表的元素转移到新的哈希表；这里是死锁的原因

	transfer(newTable, initHashSeedAsNeeded(newCapacity));

	table = newTable;

	threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);

}

3、`jdk1.7`的`HashMap`中的缺点

（1）死锁问题

死锁问题是因为用户将HashMap应用到多线程的环境下。详情参考：https://coolshell.cn/articles/9606.html

死锁问题是由于HashMap在扩容时，里面元素的顺序可能没有按照之前顺序排列，从而导致环形链表，造成死锁情况。

（2）潜在的安全隐患

tomcat底层接收参数是使用hashTable接收，假设黑客使用特意组合的参数，使用所有的参数都挂载在同一个哈希桶中。这就会使得哈希表退化成链表。这其实就是由于哈希碰撞的导致。

如果黑客使用精心设计的参数（这些参数可以根据jdk公开的哈希算法测试出来），通过HTTP请求，使用tomcat内部产生大量链表，消耗服务器的CPU，就会产生DoS（Denial of Service）。

因此：

Tomcat也对这种情况进行讨论。tomcat邮件

jdk1.7也对这种情况做了处理。例如：在initHashSeedAsNeeded方法中，不使用jdk中String公开的哈希算法，而是使用其他的哈希算法来避免上述出现的问题。

final boolean initHashSeedAsNeeded(int capacity) {

	// 判断是否需要使用hash算法

	boolean currentAltHashing = hashSeed != 0;

	boolean useAltHashing = sun.misc.VM.isBooted() &&

		(capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);

	boolean switching = currentAltHashing ^ useAltHashing;

	if (switching) {

		// 初始化hash种子

		hashSeed = useAltHashing

			? sun.misc.Hashing.randomHashSeed(this)

			: 0;

	}

	return switching;

}

三、`JDK1.8`的`HashMap`源码

JDK1.8中对HashMap进行了很大优化：

使用数组+链表/红黑树的数据结构，解决了底层哈希表的安全隐患；
在扩容时，通过对插入顺序的改进优化了死锁问题；

1、新增概念

（1）树化的阙值

    /**

     * The bin count threshold for using a tree rather than list for a

     * bin.  Bins are converted to trees when adding an element to a

     * bin with at least this many nodes. The value must be greater

     * than 2 and should be at least 8 to mesh with assumptions in

     * tree removal about conversion back to plain bins upon

     * shrinkage.

     */

    static final int TREEIFY_THRESHOLD = 8;

（2）退树化的阈值

    /**

     * The bin count threshold for untreeifying a (split) bin during a

     * resize operation. Should be less than TREEIFY_THRESHOLD, and at

     * most 6 to mesh with shrinkage detection under removal.

     */

    static final int UNTREEIFY_THRESHOLD = 6;

2、源码阅读

（1）putVal()

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {

        Node<K,V>[] tab; Node<K,V> p; int n, i;

        // 判断哈希表是否为空，为空就创建

        if ((tab = table) == null || (n = tab.length) == 0)

            n = (tab = resize()).length;

        // 如果是第一个节点，就往这个哈希桶添加一个元素

        if ((p = tab[i = (n - 1) & hash]) == null)

            tab[i] = newNode(hash, key, value, null);

        else {

            // 如果不是第一个节点

            Node<K,V> e; K k;

            //

            if (p.hash == hash &&

                ((k = p.key) == key || (key != null && key.equals(k))))

                e = p;

            // 判断节点是不是树节点

            else if (p instanceof TreeNode)

                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);

            else {

                // 该节点就是链表

                for (int binCount = 0; ; ++binCount) {

                    if ((e = p.next) == null) {

                        p.next = newNode(hash, key, value, null);

                        // 如果达到可以红黑树化的量，就将链表转换为树

                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st

                            treeifyBin(tab, hash);

                        break;

                    }

                    if (e.hash == hash &&

                        ((k = e.key) == key || (key != null && key.equals(k))))

                        break;

                    p = e;

                }

            }

            if (e != null) { // existing mapping for key

                V oldValue = e.value;

                if (!onlyIfAbsent || oldValue == null)

                    e.value = value;

                afterNodeAccess(e);

                return oldValue;

            }

        }

        ++modCount;

        if (++size > threshold)

            resize();

        afterNodeInsertion(evict);

        return null;

    }

（2）resize：扩容方法

     final Node<K,V>[] resize() {

        Node<K,V>[] oldTab = table;

        int oldCap = (oldTab == null) ? 0 : oldTab.length;

        int oldThr = threshold;

        int newCap, newThr = 0;

        if (oldCap > 0) {

            if (oldCap >= MAXIMUM_CAPACITY) {

                threshold = Integer.MAX_VALUE;

                return oldTab;

            }

            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&

                     oldCap >= DEFAULT_INITIAL_CAPACITY)

                newThr = oldThr << 1; // double threshold

        }

        else if (oldThr > 0) // initial capacity was placed in threshold

            newCap = oldThr;

        else {               // zero initial threshold signifies using defaults

            newCap = DEFAULT_INITIAL_CAPACITY;

            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);

        }

        if (newThr == 0) {

            float ft = (float)newCap * loadFactor;

            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?

                      (int)ft : Integer.MAX_VALUE);

        }

        threshold = newThr;

        @SuppressWarnings({"rawtypes","unchecked"})

        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];

        table = newTab;

        if (oldTab != null) {

            for (int j = 0; j < oldCap; ++j) {

                Node<K,V> e;

                if ((e = oldTab[j]) != null) {

                    oldTab[j] = null;

                    if (e.next == null)

                        newTab[e.hash & (newCap - 1)] = e;

                    else if (e instanceof TreeNode)

                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);

                    else {

                        // preserve order：保持顺序，这里仅仅是降低死锁的概率，并没有完全解决死锁问题

                        // 低位链表

                        Node<K,V> loHead = null, loTail = null;

                        // 高位链表

                        Node<K,V> hiHead = null, hiTail = null;

                        Node<K,V> next;

                        do {

                            next = e.next;

                            if ((e.hash & oldCap) == 0) {

                                if (loTail == null)

                                    loHead = e;

                                else

                                    loTail.next = e;

                                loTail = e;

                            }

                            else {

                                if (hiTail == null)

                                    hiHead = e;

                                else

                                    hiTail.next = e;

                                hiTail = e;

                            }

                        } while ((e = next) != null);

                        if (loTail != null) {

                            loTail.next = null;

                            newTab[j] = loHead;

                        }

                        if (hiTail != null) {

                            hiTail.next = null;

                            newTab[j + oldCap] = hiHead;

                        }

                    }

                }

            }

        }

        return newTab;

    }

四、相关面试题分析

HashMap默认容量
HashMap如何扩容
HashMap为什么是线程不安全的？

为什么`HashMap`的容量必须是`2`的`n`次方？

1、即使你在构造函数时，不传2的n次方，在后来的初始化方法中，也会强制变成2的n次方

1.7:put => inflateTable

1.8:HashMap => tableSizeFor

2、让元素能够快速定位哈希桶；让Hash表元素分布均匀，减少哈希碰撞的几率

（1）快速定位哈希桶

在哈希表实现中，有一个很重要的问题：如何将hash值映射到几个哈希桶里呢？对应java来说，就是int范围的数如何映射到哈希桶里呢？

很常见的解决方案是：取模，即int % n。但是这种方案有俩个缺点：

负数取模仍然是负数，这里要确保hash值一定是正数；
相比较HashMap的实现，这种方案性能稍微差一点；

1.7:indexFor(int h, int length){

    return h & (length - 1);// 记住这行代码，java8是直接用这段代码的

}

（2）为什么能让Hash表元素分布均匀，减少哈希碰撞的几率？

leng - 1 等价于2n - 1，其值格式是：11111...，这样它与hash计算时，得到哈希桶下标，完全取决于hash，只要确保hash算法比较均衡，那么哈希表元素就会比较均衡了。

3、方便扩容

在扩容之后，所有的元素需要重新计算其所在哈希桶的位置。只要保证哈希桶容量是2的幂，其原先的元素要么保持不变，要么被移动到旧index+旧桶大小的位置上。

`indexFor`方法除了`HashMap`的实现方式，还有其他方式吗？跟`HashMap`的实现方式相比，有哪些不足？

我们知道java中对哈希值的实现是根据每个对象的hashCode方法，而这个方法会可能返回42亿个可能性。而我们可以通过indexFor方法将这些42亿个可能性引入到16个哈希桶中。

除了HashMap内部的实现方式，还可以使用取模方法来达到效果。但要注意取模中负号的可能性！！！

    static int indexFor(int hash) {

        if (hash < 0) {

            hash = -hash;

        }

        return hash % 16;

    }

这种方式想比较HashMap原生的实现方式比较，就是低效。不仅仅是在这里计算时低效，而且还有可能会造成严重的哈希碰撞。

`jdk1.8`后，为什么要以8作为转化树的阙值？为什么要选择0.75作为负载因子？

算法，在空间和时间的折中问题

五、相关应用

我在项目遇到要依次匹配method然后引入对应的处理类。相关代码如下：

public class AppRestFactory {

    private static final String METHOD_WO_ = "";//方法名

    /**

     * 工作中心人员的查询

     */

    private static final String METHOD_WORKCENTER_CONTACT = "getWorkCenterInfoByContact";

    /**

     * 工作中心人员的查询

     */

    private static final String METHOD_TODOINFOLIST = "getTodoInfoList";

    /**

     * 获取逻辑处理实例

     *

     * @param method

     * @return

     */

    public static AppRestService createService(String method) {

        if (METHOD_POSITION.equalsIgnoreCase(method)){

            return SpringUtils.getBean(LocationServiceImpl.class);

        }

        if (METHOD_GETWOINFOLIST.equalsIgnoreCase(method)){

            return SpringUtils.getBean(WorkOrderServiceImpl.class);

        }

        return null;

    }

}

这里就可能存在if、else查找效率问题。假设最后一个if才是要找method的话，那么程序的运行效率就会非常低。这还是未完成的情况，假设后期这个method被补充到上百个，那效率就会及其低下。这里其实就可以使用HashMap底层的哈希表的数据结构来优化。代码如下：

public class AppRestFactory {

    /**

     * 存储APP端方法method得映射关系

     * todo:API数量确定后，固定HashMap得长度，防止hashMao扩容时，发生错误

     */

    private static final Map<String, Class> mapper = new HashMap<>(100);

    // 初始化method与其工作类得映射关系

    static {

        // 添加退回物料的信息

        mapper.put("addMaterialsReturnInfo", WoMaterialServiceImpl.class);

        //

        mapper.put("getBomAosProductsList", BomServiceImpl.class);

    }

    /**

     * 获取逻辑处理实例

     *

     * @param method

     * @return

     */

    public static AppRestService createService(String method) {

        return (AppRestService) SpringUtils.getBean(mapper.get(method));

    }

}

原博客地址

秒客网

HashMap源码与相关面试题

一、哈希表

二、`JDK1.7`的`HashMap`

1、相关概念

2、源码分析

3、`jdk1.7`的`HashMap`中的缺点

三、`JDK1.8`的`HashMap`源码

1、新增概念

2、源码阅读

四、相关面试题分析

为什么`HashMap`的容量必须是`2`的`n`次方？

`indexFor`方法除了`HashMap`的实现方式，还有其他方式吗？跟`HashMap`的实现方式相比，有哪些不足？

`jdk1.8`后，为什么要以8作为转化树的阙值？为什么要选择0.75作为负载因子？

五、相关应用

相关文章

HashMap源码与相关面试题

一、哈希表

二、JDK1.7的HashMap

1、相关概念

2、源码分析

3、jdk1.7的HashMap中的缺点

三、JDK1.8的HashMap源码

1、新增概念

2、源码阅读

四、相关面试题分析

为什么HashMap的容量必须是2的n次方？

indexFor方法除了HashMap的实现方式，还有其他方式吗？跟HashMap的实现方式相比，有哪些不足？

jdk1.8后，为什么要以8作为转化树的阙值？为什么要选择0.75作为负载因子？

五、相关应用

相关文章

二、`JDK1.7`的`HashMap`

3、`jdk1.7`的`HashMap`中的缺点

三、`JDK1.8`的`HashMap`源码

为什么`HashMap`的容量必须是`2`的`n`次方？

`indexFor`方法除了`HashMap`的实现方式，还有其他方式吗？跟`HashMap`的实现方式相比，有哪些不足？

`jdk1.8`后，为什么要以8作为转化树的阙值？为什么要选择0.75作为负载因子？