LinkedHashMap、LinkedHashSet源码学习笔记

一、LinkedHashMap

概述

理解LinkedHashMap的关键点在于清楚HashMap的迭代并没有顺序，LinkedHashMap利用了一个额外的双向链表来维护一个特定的迭代顺序，而存储键值和HashMap一样用的散列表作为数据结构。

HashMap的遍历得到的key并不是有特定顺序的。稍微回顾一下keySet方法如何获得key的集合：

    public Set<K> keySet() {
        Set<K> ks;
return (ks = keySet) == null ? (keySet = new KeySet()) : ks;
    }

返回一个KeySet类，该类继承AbstractSet。

    final class KeySet extends AbstractSet<K> {
public final int size()                 { return size; }
public final void clear()               { HashMap.this.clear(); }
public final Iterator<K> iterator()     { return new KeyIterator(); }
public final boolean contains(Object o) { return containsKey(o); }
public final boolean remove(Object key) {
return removeNode(hash(key), key, null, false, true) != null;
        }
//以下省略
    }

继续向上找发现该类最终实现了Iterable接口。iterator方法返回了一个KeyIterator迭代器。找到这个迭代器。

    final class KeyIterator extends HashIterator
implements Iterator<K> {
public final K next() { return nextNode().key; }
    }

这个迭代器的hasNext方法是继承父类的，next方法也使用了父类的方法，查看迭代器父类HashIterator。其构造器中先找到了数组中第一条不为空的拉链。nextNode方法是按数组下标从小到大的顺序访问每一条拉链，并且在每一条拉链中按顺序遍历每一个节点。

    abstract class HashIterator {
        Node<K,V> next;        // next entry to return
        Node<K,V> current;     // current entry
int expectedModCount;  // for fast-fail
int index;             // current slot

        HashIterator() {
            expectedModCount = modCount;
            Node<K,V>[] t = table;
            current = next = null;
index = 0;
//找到第一条不为空的链表，并作标记
if (t != null && size > 0) { // advance to first entry
                do {} while (index < t.length && (next = t[index++]) == null);
            }
        }

public final boolean hasNext() {
return next != null;
        }

final Node<K,V> nextNode() {
            Node<K,V>[] t;
            Node<K,V> e = next;
if (modCount != expectedModCount)
throw new ConcurrentModificationException();
if (e == null)
throw new NoSuchElementException();
//每一次顺序返回每一条链表上的节点
if ((next = (current = e).next) == null && (t = table) != null) {
                do {} while (index < t.length && (next = t[index++]) == null);
            }
return e;
        }
    }

所以HashMap的遍历和put顺序无关，只和最后实际存放位置有关。我们既要保证查找和常数时间成正比（用散列表存储），又要保证迭代顺序和插入顺序或访问顺序相关，看来我们要维护一个额外的数据结构来保证迭代顺序。这就是LinkedHashMap和HashMap的不同之处——额外维护一个双向链表实现插入顺序或访问顺序的迭代。默认是按插入顺序排序。

继承的类和接口

public class LinkedHashMap<K,V>
extends HashMap<K,V>
implements Map<K,V>

数据结构

Entry继承了HashMap的Node，添加了before、after两个Entry类型的引用变量。并且维护head、tail两个Entry类型的成员作为双向链表的头节点和尾节点。

    static class Entry<K,V> extends HashMap.Node<K,V> {
        Entry<K,V> before, after;
        Entry(int hash, K key, V value, Node<K,V> next) {
super(hash, key, value, next);
        }
    }
/**
 * The head (eldest) of the doubly linked list.
 */
    transient LinkedHashMap.Entry<K,V> head;

/**
 * The tail (youngest) of the doubly linked list.
 */
    transient LinkedHashMap.Entry<K,V> tail;

构造函数

看两个典型的。initialCapacity和loadFactor都和HashMap的含义相同。还有一个构造器可以设置accessOrder，如果为true可以按照访问顺序迭代，默认为false。

    public LinkedHashMap(int initialCapacity, float loadFactor) {
super(initialCapacity, loadFactor);
        accessOrder = false;
    }
public LinkedHashMap(int initialCapacity,
float loadFactor,
boolean accessOrder) {
super(initialCapacity, loadFactor);
this.accessOrder = accessOrder;
    }

存储

LinkedHashMap并没有重写父类的put方法，但是重写了newNode方法，查看HashMap的putVal方法(put方法中)，当在key所属拉链中不包含该需要放入的key时，p.next = newNode(hash, key, value, null);这个语句新建一个节点放在拉链的末尾。由于多态查找过程，最终将调用LinkedHashMap的newNode方法，该方法如下：

    Node<K,V> newNode(int hash, K key, V value, Node<K,V> e) {
        LinkedHashMap.Entry<K,V> p =
new LinkedHashMap.Entry<K,V>(hash, key, value, e);
        linkNodeLast(p);
return p;
    }

除了新建一个节点，还调用了linkNodelast方法，这个方法操作的是双向链表，也就是将需要插入的key更新在了双向链表的末尾。

    // link at the end of list
private void linkNodeLast(LinkedHashMap.Entry<K,V> p) {
        LinkedHashMap.Entry<K,V> last = tail;
        tail = p;
if (last == null)
            head = p;
else {
            p.before = last;
            last.after = p;
        }
    }

如果当需要插入的key存在的时候（更新伴随着访问），putVal中不仅更新了value而且还调用了afterNodeAccess方法，这个方法在HashMap中是空方法，在LinkedHashMap中重写了这个方法，如果在构造器中设置了accessOder为true，则最新访问的节点被取出，并插入末尾。

    void afterNodeAccess(Node<K,V> e) { // move node to last
        LinkedHashMap.Entry<K,V> last;
//如果是按访问顺序，保证最近访问的节点放在双向链表最后（如果key存在，put也可能访问节点，所以在put中可能调用）
if (accessOrder && (last = tail) != e) {
            LinkedHashMap.Entry<K,V> p =
                (LinkedHashMap.Entry<K,V>)e, b = p.before, a = p.after;
            p.after = null;
if (b == null)
                head = a;
else
                b.after = a;
if (a != null)
                a.before = b;
else
                last = b;
if (last == null)
                head = p;
else {
                p.before = last;
                last.after = p;
            }
            tail = p;
            ++modCount;
        }
    }

在putVal方法的最后调用了afterNodeInsertion方法，这个方法也是在LinkedHashMap中重写的，在这个方法中检查是否需要删除最老元素，如果满足规则，则删除最老的元素，也就是开头的元素。

    void afterNodeInsertion(boolean evict) { // possibly remove eldest
        LinkedHashMap.Entry<K,V> first;
if (evict && (first = head) != null && removeEldestEntry(first)) {
            K key = first.key;
            removeNode(hash(key), key, null, false, true);
        }
    }

if判断中的removeEldestEntry方法总是返回false，也就是说默认LinkedHashMap是不会删除最老节点的。用户可以根据自己的需求继承LinkedHashMap并重写这个方法，我们可以看到LRU算法就和LinkedHashMap的这个方法有关。

    protected boolean removeEldestEntry(Map.Entry<K,V> eldest) {
return false;
    }

访问

调用HashMap的getNode方法查找节点，如果key存在而且设置了按访问顺序迭代，调用afterNodeAccess方法更新双向链表。

    public V get(Object key) {
        Node<K,V> e;
if ((e = getNode(hash(key), key)) == null)
return null;
if (accessOrder)
            afterNodeAccess(e);
return e.value;
    }

测试

结合测试用例来看体会双向链表的存储过程的源码会更加形象。将accessOrder设为false，有进行访问节点的put操作和get操作都没有改变迭代顺序，迭代的顺序还是插入顺序。

        LinkedHashMap<String, String> map = 
                new LinkedHashMap<String, String>();
        map.put("1", "1");
        map.put("2", "2");
        map.put("3", "3");
        map.put("4", "4");

        map.put("1", "11");
        map.put("5", "5");
        map.get("2");

        Iterator iterator = map.entrySet().iterator();
        while(iterator.hasNext()){
            Map.Entry entry = (Map.Entry)iterator.next();
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }

结果是：

1=11
2=2
3=3
4=4
5=5

将accessOrder设为true，有进行访问节点的put操作和get操作改变了迭代顺序，访问到的节点先从原双向链表中删除，然后添加到了双向链表的末尾。

        LinkedHashMap<String, String> map = 
new LinkedHashMap<String, String>(16, (float)0.75, true);
map.put("1", "1");
map.put("2", "2");
map.put("3", "3");
map.put("4", "4");

map.put("1", "11");
map.put("5", "5");
map.get("2");

        Iterator iterator = map.entrySet().iterator();
while(iterator.hasNext()){
Map.Entry entry = (Map.Entry)iterator.next();
            System.out.println(entry.getKey() + "=" + entry.getValue());

结果是：

3=3
4=4
1=11
5=5
2=2

二、LinkedHashSet

概述

理解LinkedHashSet的关键点在于其底部是LinkedHashSet，除了和HashSet一样保存不重复键之外，还可以按插入顺序迭代。

构造器

    public LinkedHashSet(int initialCapacity, float loadFactor) {
super(initialCapacity, loadFactor, true);
    }

/**
 * Constructs a new, empty linked hash set with the specified initial
 * capacity and the default load factor (0.75).
 *
 * @param initialCapacity the initial capacity of the LinkedHashSet
 * @throws IllegalArgumentException if the initial capacity is less
 * than zero
 */
public LinkedHashSet(int initialCapacity) {
super(initialCapacity, .75f, true);
    }

/**
 * Constructs a new, empty linked hash set with the default initial
 * capacity (16) and load factor (0.75).
 */
public LinkedHashSet() {
super(16, .75f, true);
    }

/**
 * Constructs a new linked hash set with the same elements as the
 * specified collection. The linked hash set is created with an initial
 * capacity sufficient to hold the elements in the specified collection
 * and the default load factor (0.75).
 *
 * @param c the collection whose elements are to be placed into
 * this set
 * @throws NullPointerException if the specified collection is null
 */
public LinkedHashSet(Collection<? extends E> c) {
super(Math.max(2*c.size(), 11), .75f, true);
        addAll(c);
    }

这几乎是LinkedHashSet的全部代码，所有的构造器都会调用到父类HashSet的一个构造器，该构造器创建了一个以插入顺序为迭代顺序的LinkedHashMap，也就是说LinkedHashSet的底层是LinkedHashMap。

    HashSet(int initialCapacity, float loadFactor, boolean dummy) {
        map = new LinkedHashMap<>(initialCapacity, loadFactor);
    }

存储

和HashSet基于HashMap一样，LinkedHashSet的add方法也基于LinkedHashMap的put方法，只不过LinkedHashSet可以按照键的添加顺序迭代，而HashSet不行。

    public boolean add(E e) {
return map.put(e, PRESENT)==null;
    }

三、总结

LinkedHashMap在HashMap的基础上增加了一个双向链表用来维护键的插入顺序或访问顺序。

LinkedHashSet基于LinkedHashMap，所以可以（只能）按照插入顺序迭代。

秒客网