HashMap Source Code Analysis 1: Translation

Date: 2022-06-03 23:27:10
* Hash table based implementation of the <tt>Map</tt> interface.  This
* implementation provides all of the optional map operations, and permits
* <tt>null</tt> values and the <tt>null</tt> key. (The <tt>HashMap</tt>
* class is roughly equivalent to <tt>Hashtable</tt>, except that it is
* unsynchronized and permits nulls.) This class makes no guarantees as to
* the order of the map; in particular, it does not guarantee that the order
* will remain constant over time.

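HashMap implements the Map interface. It provides all of the optional Map operations and permits null values and the null key.
HashMap is roughly equivalent to Hashtable, except that HashMap is unsynchronized and permits nulls.
HashMap makes no guarantee about the order of its elements; in particular, it does not guarantee that the order stays the same over time.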
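
To make the null handling concrete, here is a minimal sketch comparing HashMap with Hashtable. The class name NullKeyDemo and the helper acceptsNullKey are made up for illustration; only the java.util classes are real.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    // Hypothetical helper: returns true if the map accepts a null key without throwing.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "value for the null key"); // null key is allowed
        hashMap.put("k", null);                      // null value is allowed

        System.out.println(hashMap.get(null));
        System.out.println(hashMap.containsKey("k"));

        // Hashtable rejects null keys (and null values) with a NullPointerException.
        System.out.println(acceptsNullKey(new Hashtable<>()));
    }
}
```

Because a stored value may itself be null, get returning null does not prove the key is absent; containsKey is the reliable check.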

* <p>This implementation provides constant-time performance for the basic
* operations (<tt>get</tt> and <tt>put</tt>), assuming the hash function
* disperses the elements properly among the buckets. Iteration over
* collection views requires time proportional to the "capacity" of the
* <tt>HashMap</tt> instance (the number of buckets) plus its size (the number
* of key-value mappings). Thus, it's very important not to set the initial
* capacity too high (or the load factor too low) if iteration performance is
* important.

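HashMap offers constant-time performance for the basic operations (get and put), assuming the hash function spreads the elements properly among the buckets (the slots of the internal array).
Iterating over its collection views takes time proportional to the capacity of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings).
So if iteration performance matters, it is important not to set the initial capacity too high (or the load factor too low).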

* <p>An instance of <tt>HashMap</tt> has two parameters that affect its
* performance: <i>initial capacity</i> and <i>load factor</i>. The
* <i>capacity</i> is the number of buckets in the hash table, and the initial
* capacity is simply the capacity at the time the hash table is created. The
* <i>load factor</i> is a measure of how full the hash table is allowed to
* get before its capacity is automatically increased. When the number of
* entries in the hash table exceeds the product of the load factor and the
* current capacity, the hash table is <i>rehashed</i> (that is, internal data
* structures are rebuilt) so that the hash table has approximately twice the
* number of buckets.

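A HashMap instance has two parameters that affect its performance: initial capacity and load factor.
The capacity is the number of buckets in the hash table (the length of the internal array), and the initial capacity is simply the capacity at the time the table is created. The load factor measures how full the table may get before its capacity is automatically increased: once the number of entries exceeds load factor × current capacity, the table is rehashed (its internal data structures are rebuilt) and ends up with roughly twice as many buckets.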
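
The resize rule is plain arithmetic, sketched below. This only reproduces the calculation described in the Javadoc; it does not inspect HashMap's internals. The class and method names are made up, and the starting capacity of 16 is an assumption taken from the JDK's default, which the excerpt above does not state.

```java
class ResizeThresholdDemo {
    // Number of entries the table may hold before it is rehashed:
    // load factor multiplied by the current capacity.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16;        // assumed default initial capacity
        float loadFactor = 0.75f; // default load factor named in the Javadoc

        // Each rehash roughly doubles the number of buckets.
        for (int i = 0; i < 3; i++) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity, loadFactor));
            capacity *= 2;
        }
        // Thresholds printed: 12, 24, 48
    }
}
```

With these numbers, the 13th entry is the first one that exceeds the threshold of a 16-bucket table.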

* <p>As a general rule, the default load factor (.75) offers a good
* tradeoff between time and space costs. Higher values decrease the
* space overhead but increase the lookup cost (reflected in most of
* the operations of the <tt>HashMap</tt> class, including
* <tt>get</tt> and <tt>put</tt>). The expected number of entries in
* the map and its load factor should be taken into account when
* setting its initial capacity, so as to minimize the number of
* rehash operations. If the initial capacity is greater than the
* maximum number of entries divided by the load factor, no rehash
* operations will ever occur.

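As a general rule, the default load factor (0.75) offers a good tradeoff between time and space costs. A higher load factor reduces the space overhead but increases the lookup cost (which shows up in most HashMap operations, including get and put).
When setting the initial capacity, take the expected number of entries and the load factor into account, so as to minimize the number of rehash operations.
If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash will ever occur.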
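
The sizing rule can be turned into a small helper. This is a sketch under the stated rule only (initial capacity strictly greater than entries divided by load factor); initialCapacityFor is a hypothetical name, not a JDK method.

```java
import java.util.HashMap;
import java.util.Map;

class InitialCapacityDemo {
    // Smallest integer strictly greater than expectedEntries / loadFactor.
    static int initialCapacityFor(int expectedEntries, float loadFactor) {
        return (int) (expectedEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        int expected = 1000;
        float loadFactor = 0.75f;
        int capacity = initialCapacityFor(expected, loadFactor);

        // Both values are passed to the two-argument HashMap constructor.
        Map<Integer, String> map = new HashMap<>(capacity, loadFactor);
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i);
        }

        System.out.println(capacity);   // 1334
        System.out.println(map.size()); // 1000
    }
}
```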

* <p>If many mappings are to be stored in a <tt>HashMap</tt>
* instance, creating it with a sufficiently large capacity will allow
* the mappings to be stored more efficiently than letting it perform
* automatic rehashing as needed to grow the table. Note that using
* many keys with the same {@code hashCode()} is a sure way to slow
* down performance of any hash table. To ameliorate impact, when keys
* are {@link Comparable}, this class may use comparison order among
* keys to help break ties.

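If many mappings are going to be stored in a HashMap instance, creating it with a sufficiently large initial capacity stores them more efficiently than letting the table grow through automatic rehashing.
Note that using many keys with the same hashCode() is a sure way to slow down any hash table. To soften the impact, when the keys are Comparable, this class may use the comparison order among the keys to help break ties.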
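
Below is a sketch of the worst case the Javadoc warns about: a key type whose hashCode() is the same for every instance. BadKey is a hypothetical class written only for this illustration. It implements Comparable, which is the condition under which HashMap may use comparison order to break ties. The example only shows that the map stays correct; it does not measure speed.

```java
import java.util.HashMap;
import java.util.Map;

class CollidingKeyDemo {
    // Hypothetical key: every instance reports the same hash code,
    // so all keys collide.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    // Builds a map with n colliding keys, each mapped to its own id.
    static Map<BadKey, Integer> build(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = build(10_000);
        System.out.println(map.size());
        System.out.println(map.get(new BadKey(1234)));
    }
}
```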

* <p><strong>Note that this implementation is not synchronized.</strong>
* If multiple threads access a hash map concurrently, and at least one of
* the threads modifies the map structurally, it <i>must</i> be
* synchronized externally. (A structural modification is any operation
* that adds or deletes one or more mappings; merely changing the value
* associated with a key that an instance already contains is not a
* structural modification.) This is typically accomplished by
* synchronizing on some object that naturally encapsulates the map.

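Note that this implementation (HashMap) is not synchronized. If multiple threads access a HashMap instance concurrently and at least one of them modifies the map structurally, the map must be synchronized externally.
(A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with an existing key is not a structural modification.)
This is typically done by synchronizing on some object that naturally encapsulates the map.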
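
One way to read "an object that naturally encapsulates the map" is a class that keeps the HashMap private and guards every access with its own lock. HitCounter below is a hypothetical example of that pattern, not something taken from the JDK.

```java
import java.util.HashMap;
import java.util.Map;

class HitCounter {
    // The map never escapes this object, so every access goes through
    // the synchronized methods below.
    private final Map<String, Integer> hits = new HashMap<>();

    // May add a new mapping (a structural modification), so it holds the lock.
    synchronized void record(String page) {
        hits.merge(page, 1, Integer::sum);
    }

    // Reads take the same lock so they never run during a modification.
    synchronized int count(String page) {
        return hits.getOrDefault(page, 0);
    }

    public static void main(String[] args) {
        HitCounter counter = new HitCounter();
        counter.record("/index");
        counter.record("/index");
        System.out.println(counter.count("/index")); // 2
    }
}
```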

* If no such object exists, the map should be "wrapped" using the
* {@link Collections#synchronizedMap Collections.synchronizedMap}
* method. This is best done at creation time, to prevent accidental
* unsynchronized access to the map:<pre>
* Map m = Collections.synchronizedMap(new HashMap(...));</pre>

If no such object exists, the map should be wrapped using the Collections.synchronizedMap method.
This is best done at creation time, to prevent accidental unsynchronized access to the map. (Example: Map m = Collections.synchronizedMap(new HashMap(...));)
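
A minimal sketch of the wrapper in use, assuming several threads that insert disjoint key ranges. The class and method names are made up. The map is wrapped at creation time, so no reference to the raw HashMap exists.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SynchronizedMapDemo {
    // Starts `threads` workers that each insert `perThread` distinct keys,
    // then returns the final size of the shared map.
    static int fillConcurrently(int threads, int perThread) {
        // Wrapped immediately: the unsynchronized HashMap is never reachable directly.
        Map<Integer, Integer> map = Collections.synchronizedMap(new HashMap<>());

        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    map.put(base + i, i);
                }
            });
            workers[t].start();
        }

        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 1000)); // 4000
    }
}
```

The wrapper makes each individual call thread-safe. According to the Collections.synchronizedMap documentation, iterating over any of the map's collection views still requires the caller to synchronize on the returned map manually.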

* <p>The iterators returned by all of this class's "collection view methods"
* are <i>fail-fast</i>: if the map is structurally modified at any time after
* the iterator is created, in any way except through the iterator's own
* <tt>remove</tt> method, the iterator will throw a
* {@link ConcurrentModificationException}. Thus, in the face of concurrent
* modification, the iterator fails quickly and cleanly, rather than risking
* arbitrary, non-deterministic behavior at an undetermined time in the
* future.

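The iterators returned by all of this class's collection view methods are fail-fast.
Fail-fast means: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator throws a ConcurrentModificationException.
So, when faced with concurrent modification, the iterator fails quickly and cleanly instead of risking arbitrary, non-deterministic behavior at some undetermined time in the future.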
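
The two cases can be shown in a single thread. In this sketch (all names are illustrative), removing through the map during a for-each loop triggers the exception, while removing through the iterator's own remove method is fine. The try/catch is there only to make the behavior visible; real code should not rely on catching this exception.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Structural modification through the map itself while iterating.
    static boolean removeViaMapThrows() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                map.remove(key); // bypasses the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Structural modification through the iterator's own remove method.
    static int removeOddValuesViaIterator() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 1) {
                it.remove(); // allowed: the iterator stays consistent
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(removeViaMapThrows());         // true
        System.out.println(removeOddValuesViaIterator()); // 1
    }
}
```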

* <p>Note that the fail-fast behavior of an iterator cannot be guaranteed
* as it is, generally speaking, impossible to make any hard guarantees in the
* presence of unsynchronized concurrent modification. Fail-fast iterators
* throw <tt>ConcurrentModificationException</tt> on a best-effort basis.
* Therefore, it would be wrong to write a program that depended on this
* exception for its correctness: <i>the fail-fast behavior of iterators
* should be used only to detect bugs.</i>

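Note that the fail-fast behavior of an iterator cannot be guaranteed: generally speaking, no hard guarantees can be made in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis.
So it would be wrong to write a program whose correctness depends on this exception.
The fail-fast behavior of iterators should be used only to detect bugs.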