Preface
A look at the HashMap source code
Class comment
Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time.
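HashMap is a hash-table-based implementation of the Map interface. It supports every optional Map operation and accepts null values as well as a null key. Apart from being unsynchronized and accepting nulls, it behaves much like Hashtable. It makes no promise about the order of its mappings, and whatever order it happens to produce may change over time.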
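To make the difference from Hashtable concrete, here is a minimal sketch. The class name NullKeyDemo and the helper acceptsNullKey are made up for this illustration; only HashMap, Hashtable and Map.put come from the JDK.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullKeyDemo {
    // Hypothetical helper: reports whether the given map accepts a null key.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<String, String>()));   // true
        System.out.println(acceptsNullKey(new Hashtable<String, String>())); // false
    }
}
```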
This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it's very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
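The basic operations, get and put, run in constant time, provided the hash function spreads the keys evenly across the buckets. Iterating over any collection view takes time proportional to the capacity of the instance (the number of buckets) plus its size (the number of key-value mappings). So when iteration speed matters, avoid an initial capacity that is too high or a load factor that is too low.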
An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.
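Two parameters govern the performance of a HashMap instance: the initial capacity and the load factor. The capacity is the number of buckets in the table, and the initial capacity is the capacity at creation time. The load factor says how full the table may get before it grows automatically. Once the number of entries exceeds the load factor multiplied by the current capacity, the table is rehashed (its internal data structures are rebuilt) and ends up with roughly twice as many buckets.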
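The growth rule is plain arithmetic, which the following sketch works through. ResizeThresholdDemo and threshold are hypothetical names, and the method only restates the rule from the class comment; it does not read HashMap's internal state. The starting capacity of 16 is the default documented for the no-argument constructor.

```java
public class ResizeThresholdDemo {
    // Hypothetical helper: the entry count a table of this capacity can hold
    // before the documented rule (entries > loadFactor * capacity) triggers growth.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = 16; // documented default initial capacity
        for (int i = 0; i < 3; i++) {
            System.out.println(capacity + " buckets: grows once entries exceed "
                    + threshold(capacity, 0.75f));
            capacity *= 2; // each rehash roughly doubles the bucket count
        }
        // 16 buckets: grows once entries exceed 12
        // 32 buckets: grows once entries exceed 24
        // 64 buckets: grows once entries exceed 48
    }
}
```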
As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
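As a rule of thumb, the default load factor of 0.75 is a reasonable compromise between time and space. A higher load factor wastes less space but makes lookups slower, which shows up in most HashMap operations, get and put included. Choose the initial capacity with the expected number of entries and the load factor in mind, so that rehashing happens as rarely as possible. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash ever happens.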
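The last sentence above translates directly into a sizing formula. The sketch below is one way to apply it; PresizeDemo and capacityFor are hypothetical names, and adding 1 after truncation is simply a way to make the result strictly greater than entries / loadFactor.

```java
import java.util.HashMap;
import java.util.Map;

public class PresizeDemo {
    // Hypothetical helper: smallest int strictly greater than
    // expectedEntries / loadFactor, per the rule in the class comment.
    static int capacityFor(int expectedEntries, float loadFactor) {
        return (int) (expectedEntries / loadFactor) + 1;
    }

    public static void main(String[] args) {
        int expected = 1000;
        float loadFactor = 0.75f;
        int initialCapacity = capacityFor(expected, loadFactor);
        System.out.println(initialCapacity); // 1334

        // HashMap(int initialCapacity, float loadFactor) is a real JDK constructor.
        Map<Integer, String> map = new HashMap<>(initialCapacity, loadFactor);
        for (int i = 0; i < expected; i++) {
            map.put(i, "value-" + i);
        }
        System.out.println(map.size()); // 1000
    }
}
```

The implementation is free to pick a larger internal table than the value requested, so treat the computed number as a lower bound rather than the exact bucket count.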
If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table. Note that using many keys with the same hashCode() is a sure way to slow down performance of any hash table. To ameliorate impact, when keys are Comparable, this class may use comparison order among keys to help break ties.
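If a HashMap is going to hold many mappings, creating it with a large enough capacity up front is more efficient than letting it rehash repeatedly as it grows. Note that using many keys with the same hashCode() is a sure way to slow down any hash table. To soften the impact, when the keys implement Comparable, this class may use their comparison order to tell apart keys whose hash codes collide.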
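A deliberately bad key type shows what the comment is warning about. BadKey and CollidingKeysDemo are hypothetical names; the sketch only demonstrates that the map stays correct when every key collides, and that such a key can implement Comparable so the class has an ordering to fall back on. It does not measure the slowdown.

```java
import java.util.HashMap;
import java.util.Map;

public class CollidingKeysDemo {
    // Hypothetical key type: every instance reports the same hash code.
    static final class BadKey implements Comparable<BadKey> {
        final int id;

        BadKey(int id) {
            this.id = id;
        }

        @Override
        public int hashCode() {
            return 42; // all keys land in the same bucket
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }

        @Override
        public int compareTo(BadKey other) {
            return Integer.compare(id, other.id); // ordering available for tie-breaking
        }
    }

    static Map<BadKey, Integer> build(int n) {
        Map<BadKey, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = build(1000);
        System.out.println(map.size());               // 1000
        System.out.println(map.get(new BadKey(500))); // 500
    }
}
```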
Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the map.
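Note that this implementation is not synchronized. If several threads access a map concurrently and at least one of them modifies it structurally, the access must be synchronized from outside the map. A structural modification is any operation that adds or removes one or more mappings; replacing the value of a key that is already present is not one. The usual approach is to synchronize on some object that naturally encapsulates the map.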
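One reading of "an object that naturally encapsulates the map" is a class that owns the map privately and guards every access with its own lock. The sketch below follows that reading; HitCounter and its methods are hypothetical names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class HitCounter {
    // The map never escapes this object, so locking on "this" covers every access.
    private final Map<String, Integer> hits = new HashMap<>();

    public synchronized void record(String page) {
        hits.merge(page, 1, Integer::sum); // adds a mapping the first time: structural
    }

    public synchronized int count(String page) {
        return hits.getOrDefault(page, 0);
    }

    // Two threads record 10,000 hits each on the same counter.
    static int runDemo() {
        HitCounter counter = new HitCounter();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                counter.record("home");
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.count("home");
    }

    public static void main(String[] args) {
        System.out.println(runDemo()); // 20000
    }
}
```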
If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap method. This is best done at creation time, to prevent accidental unsynchronized access to the map:
If no such object exists, wrap the map with the Collections.synchronizedMap method. Do this when the map is created, so that nothing can reach the unwrapped map by accident:
Map m = Collections.synchronizedMap(new HashMap(...));
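A slightly fuller sketch of the same idea follows. SynchronizedMapDemo, newSharedMap and sumValues are hypothetical names. The explicit synchronized block around the loop reflects the documentation of Collections.synchronizedMap, which requires callers to lock on the returned map while traversing any of its collection views.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedMapDemo {
    // Wrap at creation time so no reference to the raw HashMap is kept anywhere.
    static Map<String, Integer> newSharedMap() {
        return Collections.synchronizedMap(new HashMap<String, Integer>());
    }

    // Single calls such as put and get are synchronized by the wrapper,
    // but iteration spans many calls and needs an explicit lock on the wrapper.
    static int sumValues(Map<String, Integer> shared) {
        int sum = 0;
        synchronized (shared) {
            for (int value : shared.values()) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> shared = newSharedMap();
        shared.put("a", 1);
        shared.put("b", 2);
        System.out.println(sumValues(shared)); // 3
    }
}
```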
The iterators returned by all of this class's "collection view methods" are fail-fast: if the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator will throw a ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.
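The iterators returned by all of this class's collection view methods are fail-fast: if the map is structurally modified after an iterator is created, by any means other than the iterator's own remove method, the iterator throws ConcurrentModificationException. Faced with concurrent modification, the iterator fails quickly and cleanly instead of risking arbitrary, non-deterministic behavior at some unknown point in the future.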
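Fail-fast behavior is easy to trigger from a single thread, which makes for a small demonstration. FailFastDemo and its methods are hypothetical names. The first method removes through the map while a for-each loop is iterating and reports whether the exception was seen; the second removes through the iterator itself, which is the one permitted route.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Removes through the map while iterating its key set.
    static boolean removeDuringForEach() {
        Map<String, Integer> map = sample();
        try {
            for (String key : map.keySet()) {
                map.remove(key); // structural change behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes odd values through the iterator's own remove method.
    static int removeViaIterator() {
        Map<String, Integer> map = sample();
        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            if (it.next().getValue() % 2 == 1) {
                it.remove();
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEach()); // true
        System.out.println(removeViaIterator());   // 1
    }
}
```

The try/catch in removeDuringForEach exists only to make the exception visible in a demo; it is not a pattern for production code.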
Note that the fail-fast behavior of an iterator cannot be guaranteed as it is, generally speaking, impossible to make any hard guarantees in the presence of unsynchronized concurrent modification. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
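Note that the fail-fast behavior cannot be guaranteed: in general, nothing can be promised firmly when modifications are concurrent and unsynchronized. Fail-fast iterators throw ConcurrentModificationException on a best-effort basis only. A program whose correctness depends on that exception is therefore wrong; the fail-fast behavior of iterators should be used only to detect bugs.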