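HashMap is something I use all the time, and it comes up in interviews just as often: hash collisions, resizing, and the data structures behind the table. I have also run into problems of my own, most notably the odd behavior that shows up when several threads work on the same HashMap. It left me puzzled at times, even though I understood part of it. I have read the source a few times before but forgot most of it within a few days, so I am writing it down here to make it easy to come back to.
HashMap data structure: array + linked list + red-black tree (JDK 1.8)
The diagram is a little rough, but it is readable.
In JDK 1.8 a HashMap is built from an array, linked lists, and red-black trees. Each slot of the array is called a bucket (bin). A bucket can hold several nodes, organized either as a linked list or as a red-black tree. When a list is used and when a tree is used is explained with the fields below.
First, the fields: some are default constants, others come into play later.
★ Three values related to the red-black tree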
    /**
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     * Treeify threshold of a bin: when an element is added to a bin that
     * already holds at least this many nodes, the bin is converted from a
     * linked list to a red-black tree and all of its nodes become tree nodes
     * (provided the table is large enough, see MIN_TREEIFY_CAPACITY).
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     * Untreeify threshold: when a tree bin is split during a resize and one
     * of the resulting bins holds no more than this many nodes, that bin is
     * converted back to a linked list.
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     * Minimum table capacity for treeification: bins are treeified only
     * when the table capacity is at least this value. In a smaller table
     * an over-long bin triggers a resize instead.
     */
    static final int MIN_TREEIFY_CAPACITY = 64;
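To make the interplay of the three constants concrete, here is a minimal sketch of the decisions they drive. The class BinPolicy and its method names are hypothetical and not part of the JDK; only the three constant values are taken from the source above.

```java
/** Hypothetical helper that mirrors the three thresholds; not JDK code. */
final class BinPolicy {
    static final int TREEIFY_THRESHOLD = 8;
    static final int UNTREEIFY_THRESHOLD = 6;
    static final int MIN_TREEIFY_CAPACITY = 64;

    /**
     * What happens when a new node is appended to a list bin that held
     * nodesBeforeInsert nodes, in a table with the given capacity.
     */
    static String onInsert(int nodesBeforeInsert, int capacity) {
        if (nodesBeforeInsert < TREEIFY_THRESHOLD) {
            return "stay list";
        }
        // A long bin in a small table is handled by growing the table instead.
        return capacity < MIN_TREEIFY_CAPACITY ? "resize table" : "treeify bin";
    }

    /** Whether a tree bin produced by a split during resize goes back to a list. */
    static boolean untreeifyOnSplit(int nodesAfterSplit) {
        return nodesAfterSplit <= UNTREEIFY_THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(onInsert(7, 64)); // stay list
        System.out.println(onInsert(8, 16)); // resize table
        System.out.println(onInsert(8, 64)); // treeify bin
        System.out.println(untreeifyOnSplit(6)); // true
    }
}
```

The gap between 8 and 6 keeps a bin whose size hovers around the boundary from flipping back and forth between the two forms.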
    /**
     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     * The bucket array.
     */
    transient HashMap.Node<K,V>[] table;

    /**
     * Holds cached entrySet(). Note that AbstractMap fields are used
     * for keySet() and values().
     * The cached entry-set view.
     */
    transient Set<Map.Entry<K,V>> entrySet;

    /**
     * The number of key-value mappings contained in this map.
     * The current size of the map.
     */
    transient int size;

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     * modCount records the number of structural modifications. Methods that
     * change the structure, such as put() for a new key, remove() and
     * clear(), increment it, and the iterators check it. HashMap is not
     * thread-safe, so an iterator copies modCount into its expectedModCount
     * field when it is created. If the map is structurally modified during
     * iteration by anything other than the iterator's own remove(), whether
     * from another thread or from the iterating thread itself, modCount no
     * longer equals expectedModCount and the iterator throws
     * ConcurrentModificationException.
     */
    transient int modCount;

    /**
     * The next size value at which to resize (capacity * load factor).
     * The size at which the next resize happens: capacity x load factor.
     * @serial
     */
    int threshold;

    /**
     * The load factor for the hash table.
     * The load factor used when resizing.
     * @serial
     */
    final float loadFactor;
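The modCount and threshold fields are easier to picture with a small experiment. The sketch below is only an illustration: the class FailFastDemo and its methods are hypothetical names, and the threshold helper simply restates the "capacity x load factor" rule from the comment above. It also shows that the fail-fast check fires in a single thread.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo class; not part of the JDK. */
final class FailFastDemo {
    /** Returns true if adding a key while iterating trips the fail-fast check. */
    static boolean modifyWhileIterating() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                // Adding a new key is a structural change, so modCount moves on
                // while the iterator still holds the old expectedModCount.
                map.put(key + "-copy", 0);
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    /** Resize threshold for a given capacity and load factor. */
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        System.out.println(modifyWhileIterating()); // true
        System.out.println(threshold(16, 0.75f));   // 12
    }
}
```

With the defaults (capacity 16, load factor 0.75) the table is resized when the 13th mapping is added.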
Map<String, Object> mymap = new HashMap<String, Object>();
    /**
     * Constructs an empty <tt>HashMap</tt> with the default initial capacity
     * (16) and the default load factor (0.75).
     * The no-arg constructor only assigns the default load factor to mymap;
     * every other field keeps its default value.
     */
    public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
    }
That gives us a HashMap instance, mymap. Now let's put something into it.
mymap.put("hello", "world");
Let's look inside put. Clicking through from the call site lands on the method declared in the Map interface, so go straight to the implementation in HashMap:
    /**
     * Associates the specified value with the specified key in this map.
     * If the map previously contained a mapping for the key, the old
     * value is replaced.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>.
     *         (A <tt>null</tt> return can also indicate that the map
     *         previously associated <tt>null</tt> with <tt>key</tt>.)
     */
    public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

    static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
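Why hash() XORs the high 16 bits into the low 16 bits becomes clearer with two keys whose hash codes differ only in the high bits. The sketch below copies the expression from hash() and the index computation used in putVal. The class HashSpreadDemo and the method names spread and indexFor are hypothetical.

```java
/** Hypothetical demo class; spread() repeats the expression used by HashMap.hash(). */
final class HashSpreadDemo {
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    /** Bucket index for a table whose length is a power of two. */
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        // Integer.hashCode() is the int value itself, so these two keys
        // have hash codes that differ only above bit 15.
        Integer a = 0x0001_0000;
        Integer b = 0x0002_0000;

        // Raw hash codes: the low 4 bits are identical, both hit bucket 0.
        System.out.println(indexFor(a.hashCode(), 16) + " " + indexFor(b.hashCode(), 16)); // 0 0

        // Spread hashes: the high bits now influence the index.
        System.out.println(indexFor(spread(a), 16) + " " + indexFor(spread(b), 16)); // 1 2
    }
}
```

Because the index is taken with a bit mask, only the low bits of the hash select the bucket in a small table. The XOR is a cheap way to let the high bits take part as well.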
put then calls putVal(), passing false and true as the last two arguments. Let's step in:
    /**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key (the hash of the key passed in)
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode; the flag is
     *        handed to the post-insertion hook so it can decide what to do
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        // Values passed in for this walkthrough:
        //   key = "hello", value = "world", onlyIfAbsent = false, evict = true

        // Local variables (renamed for readability; the JDK source calls them tab, p, n, i):
        //   tab       : local reference to the table array
        //   param     : the node found at index i
        //   tabLength : length of the local table
        //   i         : index into tab
        HashMap.Node<K,V>[] tab;
        HashMap.Node<K,V> param;
        int tabLength, i;

        // On the first put the table is still null, so resize() runs
        // (resize() is analyzed later) and the newly allocated table is assigned to tab.
        if ((tab = table) == null || (tabLength = tab.length) == 0) {
            tabLength = (tab = resize()).length;
        }

        // (tabLength - 1) & hash gives the bucket index i.
        // If that bucket is empty, newNode() creates a node and it is stored in tab[i].
        if ((param = tab[i = (tabLength - 1) & hash]) == null) {
            tab[i] = newNode(hash, key, value, null);
        }
        // Otherwise the bucket is occupied: either find the existing node
        // for this key (its value may be replaced) or add a new node to the bin.
        else {
            HashMap.Node<K,V> okNode; // the node that already holds this key, if any
            K tmpK;                   // temporary key

            // First compare the hash of the first node with the hash passed in.
            // If they match, compare the keys with ==, which compares references.
            // If the references differ, fall back to key.equals(tmpK).
            // Object.equals() also compares references by default, but many
            // classes such as String, and possibly user-defined classes, override it.
            // Take a user-defined class that overrides equals() and hashCode() so that
            // two instances with the same id are equal: this check is what lets such
            // keys match, because two distinct instances normally have different
            // references unless one was assigned from the other.
            //
            // Equal hash and equal key means the key already exists.
            if (param.hash == hash &&
                ((tmpK = param.key) == key || (key != null && key.equals(tmpK)))) {
                okNode = param;
            }
            // Otherwise check whether the first node is a TreeNode, i.e. the bin is a red-black tree.
            else if (param instanceof HashMap.TreeNode) {
                // Tree bin: putTreeVal() returns the existing node for this key,
                // or inserts a new tree node and returns null.
                okNode = ((HashMap.TreeNode<K,V>)param).putTreeVal(this, tab, hash, key, value);
            }
            else {
                // Linked-list bin: lookup and append.
                // Left mostly unexplained on purpose, so that I have something
                // to think through when I reread this.
                for (int binCount = 0; ; ++binCount) {
                    if ((okNode = param.next) == null) {
                        param.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) { // -1 for 1st
                            // Treeify the list bin.
                            treeifyBin(tab, hash);
                        }
                        break;
                    }
                    if (okNode.hash == hash &&
                        ((tmpK = okNode.key) == key || (key != null && key.equals(tmpK))))
                        break;
                    param = okNode;
                }
            }

            // okNode, whether a tree node or a list node, is the existing node for this key.
            if (okNode != null) { // existing mapping for key
                V oldValue = okNode.value; // the value currently stored
                // Overwrite unless onlyIfAbsent is true and a non-null value is already there.
                // Either way, the previous value is what gets returned.
                if (!onlyIfAbsent || oldValue == null) {
                    okNode.value = value;
                }
                // Empty in HashMap; a hook that LinkedHashMap overrides.
                afterNodeAccess(okNode);
                return oldValue;
            }
        }

        // A new mapping was added: count one more structural modification.
        ++modCount;

        // If the number of mappings, after incrementing, exceeds the threshold, resize.
        if (++size > threshold) {
            resize();
        }

        // Empty in HashMap; a hook that LinkedHashMap overrides.
        afterNodeInsertion(evict);

        // The key is new, so there is no previous value: return null.
        return null;
    }
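The return value and the onlyIfAbsent flag of putVal can be observed through the public API. A minimal sketch follows, assuming that put passes onlyIfAbsent = false (as shown above) and that HashMap.putIfAbsent passes true. The class PutReturnDemo is a hypothetical name.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical demo class; not part of the JDK. */
final class PutReturnDemo {
    static Object[] run() {
        Map<String, Object> mymap = new HashMap<>();
        Object first = mymap.put("hello", "world");        // new key: no previous value
        Object second = mymap.put("hello", "again");       // existing key: value replaced
        Object third = mymap.putIfAbsent("hello", "nope"); // existing key: value kept
        return new Object[] { first, second, third, mymap.get("hello") };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run())); // [null, world, again, again]
    }
}
```

In every case the method returns the value that was stored before the call, never the value that was passed in.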
    // Callbacks to allow LinkedHashMap post-actions
    void afterNodeAccess(HashMap.Node<K,V> p) { }
    void afterNodeInsertion(boolean evict) { }
    void afterNodeRemoval(HashMap.Node<K,V> p) { }
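These callbacks are package-private, so application code cannot override them directly. Their effect is visible through LinkedHashMap, though. As far as I understand the JDK 8 source, its afterNodeInsertion consults removeEldestEntry, and its afterNodeAccess moves the accessed node to the end when access order is enabled. The sketch below relies only on the public LinkedHashMap API. The class LruCache and its demo method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache built on LinkedHashMap's public extension point. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // third argument: accessOrder = true
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Consulted after each insertion; returning true evicts the eldest entry.
        return size() > maxEntries;
    }

    static String demo() {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // exceeds the limit, so the eldest entry "b" is evicted
        return cache.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [a, c]
    }
}
```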
    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and the default load factor (0.75).
     *
     * @param  initialCapacity the initial capacity.
     * @throws IllegalArgumentException if the initial capacity is negative.
     * Takes an initial capacity and uses the default load factor.
     */
    public HashMap(int initialCapacity) {
        // Delegates to the two-argument constructor below.
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

    /**
     * Constructs an empty <tt>HashMap</tt> with the specified initial
     * capacity and load factor.
     *
     * @param  initialCapacity the initial capacity
     * @param  loadFactor      the load factor
     * @throws IllegalArgumentException if the initial capacity is negative
     *         or the load factor is nonpositive
     * Takes a custom initial capacity and a custom load factor.
     */
    public HashMap(int initialCapacity, float loadFactor) {
        // The initial capacity must not be negative, otherwise an exception is thrown.
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        }
        // An initial capacity above the maximum is clamped to the maximum.
        if (initialCapacity > MAXIMUM_CAPACITY) {
            initialCapacity = MAXIMUM_CAPACITY;
        }
        // The load factor must be a float greater than 0; anything else is rejected.
        if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        }
        this.loadFactor = loadFactor;
        // Round the initial capacity up to a power of two; the table length
        // is always a power of two. The result is parked in threshold until
        // the table is first allocated.
        this.threshold = tableSizeFor(initialCapacity);
    }

    /**
     * Returns a power of two size for the given target capacity.
     * For the given capacity, computes the smallest power of two that is
     * greater than or equal to it.
     */
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    /**
     * Constructs a new <tt>HashMap</tt> with the same mappings as the
     * specified <tt>Map</tt>.  The <tt>HashMap</tt> is created with
     * default load factor (0.75) and an initial capacity sufficient to
     * hold the mappings in the specified <tt>Map</tt>.
     *
     * @param   m the map whose mappings are to be placed in this map
     * @throws  NullPointerException if the specified map is null
     * Initializes the map from another map, using the default load factor.
     */
    public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        // Copy the entries of the given map into the new map.
        putMapEntries(m, false);
    }

    /**
     * Implements Map.putAll and Map constructor.
     *
     * @param m the map
     * @param evict false when initially constructing this map, else
     * true (relayed to method afterNodeInsertion).
     */
    final void putMapEntries(Map<? extends K, ? extends V> m, boolean evict) {
        // Size of the map passed in.
        int size = m.size();
        if (size > 0) {
            // If this map's table has not been allocated yet, pre-size it:
            // work out the capacity needed for `size` entries and keep it in threshold.
            if (table == null) { // pre-size
                float ft = ((float)size / loadFactor) + 1.0F;
                int t = ((ft < (float)MAXIMUM_CAPACITY) ?
                         (int)ft : MAXIMUM_CAPACITY);
                if (t > threshold)
                    threshold = tableSizeFor(t);
            }
            // The table exists: if the incoming size exceeds the threshold, resize first.
            else if (size > threshold) {
                resize();
            }
            // Iterate over m.entrySet() and put every entry.
            // entrySet is worth tracing: nothing in put ever stores entries
            // into the entrySet field, so where do its contents come from?
            for (Map.Entry<? extends K, ? extends V> e : m.entrySet()) {
                K key = e.getKey();
                V value = e.getValue();
                putVal(hash(key), key, value, false, evict);
            }
        }
    }
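The bit-smearing in tableSizeFor is easiest to follow with a few concrete inputs. The sketch below copies the method body from above. The class TableSizeDemo and the presize helper are hypothetical, and MAXIMUM_CAPACITY = 1 << 30 is the value I believe the JDK uses; it does not appear in the excerpt. The helper repeats only the arithmetic of the pre-size branch of putMapEntries and skips its comparison against the current threshold.

```java
/** Hypothetical demo class; tableSizeFor() is copied from the source above. */
final class TableSizeDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30; // assumed JDK value

    static int tableSizeFor(int cap) {
        int n = cap - 1;  // so that an exact power of two maps to itself
        n |= n >>> 1;     // each step copies the highest set bit further right
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;    // now every bit below the highest set bit is 1
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    /** Capacity chosen when copying `size` entries into a map with no table yet. */
    static int presize(int size, float loadFactor) {
        float ft = ((float) size / loadFactor) + 1.0F;
        int t = (ft < (float) MAXIMUM_CAPACITY) ? (int) ft : MAXIMUM_CAPACITY;
        return tableSizeFor(t);
    }

    public static void main(String[] args) {
        System.out.println(tableSizeFor(10));   // 16
        System.out.println(tableSizeFor(16));   // 16
        System.out.println(tableSizeFor(17));   // 32
        System.out.println(presize(12, 0.75f)); // 32
    }
}
```

Note the last line: copying 12 entries asks for 12 / 0.75 + 1 = 17 slots, which rounds up to 32 and not 16.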
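Back to the question of where the contents of entrySet() come from. The entrySet field only caches a view object and stores no entries of its own. What you see when the set is displayed, for example when it is printed, is produced by the toString() method of the abstract class AbstractCollection<E>, which the entry set inherits.
In other words, whenever the contents of entrySet() are needed, this method is what runs: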
    public String toString() {
        Iterator<E> it = iterator();
        if (! it.hasNext())
            return "[]";

        StringBuilder sb = new StringBuilder();
        sb.append('[');
        for (;;) {
            E e = it.next();
            sb.append(e == this ? "(this Collection)" : e);
            if (! it.hasNext())
                return sb.append(']').toString();
            sb.append(',').append(' ');
        }
    }
That method in turn calls iterator(), which for the entry set is defined in HashMap's inner class EntrySet:
    final class EntrySet extends AbstractSet<Map.Entry<K,V>> {
        public final int size()                 { return size; }
        public final void clear()               { HashMap.this.clear(); }
        public final Iterator<Map.Entry<K,V>> iterator() {
            return new EntryIterator();
        }
        public final boolean contains(Object o) {
            if (!(o instanceof Map.Entry))
                return false;
            Map.Entry<?,?> e = (Map.Entry<?,?>) o;
            Object key = e.getKey();
            Node<K,V> candidate = getNode(hash(key), key);
            return candidate != null && candidate.equals(e);
        }
        public final boolean remove(Object o) {
            if (o instanceof Map.Entry) {
                Map.Entry<?,?> e = (Map.Entry<?,?>) o;
                Object key = e.getKey();
                Object value = e.getValue();
                return removeNode(hash(key), key, value, true, true) != null;
            }
            return false;
        }
        public final Spliterator<Map.Entry<K,V>> spliterator() {
            return new EntrySpliterator<>(HashMap.this, 0, -1, 0, 0);
        }
        public final void forEach(Consumer<? super Map.Entry<K,V>> action) {
            Node<K,V>[] tab;
            if (action == null)
                throw new NullPointerException();
            if (size > 0 && (tab = table) != null) {
                int mc = modCount;
                for (int i = 0; i < tab.length; ++i) {
                    for (Node<K,V> e = tab[i]; e != null; e = e.next)
                        action.accept(e);
                }
                if (modCount != mc)
                    throw new ConcurrentModificationException();
            }
        }
    }
iterator() in turn creates a new EntryIterator, so the entries are produced lazily: each call to next() returns the next node from the table.
    final class EntryIterator extends HashIterator
        implements Iterator<Map.Entry<K,V>> {
        public final Map.Entry<K,V> next() { return nextNode(); }
    }
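That the entry set is a lazy view and not a second copy of the data can be checked from outside. The sketch below uses only the public API. The class EntrySetViewDemo is a hypothetical name, and the identity check relies on the caching described by the "Holds cached entrySet()" comment on the field.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical demo class; not part of the JDK. */
final class EntrySetViewDemo {
    static String run() {
        Map<String, String> mymap = new HashMap<>();
        // Take the view while the map is still empty.
        Set<Map.Entry<String, String>> entries = mymap.entrySet();
        // A second call hands back the same cached view object.
        boolean sameView = entries == mymap.entrySet();
        // Nothing here touches `entries`, yet it will show the new mapping,
        // because size() and iterator() read the map's table directly.
        mymap.put("hello", "world");
        return sameView + " " + entries.size() + " " + entries;
    }

    public static void main(String[] args) {
        System.out.println(run()); // true 1 [hello=world]
    }
}
```

The printed form of `entries` is built by the inherited toString(), which walks an EntryIterator over the table at the moment of the call.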