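[Original] A Brief Look at the Source of Common Java Classes in Android: HashMap

Since this is only a brief look, I cover just the commonly used interfaces. Note that these are the Java classes shipped with Android, so the code may differ from the JDK source.

Let's start with the constructors: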

    /**
     * Min capacity (other than zero) for a HashMap. Must be a power of two
     * greater than 1 (and less than 1 << 30).
     */
    private static final int MINIMUM_CAPACITY = 4;

    /**
     * Max capacity for a HashMap. Must be a power of two >= MINIMUM_CAPACITY.
     */
    private static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * An empty table shared by all zero-capacity maps (typically from default
     * constructor). It is never written to, and replaced on first put. Its size
     * is set to half the minimum, so that the first resize will create a
     * minimum-sized table.
     */
    private static final Entry[] EMPTY_TABLE
            = new HashMapEntry[MINIMUM_CAPACITY >>> 1];

    /**
     * The default load factor. Note that this implementation ignores the
     * load factor, but cannot do away with it entirely because it's
     * mentioned in the API.
     *
     * <p>Note that this constant has no impact on the behavior of the program,
     * but it is emitted as part of the serialized form. The load factor of
     * .75 is hardwired into the program, which uses cheap shifts in place of
     * expensive division.
     */
    static final float DEFAULT_LOAD_FACTOR = .75F;

    /**
     * The hash table. If this hash map contains a mapping for null, it is
     * not represented this hash table.
     */
    transient HashMapEntry<K, V>[] table;

    /**
     * The entry representing the null key, or null if there's no such mapping.
     */
    transient HashMapEntry<K, V> entryForNullKey;

    /**
     * The number of mappings in this hash map.
     */
    transient int size;

    /**
     * Incremented by "structural modifications" to allow (best effort)
     * detection of concurrent modification.
     */
    transient int modCount;

    /**
     * The table is rehashed when its size exceeds this threshold.
     * The value of this field is generally .75 * capacity, except when
     * the capacity is zero, as described in the EMPTY_TABLE declaration
     * above.
     */
    private transient int threshold;

    public HashMap() {
        table = (HashMapEntry<K, V>[]) EMPTY_TABLE;
        threshold = -1; // Forces first put invocation to replace EMPTY_TABLE
    }

    public HashMap(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("Capacity: " + capacity);
        }

        if (capacity == 0) {
            @SuppressWarnings("unchecked")
            HashMapEntry<K, V>[] tab = (HashMapEntry<K, V>[]) EMPTY_TABLE;
            table = tab;
            threshold = -1; // Forces first put() to replace EMPTY_TABLE
            return;
        }

        if (capacity < MINIMUM_CAPACITY) {
            capacity = MINIMUM_CAPACITY;
        } else if (capacity > MAXIMUM_CAPACITY) {
            capacity = MAXIMUM_CAPACITY;
        } else {
            capacity = Collections.roundUpToPowerOfTwo(capacity);
        }
        makeTable(capacity);
    }

    public HashMap(int capacity, float loadFactor) {
        this(capacity);

        if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Load factor: " + loadFactor);
        }

        /*
         * Note that this implementation ignores loadFactor; it always uses
         * a load factor of 3/4. This simplifies the code and generally
         * improves performance.
         */
    }

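From the source of the three constructors we can see that:

HashMap is implemented internally with an array of HashMapEntry.
Calling new HashMap() does not allocate a new array: table points to the shared EMPTY_TABLE, whose length is 2 (MINIMUM_CAPACITY >>> 1), and threshold is set to -1 so that the first put replaces it.
Calling HashMap(int capacity) rounds the requested capacity up to the smallest power of two that is greater than or equal to it, for example capacity=25 becomes 32. Values from 1 to 3 are raised to MINIMUM_CAPACITY (4), values above MAXIMUM_CAPACITY are capped at 1 << 30, and 0 behaves like the default constructor. threshold is set to 75% of the capacity; once the number of entries exceeds threshold, the table is enlarged.
HashMap(int capacity, float loadFactor) behaves the same as HashMap(int capacity): loadFactor is only validated (it must be positive and not NaN) and is otherwise unused. Note that this is an Android-specific change; the JDK does use this parameter.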
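
The capacity rules above can be sketched in isolation. This is a minimal sketch and not the Android code: the class CapacitySketch and its methods tableLength and thresholdFor are hypothetical names, and roundUpToPowerOfTwo is a stand-in written with Integer.highestOneBit, on the assumption that it matches what the listing's Collections.roundUpToPowerOfTwo does (that method is not part of the public java.util.Collections API).

```java
final class CapacitySketch {
    static final int MINIMUM_CAPACITY = 4;
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Stand-in (assumption) for the non-public Collections.roundUpToPowerOfTwo.
    // Only meant for inputs in [1, 1 << 30].
    static int roundUpToPowerOfTwo(int i) {
        int highest = Integer.highestOneBit(i);
        return (highest == i) ? i : highest << 1;
    }

    // Length of the table that a constructor call with this capacity ends up with.
    static int tableLength(int requested) {
        if (requested < 0) {
            throw new IllegalArgumentException("Capacity: " + requested);
        }
        if (requested == 0) {
            return MINIMUM_CAPACITY >>> 1; // the shared EMPTY_TABLE
        }
        if (requested < MINIMUM_CAPACITY) {
            return MINIMUM_CAPACITY;
        }
        if (requested > MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        return roundUpToPowerOfTwo(requested);
    }

    // Threshold stored next to the table: -1 for the empty table, else 3/4 of the length.
    static int thresholdFor(int requested) {
        if (requested == 0) {
            return -1;
        }
        int length = tableLength(requested);
        return (length >> 1) + (length >> 2);
    }

    public static void main(String[] args) {
        for (int requested : new int[] {0, 3, 25, 32, 33}) {
            System.out.println(requested + " -> length " + tableLength(requested)
                    + ", threshold " + thresholdFor(requested));
        }
    }
}
```

For a requested capacity of 25 this gives a table of length 32 with a threshold of 24.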

Since the map is built on an array of HashMapEntry, let's take a quick look at what this entry looks like:

    static class HashMapEntry<K, V> implements Entry<K, V> {
        final K key;
        V value;
        final int hash;
        HashMapEntry<K, V> next;

        HashMapEntry(K key, V value, int hash, HashMapEntry<K, V> next) {
            this.key = key;
            this.value = value;
            this.hash = hash;
            this.next = next;
        }
    }

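Pay attention to the next field here. Readers with some experience will recognize it as a singly linked list, so each slot of the array behind HashMap is actually the head of a singly linked list. Let's read on.

Next, let's analyze the put(K key, V value) method: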
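
To make the role of next concrete, here is a small sketch of one bucket holding two colliding entries. It is an illustration only: BucketChainSketch, Node, prepend and describe are hypothetical names, the keys are fixed to String for brevity, and the hash values 1 and 5 are made up so that both fall into slot 1 of a 4-slot table.

```java
final class BucketChainSketch {
    // Simplified, non-generic stand-in for HashMapEntry.
    static final class Node {
        final String key;
        int value;
        final int hash;
        Node next;

        Node(String key, int value, int hash, Node next) {
            this.key = key;
            this.value = value;
            this.hash = hash;
            this.next = next;
        }
    }

    // Links a new node in front of the current head and returns the new head.
    static Node prepend(Node head, String key, int value, int hash) {
        return new Node(key, value, hash, head);
    }

    // Walks the chain from the head by following next until it is null.
    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node e = head; e != null; e = e.next) {
            sb.append(e.key).append('=').append(e.value);
            if (e.next != null) {
                sb.append(" -> ");
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node[] table = new Node[4];
        // 1 & 3 == 1 and 5 & 3 == 1, so both entries share slot 1.
        table[1] = prepend(table[1], "a", 1, 1);
        table[1] = prepend(table[1], "b", 2, 5);
        System.out.println(describe(table[1])); // prints "b=2 -> a=1"
    }
}
```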

    void addNewEntryForNullKey(V value) {
        entryForNullKey = new HashMapEntry<K, V>(null, value, 0, null);
    }

    private V putValueForNullKey(V value) {
        HashMapEntry<K, V> entry = entryForNullKey;
        if (entry == null) {
            addNewEntryForNullKey(value);
            size++;
            modCount++;
            return null;
        } else {
            preModify(entry);
            V oldValue = entry.value;
            entry.value = value;
            return oldValue;
        }
    }

    private HashMapEntry<K, V>[] makeTable(int newCapacity) {
        @SuppressWarnings("unchecked") HashMapEntry<K, V>[] newTable
                = (HashMapEntry<K, V>[]) new HashMapEntry[newCapacity];
        table = newTable;
        threshold = (newCapacity >> 1) + (newCapacity >> 2); // 3/4 capacity
        return newTable;
    }

    private HashMapEntry<K, V>[] doubleCapacity() {
        HashMapEntry<K, V>[] oldTable = table;
        int oldCapacity = oldTable.length;
        if (oldCapacity == MAXIMUM_CAPACITY) {
            return oldTable;
        }
        int newCapacity = oldCapacity * 2;
        HashMapEntry<K, V>[] newTable = makeTable(newCapacity);
        if (size == 0) {
            return newTable;
        }

        for (int j = 0; j < oldCapacity; j++) {
            /*
             * Rehash the bucket using the minimum number of field writes.
             * This is the most subtle and delicate code in the class.
             */
            HashMapEntry<K, V> e = oldTable[j];
            if (e == null) {
                continue;
            }
            int highBit = e.hash & oldCapacity;
            HashMapEntry<K, V> broken = null;
            newTable[j | highBit] = e;
            for (HashMapEntry<K, V> n = e.next; n != null; e = n, n = n.next) {
                int nextHighBit = n.hash & oldCapacity;
                if (nextHighBit != highBit) {
                    if (broken == null)
                        newTable[j | nextHighBit] = n;
                    else
                        broken.next = n;
                    broken = e;
                    highBit = nextHighBit;
                }
            }
            if (broken != null)
                broken.next = null;
        }
        return newTable;
    }

    @Override public V put(K key, V value) {
        if (key == null) {
            return putValueForNullKey(value);
        }

        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                preModify(e);
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }

        // No entry for (non-null) key is present; create one
        modCount++;
        if (size++ > threshold) {
            tab = doubleCapacity();
            index = hash & (tab.length - 1);
        }
        addNewEntry(key, value, hash, index);
        return null;
    }

    void addNewEntry(K key, V value, int hash, int index) {
        table[index] = new HashMapEntry<K, V>(key, value, hash, table[index]);
    }

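From the source of put(K key, V value) we learn the following:

A value added with a null key is stored in a separate HashMapEntry object, entryForNullKey.
The index into the entry array is computed from the hash: int index = hash & (tab.length - 1).
When a collision occurs (that is, an entry already exists at the computed index), the chain is first searched for an entry with the same hash and an equal key. If one is found, its value is updated and the old value is returned directly.
A newly created entry becomes the head of the linked list at its index.
When the number of entries exceeds threshold (75% of capacity), the array is doubled. Existing entries are relinked into the new array rather than copied: depending on the bit hash & oldCapacity, an entry from old bucket j ends up in bucket j or in bucket j | oldCapacity.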
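
The index formula and the effect of doubling can be checked with a few numbers. The sketch below is only an illustration: ResizeSplitSketch and indexFor are hypothetical names, and the hash values are made up and assumed to be already mixed (the real code obtains them from Collections.secondaryHash, which is not reproduced here).

```java
final class ResizeSplitSketch {
    // Same masking as in put(): valid because tableLength is a power of two.
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }

    public static void main(String[] args) {
        int oldCapacity = 8;
        int newCapacity = oldCapacity * 2;
        // Hypothetical hashes that all land in bucket 3 of an 8-slot table.
        int[] hashes = {3, 11, 19, 27};
        for (int hash : hashes) {
            int oldIndex = indexFor(hash, oldCapacity);
            int newIndex = indexFor(hash, newCapacity);
            // doubleCapacity() derives the same index as j | (hash & oldCapacity).
            int viaHighBit = oldIndex | (hash & oldCapacity);
            System.out.println("hash " + hash + ": old " + oldIndex
                    + ", new " + newIndex + ", via high bit " + viaHighBit);
        }
    }
}
```

Hashes 3 and 19 stay in bucket 3 while 11 and 27 move to bucket 11, so one old chain splits into at most two new chains and no full re-hash of the keys is needed.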

With put analyzed, the other methods such as get, remove and containsKey are much the same, so I skip them here.

Next, let's look at the Set<K> keySet() method:

    @Override public Set<K> keySet() {
        Set<K> ks = keySet;
        return (ks != null) ? ks : (keySet = new KeySet());
    }

    private final class KeySet extends AbstractSet<K> {
        public Iterator<K> iterator() {
            return newKeyIterator();
        }
        public int size() {
            return size;
        }
        public boolean isEmpty() {
            return size == 0;
        }
        public boolean contains(Object o) {
            return containsKey(o);
        }
        public boolean remove(Object o) {
            int oldSize = size;
            HashMap.this.remove(o);
            return size != oldSize;
        }
        public void clear() {
            HashMap.this.clear();
        }
    }

    Iterator<K> newKeyIterator() { return new KeyIterator();   }

    private final class KeyIterator extends HashIterator
            implements Iterator<K> {
        public K next() { return nextEntry().key; }
    }

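From the source we can conclude:

The Set returned by keySet is tightly coupled to the HashMap: calls on the Set actually operate on the HashMap itself.
The iterator of the Set is also built on HashIterator.
entrySet() and values() are implemented on the same principle as keySet().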
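
The view behaviour is part of the documented Map.keySet() contract, so it can be observed with the public java.util.HashMap API alone. The following is a small sketch; KeySetViewSketch and demo are hypothetical names used only for this example.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

final class KeySetViewSketch {
    // Runs a few operations through the key set and returns the backing map.
    static Map<String, Integer> demo() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);

        Set<String> keys = map.keySet();
        keys.remove("a");  // removes the mapping for "a" from the map itself
        map.put("d", 4);   // visible through keys without calling keySet() again
        if (!keys.contains("d")) {
            throw new IllegalStateException("key set should reflect later puts");
        }

        // Removing through the iterator also removes from the map.
        for (Iterator<String> it = keys.iterator(); it.hasNext();) {
            if ("b".equals(it.next())) {
                it.remove();
            }
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = demo();
        System.out.println("size = " + map.size());
        System.out.println("contains a = " + map.containsKey("a"));
        System.out.println("contains b = " + map.containsKey("b"));
    }
}
```

After these calls only the mappings for "c" and "d" remain, although the map was never modified directly except through put.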

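Knowing how HashMap is implemented, we can see its strengths and weaknesses:

Strength: reads and writes are fast, close to indexing into an array.

Weakness: it can tie up a lot of unused memory. The capacity of the entry array is always a power of two (which is what allows the index to be computed with a bit mask), and the array doubles as soon as the number of entries exceeds 75% of the capacity.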
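
To get a feel for the memory cost, the growth policy from the put and doubleCapacity listings can be replayed with plain counters. This is a simulation under assumptions, not a measurement: GrowthSketch and tableLengthAfter are hypothetical names, the map is assumed to start from the default constructor, and all keys are assumed to be distinct and non-null.

```java
final class GrowthSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Table length after inserting the given number of distinct non-null keys.
    static int tableLengthAfter(int distinctKeys) {
        int length = 2;      // length of the shared EMPTY_TABLE
        int threshold = -1;  // forces the first put to grow the table
        int size = 0;
        for (int i = 0; i < distinctKeys; i++) {
            boolean overThreshold = size++ > threshold; // same comparison as in put()
            if (overThreshold && length < MAXIMUM_CAPACITY) {
                length *= 2;
                threshold = (length >> 1) + (length >> 2); // 3/4 of the new length
            }
        }
        return length;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 4, 5, 769, 770, 1000}) {
            int length = tableLengthAfter(n);
            System.out.println(n + " entries -> table length " + length
                    + ", at least " + (length - n) + " empty slots");
        }
    }
}
```

Under these assumptions 1000 entries end up in a table of 2048 slots, so more than half of the slots are empty, and each entry additionally costs one HashMapEntry object.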

If you spot any problems, please point them out!

Please credit the source when reposting.

Original article: https://www.cnblogs.com/coding-way/p/5436180.html