• HashMap解惑


     HashMap中有一些我们容易忽视的点

    1. 关于key的hash和equals

    public V put(K key, V value) {
            if (table == EMPTY_TABLE) {
                inflateTable(threshold);
            }
            if (key == null)
                return putForNullKey(value);
            int hash = hash(key);
            int i = indexFor(hash, table.length);
            for (Entry<K,V> e = table[i]; e != null; e = e.next) {
                Object k;
                if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                    V oldValue = e.value;
                    e.value = value;
                    e.recordAccess(this);
                    return oldValue;
                }
            }
    
            modCount++;
            addEntry(hash, key, value, i);
            return null;
        }

     由上述代码知道,hash值是用来确定bucketIndex,equals是用来和链表上的值比较,因此对于key是自定义的类,强烈建议重写hashCode和equals方法。此处也是容易引起内存泄露的点。

    下面摘抄一段JDK里面的注释,

    /**
         * Returns a hash code value for the object. This method is
         * supported for the benefit of hash tables such as those provided by
         * {@link java.util.HashMap}.
         * <p>
         * The general contract of {@code hashCode} is:
         * <ul>
         * <li>Whenever it is invoked on the same object more than once during
         *     an execution of a Java application, the {@code hashCode} method
         *     must consistently return the same integer, provided no information
         *     used in {@code equals} comparisons on the object is modified.
         *     This integer need not remain consistent from one execution of an
         *     application to another execution of the same application.
         * <li>If two objects are equal according to the {@code equals(Object)}
         *     method, then calling the {@code hashCode} method on each of
         *     the two objects must produce the same integer result.
         * <li>It is <em>not</em> required that if two objects are unequal
         *     according to the {@link java.lang.Object#equals(java.lang.Object)}
         *     method, then calling the {@code hashCode} method on each of the
         *     two objects must produce distinct integer results.  However, the
         *     programmer should be aware that producing distinct integer results
         *     for unequal objects may improve the performance of hash tables.
         * </ul>
         * <p>
         * As much as is reasonably practical, the hashCode method defined by
         * class {@code Object} does return distinct integers for distinct
         * objects. (This is typically implemented by converting the internal
         * address of the object into an integer, but this implementation
         * technique is not required by the
         * Java<font size="-2"><sup>TM</sup></font> programming language.)
         *
         * @return  a hash code value for this object.
         * @see     java.lang.Object#equals(java.lang.Object)
         * @see     java.lang.System#identityHashCode
         */
        public native int hashCode();

    2. rehash的条件

    void addEntry(int hash, K key, V value, int bucketIndex) {
            if ((size >= threshold) && (null != table[bucketIndex])) {
                resize(2 * table.length);
                hash = (null != key) ? hash(key) : 0;
                bucketIndex = indexFor(hash, table.length);
            }
    
            createEntry(hash, key, value, bucketIndex);
        }

     if条件告诉我们rehash的条件要同时满足两个:map中元素个数不小于阀值即容量*负载因子,对应的bucketIndex处有元素。

     另外,如下代码作备忘,

    static int indexFor(int h, int length) {
            // assert Integer.bitCount(length) == 1 : "length must be a non-zero power of 2";
            return h & (length-1);
        }

    3. 可以插入(null, null)吗

    Map<String, String> map = new HashMap<String, String>();
            map.put(null, null);
            System.out.println(map.size()); // 1
    
    private V putForNullKey(V value) {
            for (Entry<K,V> e = table[0]; e != null; e = e.next) {
                if (e.key == null) {
                    V oldValue = e.value;
                    e.value = value;
                    e.recordAccess(this);
                    return oldValue;
                }
            }
            modCount++;
            addEntry(0, null, value, 0); // hash = 0, bucketIndex = 0
            return null;
        }
    
    

    注意,Hashtable和ConcurrentHashMap进行put时若value为null,将抛出NullPointerException。 

    4. table默认初始大小 - 16

    public HashMap(int initialCapacity, float loadFactor) {
            // ...
    
            this.loadFactor = loadFactor; // 0.75
            threshold = initialCapacity; // 16
            init(); // nothing
        }
    
    public V put(K key, V value) {
            if (table == EMPTY_TABLE) {
                inflateTable(threshold);
            }
            // ...
    }
    
    private void inflateTable(int toSize) {
            // Find a power of 2 >= toSize
            int capacity = roundUpToPowerOf2(toSize); // 16
    
            threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
            table = new Entry[capacity];
            initHashSeedAsNeeded(capacity);
        }
    
    private static int roundUpToPowerOf2(int number) {
            // assert number >= 0 : "number must be non-negative";
            return number >= MAXIMUM_CAPACITY
                    ? MAXIMUM_CAPACITY
                    : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
        }

    5. 关于HashMap里的hash(Object key)方法

    final int hash(Object k) {
            int h = hashSeed;
            if (0 != h && k instanceof String) {
                return sun.misc.Hashing.stringHash32((String) k);
            }
    
            h ^= k.hashCode();
    
            // This function ensures that hashCodes that differ only by
            // constant multiples at each bit position have a bounded
            // number of collisions (approximately 8 at default load factor).
            h ^= (h >>> 20) ^ (h >>> 12);
            return h ^ (h >>> 7) ^ (h >>> 4);
        }
    
    /**
         * Initialize the hashing mask value. We defer initialization until we
         * really need it.
         */
        final boolean initHashSeedAsNeeded(int capacity) {
            boolean currentAltHashing = hashSeed != 0;
            boolean useAltHashing = sun.misc.VM.isBooted() &&
                    (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
            boolean switching = currentAltHashing ^ useAltHashing;
            if (switching) {
                hashSeed = useAltHashing
                    ? sun.misc.Hashing.randomHashSeed(this)
                    : 0;
            }
            return switching;
        }

     6.我来告诉大伙啥叫线程安全问题,比如,hashmap在非线程安全下可能放入重复键值对或抹掉其他线程将要放入的值,归根结底是是链表的不安全。解决线程安全,无非同步或CAS

     7.元素为什么放在链表的头部:因为单链表,放头部最快

     8.LinkedHashMap的实现:继承于HashMap,内部的Entry也继承于HashMap.Entry,增加了属性before,after,另外,重写了get,addEntry,createEntry方法,提供了iterator相关方法。

     9.TreeMap的实现,重点看put方法

    public V put(K key, V value) {
            Entry<K,V> t = root;
            if (t == null) {
                compare(key, key); // type (and possibly null) check
    
                root = new Entry<>(key, value, null);
                size = 1;
                modCount++;
                return null;
            }
            int cmp;
            Entry<K,V> parent;
            // split comparator and comparable paths
            Comparator<? super K> cpr = comparator;
            if (cpr != null) {
                do {
                    parent = t;
                    cmp = cpr.compare(key, t.key);
                    if (cmp < 0)
                        t = t.left;
                    else if (cmp > 0)
                        t = t.right;
                    else
                        return t.setValue(value);
                } while (t != null);
            }
            else {
                if (key == null)
                    throw new NullPointerException();
                Comparable<? super K> k = (Comparable<? super K>) key;
                do {
                    parent = t;
                    cmp = k.compareTo(t.key);
                    if (cmp < 0)
                        t = t.left;
                    else if (cmp > 0)
                        t = t.right;
                    else
                        return t.setValue(value);
                } while (t != null);
            }
            Entry<K,V> e = new Entry<>(key, value, parent);
            if (cmp < 0)
                parent.left = e;
            else
                parent.right = e;
            fixAfterInsertion(e);
            size++;
            modCount++;
            return null;
        }

      10. jdk7 ConcurrentHashMap的get,size,put方法

    public V get(Object key) {
            Segment<K,V> s; // manually integrate access methods to reduce overhead
            HashEntry<K,V>[] tab;
            int h = hash(key);
            long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
            if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
                (tab = s.table) != null) {
                for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                         (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
                     e != null; e = e.next) {
                    K k;
                    if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                        return e.value;
                }
            }
            return null;
        }
    public int size() {
            // Try a few times to get accurate count. On failure due to
            // continuous async changes in table, resort to locking.
            final Segment<K,V>[] segments = this.segments;
            int size;
            boolean overflow; // true if size overflows 32 bits
            long sum;         // sum of modCounts
            long last = 0L;   // previous sum
            int retries = -1; // first iteration isn't retry
            try {
                for (;;) {
                    if (retries++ == RETRIES_BEFORE_LOCK) {
                        for (int j = 0; j < segments.length; ++j)
                            ensureSegment(j).lock(); // force creation
                    }
                    sum = 0L;
                    size = 0;
                    overflow = false;
                    for (int j = 0; j < segments.length; ++j) {
                        Segment<K,V> seg = segmentAt(segments, j);
                        if (seg != null) {
                            sum += seg.modCount;
                            int c = seg.count;
                            if (c < 0 || (size += c) < 0)
                                overflow = true;
                        }
                    }
                    if (sum == last)
                        break;
                    last = sum;
                }
            } finally {
                if (retries > RETRIES_BEFORE_LOCK) {
                    for (int j = 0; j < segments.length; ++j)
                        segmentAt(segments, j).unlock();
                }
            }
            return overflow ? Integer.MAX_VALUE : size;
        }

      

    public V put(K key, V value) {
            Segment<K,V> s;
            if (value == null)
                throw new NullPointerException();
            int hash = hash(key);
            int j = (hash >>> segmentShift) & segmentMask;
            if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
                 (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
                s = ensureSegment(j);
            return s.put(key, hash, value, false);
        }
    
    public V putIfAbsent(K key, V value) {
            Segment<K,V> s;
            if (value == null)
                throw new NullPointerException();
            int hash = hash(key);
            int j = (hash >>> segmentShift) & segmentMask;
            if ((s = (Segment<K,V>)UNSAFE.getObject
                 (segments, (j << SSHIFT) + SBASE)) == null)
                s = ensureSegment(j);
            return s.put(key, hash, value, true);
        }
    
    final V put(K key, int hash, V value, boolean onlyIfAbsent) {
                HashEntry<K,V> node = tryLock() ? null :
                    scanAndLockForPut(key, hash, value);
                V oldValue;
                try {
                    HashEntry<K,V>[] tab = table;
                    int index = (tab.length - 1) & hash;
                    HashEntry<K,V> first = entryAt(tab, index);
                    for (HashEntry<K,V> e = first;;) {
                        if (e != null) {
                            K k;
                            if ((k = e.key) == key ||
                                (e.hash == hash && key.equals(k))) {
                                oldValue = e.value;
                                if (!onlyIfAbsent) {
                                    e.value = value;
                                    ++modCount;
                                }
                                break;
                            }
                            e = e.next;
                        }
                        else {
                            if (node != null)
                                node.setNext(first);
                            else
                                node = new HashEntry<K,V>(hash, key, value, first);
                            int c = count + 1;
                            if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                                rehash(node);
                            else
                                setEntryAt(tab, index, node);
                            ++modCount;
                            count = c;
                            oldValue = null;
                            break;
                        }
                    }
                } finally {
                    unlock();
                }
                return oldValue;
            }
    
    /**
             * Scans for a node containing given key while trying to
             * acquire lock, creating and returning one if not found. Upon
             * return, guarantees that lock is held. UNlike in most
             * methods, calls to method equals are not screened: Since
             * traversal speed doesn't matter, we might as well help warm
             * up the associated code and accesses as well.
             *
             * @return a new node if key not found, else null
             */
            private HashEntry<K,V> scanAndLockForPut(K key, int hash, V value) {
                HashEntry<K,V> first = entryForHash(this, hash);
                HashEntry<K,V> e = first;
                HashEntry<K,V> node = null;
                int retries = -1; // negative while locating node
                while (!tryLock()) {
                    HashEntry<K,V> f; // to recheck first below
                    if (retries < 0) {
                        if (e == null) {
                            if (node == null) // speculatively create node
                                node = new HashEntry<K,V>(hash, key, value, null);
                            retries = 0;
                        }
                        else if (key.equals(e.key))
                            retries = 0;
                        else
                            e = e.next;
                    }
                    else if (++retries > MAX_SCAN_RETRIES) {
                        lock();
                        break;
                    }
                    else if ((retries & 1) == 0 &&
                             (f = entryForHash(this, hash)) != first) {
                        e = first = f; // re-traverse if entry changed
                        retries = -1;
                    }
                }
                return node;
            }
  • 相关阅读:
    Hadoop专有名词
    mapreduce的cleanUp和setUp的特殊用法(TopN问题)和常规用法
    Hadoop 中文编码相关问题 -- mapreduce程序处理GBK编码数据并输出GBK编码数据
    MapReduce核心
    技术提升
    Linq使用group by
    C#中DllImport用法
    mysql分组后将未分组的列合并成行GROUP BY,GROUP_CONCAT
    MVC之CodeFirst
    MVC的Forms登录验证
  • 原文地址:https://www.cnblogs.com/ainima/p/6331709.html
Copyright © 2020-2023  润新知