Digging Deeper into ThreadLocal

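Articles analyzing the ThreadLocal source code are a dime a dozen, not only because the source is concise but also because it was written by two masters of the Java world, Josh Bloch and Doug Lea. Josh Bloch made outstanding contributions to the Java platform during his many years at Sun, including the JDK 5 language enhancements and the Java Collections Framework; he later joined Google and is the author of the award-winning Effective Java and Effective Java: Second Edition. Doug Lea is the author of the JUC (java.util.concurrent) package and a leading authority on Java concurrency. I started this post because I had long been puzzled by the magic number 0x61c88647 in ThreadLocal. To make the material reasonably complete I followed the usual structure, and while writing I was reminded once again that even with something you think you know well, careful thought and summarizing still pays off. I recommend that you start writing too.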

1. What ThreadLocal is for
2. ThreadLocal design principles
3. ThreadLocalMap source code walkthrough
4. Best practices and memory leaks
5. Conclusion
Reference

1. What ThreadLocal is for

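The ThreadLocal class can be seen as a utility that gives each thread its own local variables. In other words, once a ThreadLocal is defined, each thread's reads and writes to it are isolated from those of other threads, so they do not affect one another and are thread-safe. The counterpart of a local variable is a shared variable: when several threads share one variable, concurrent reads and writes are not thread-safe, and the usual remedy is locking. Whether to lock or to use thread-local variables is a trade-off we have to make case by case.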

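Let's first look at how ThreadLocal is used. The following example comes from the Javadoc in the source code: it generates a unique id for each thread. The first time a thread calls get() on the ThreadLocal (without a prior set()), initialValue() is invoked, so a thread obtains its unique id when it calls ThreadId.get():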

import java.util.concurrent.atomic.AtomicInteger;

 public class ThreadId {
     // Atomic integer containing the next thread ID to be assigned
     private static final AtomicInteger nextId = new AtomicInteger(0);

     // Thread local variable containing each thread's ID
     private static final ThreadLocal<Integer> threadId = new ThreadLocal<Integer>() {
         @Override
         protected Integer initialValue() {
             return nextId.getAndIncrement();
         }
     };

     // Returns the current thread's unique ID, assigning it if necessary
     public static int get() {
         return threadId.get();
     }
 }
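To see the isolation in action, here is a minimal sketch of my own (not part of the JDK example) that starts a few threads and records the id each one sees. The class name ThreadIdDemo and the helper collectIds are made up for illustration, and ThreadLocal.withInitial (Java 8+) stands in for the anonymous subclass above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadIdDemo {
    private static final AtomicInteger nextId = new AtomicInteger(0);

    // Same idea as ThreadId above, written with the Java 8 factory method
    private static final ThreadLocal<Integer> threadId =
            ThreadLocal.withInitial(nextId::getAndIncrement);

    /** Starts n threads; each reads its id twice and records it if both reads agree. */
    static Set<Integer> collectIds(int n) {
        Set<Integer> ids = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[n];
        for (int k = 0; k < n; k++) {
            threads[k] = new Thread(() -> {
                int first = threadId.get();   // triggers the initializer once per thread
                int second = threadId.get();  // returns the value already stored for this thread
                if (first == second) {
                    ids.add(first);
                }
            });
            threads[k].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        // Every thread should have received its own id
        System.out.println(collectIds(4).size()); // 4
    }
}
```

Each thread sees a stable id of its own, and no two threads share one, even though they all go through the same static ThreadLocal.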

2. ThreadLocal design principles

2.1 ThreadLocal and threads

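Working from the outside in, let's first review ThreadLocal's main methods: get(), set(T value), remove(), and the protected initialValue().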

Simple, isn't it? The source of get() is as follows:

    /**
     * Returns the value in the current thread's copy of this
     * thread-local variable.  If the variable has no value for the
     * current thread, it is first initialized to the value returned
     * by an invocation of the {@link #initialValue} method.
     *
     * @return the current thread's value of this thread-local
     */
    public T get() {
        // Get the current thread, then fetch its ThreadLocalMap
        Thread t = Thread.currentThread();
        ThreadLocalMap map = getMap(t);
        if (map != null) {
            // Look up the Entry for this ThreadLocal in the ThreadLocalMap and return its value
            ThreadLocalMap.Entry e = map.getEntry(this);
            if (e != null) {
                @SuppressWarnings("unchecked")
                T result = (T)e.value;
                return result;
            }
        }
        return setInitialValue();
    }

    /**
     * Get the map associated with a ThreadLocal. Overridden in
     * InheritableThreadLocal.
     *
     * @param  t the current thread
     * @return the map
     */
    ThreadLocalMap getMap(Thread t) {
        return t.threadLocals;
    }

Thread.java, in turn, has an instance field threadLocals of type ThreadLocal.ThreadLocalMap, which means every thread has its own ThreadLocalMap:

ThreadLocal.ThreadLocalMap threadLocals = null;

The source above is easy to follow, and we can already see that the essence of ThreadLocal lies in ThreadLocalMap. We will read the ThreadLocalMap source in detail later.

2.2 Open addressing and linear probing

ThreadLocalMap is called a map, but it is not a java.util.Map. Let's look at its storage structure first:

static class ThreadLocalMap {
        // Entry extends WeakReference, so the ThreadLocal<?> k used as the key is held weakly
        static class Entry extends WeakReference<ThreadLocal<?>> {
            // The value stored for the ThreadLocal variable
            Object value;
            Entry(ThreadLocal<?> k, Object v) {
                super(k);
                value = v;
            }
        }

        // Can be resized, but table.length must always be a power of two
        private Entry[] table;

        // Initial capacity of table
        private static final int INITIAL_CAPACITY = 16;

        // Number of Entries in table
        private int size = 0;

        // Resize threshold
        private int threshold; // Default to 0

        // Set the threshold; len is table.length: INITIAL_CAPACITY at initialization, the new table length after a resize
        private void setThreshold(int len) {
            threshold = len * 2 / 3;
        }

        // ... (remaining members omitted)
}

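ThreadLocalMap uses hashing. The hash function is the ThreadLocal object's threadLocalHashCode (examined closely later) taken modulo table.length, which gives the position i in table. Because table.length is always a power of two, the modulo is replaced by a cheaper bit operation, i = key.threadLocalHashCode & (table.length - 1). This trick is very common in the JDK. HashMap, for example, rounds a user-supplied initial capacity that is not a power of two up to the next power of two.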
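A quick way to convince yourself that the mask really is a modulo when the length is a power of two is to compare the two directly. The sketch below is my own illustration; SlotIndexDemo, slotByMask, slotByMod and agree are hypothetical names. Math.floorMod is used instead of % because the accumulated hash codes overflow and become negative.

```java
class SlotIndexDemo {
    /** Slot computed the way ThreadLocalMap does it: keep only the low bits. */
    static int slotByMask(int hash, int length) {
        return hash & (length - 1);
    }

    /** Slot computed as a true non-negative modulo. */
    static int slotByMod(int hash, int length) {
        return Math.floorMod(hash, length);
    }

    /** Checks whether both computations agree for 1000 accumulated hash codes. */
    static boolean agree(int length) {
        int hash = 0;
        for (int k = 0; k < 1000; k++) {
            hash += 0x61c88647; // wraps around on overflow, so negative values occur too
            if (slotByMask(hash, length) != slotByMod(hash, length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agree(16)); // true: 16 is a power of two
        System.out.println(agree(12)); // false: the mask trick breaks for other lengths
    }
}
```

This is why the table length must stay a power of two: for any other length the mask no longer selects a valid remainder.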

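Where there is hashing there are collisions. HashMap resolves them by chaining, storing colliding elements in a linked list. ThreadLocalMap uses open addressing: all elements are stored in the hash table itself, and linear probing determines the probe sequence. See the ThreadLocalMap code below and the figure in section 2.3: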

        // Linear probing for open addressing: step forward through the circular array, wrapping from table.length - 1 back to 0
        private static int nextIndex(int i, int len) {
            return ((i + 1 < len) ? i + 1 : 0);
        }
  
        // Step backward through the circular array, wrapping from 0 back to table.length - 1
        private static int prevIndex(int i, int len) {
            return ((i - 1 >= 0) ? i - 1 : len - 1);
        }
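To make the probe sequence concrete, here is a toy open-addressing table of my own, not ThreadLocalMap itself. It reuses the nextIndex logic shown above; the class LinearProbingDemo and its insert method are hypothetical, and the int key doubles as its own hash code.

```java
class LinearProbingDemo {
    // Same wrap-around step as ThreadLocalMap.nextIndex
    static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    /**
     * Stores key in the first free slot at or after its home slot.
     * Returns the index used, or -1 if the table is full.
     */
    static int insert(Integer[] table, int key) {
        int len = table.length;          // assumed to be a power of two
        int i = key & (len - 1);         // home slot
        for (int probes = 0; probes < len; probes++) {
            if (table[i] == null) {
                table[i] = key;
                return i;
            }
            i = nextIndex(i, len);       // collision: try the next slot
        }
        return -1;
    }

    public static void main(String[] args) {
        Integer[] table = new Integer[8];
        System.out.println(insert(table, 6));  // 6: home slot is free
        System.out.println(insert(table, 14)); // 7: home slot 6 is taken
        System.out.println(insert(table, 22)); // 0: slots 6 and 7 are taken, wrap around
    }
}
```

The three keys all hash to slot 6, and the later ones simply slide forward, wrapping past the end of the array.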

2.3 The role of weak references

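Another point in ThreadLocalMap that deserves attention is the weak reference in Entry. As we know, whether an object can be reclaimed in Java depends heavily on how it is referenced; from strongest to weakest the reference types are strong, soft, weak, and phantom. An object that is reachable only through weak references is reclaimed the next time the GC runs. If the storage structure held its keys through ordinary strong references, the lifetime of each Entry would be tied to the thread: as long as the thread is alive, the entry stays reachable in GC analysis and cannot be reclaimed. With a weak reference, once a ThreadLocal is no longer strongly reachable it is garbage collected, and the key of its Entry in the ThreadLocalMap becomes invalid (e.get() returns null), which lets ThreadLocalMap's built-in cleanup mechanism remove it. How the cleanup works will become clear when we read the source.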

At this point we can sketch the overall design of ThreadLocal. Here is a diagram I drew:

[Figure: ThreadLocal open addressing and weak references]

2.4 The magic number 0x61c88647

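Hashing usually takes two steps: compute a hash value with a hash function, then derive the slot from the hash value. Unlike HashMap, where the user supplies the hash function, every ThreadLocal instance is assigned a threadLocalHashCode when it is created, which can be regarded as the id of that instance. The value is obtained by adding 0x61c88647 to the threadLocalHashCode of the previously created ThreadLocal instance: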

// Updated atomically, starting from 0
private static AtomicInteger nextHashCode =
  new AtomicInteger();

// Successive threadLocalHashCode values are 0x61c88647 apart, which spreads the ids of successively created ThreadLocals fairly evenly over an array whose size is a power of two
private static final int HASH_INCREMENT = 0x61c88647;

private final int threadLocalHashCode = nextHashCode();

private static int nextHashCode() {
  return nextHashCode.getAndAdd(HASH_INCREMENT);
}

So why add 0x61c88647? Wouldn't assigning a random number to threadLocalHashCode work just as well?

Fibonacci Hashing

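There is not much material on Fibonacci hashing, so what follows is organized according to my own understanding. Let's start with the Fibonacci sequence and work step by step toward how the idea is applied to hashing: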

\[ F(n) = F(n-1) + F(n-2) \]

The resulting sequence is:

\[ 1 \quad 1 \quad 2 \quad 3 \quad 5 \quad 8 \quad 13 \quad 21 \quad 34 \quad 55 \quad 89 \quad \ldots \]

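Look at the sequence: \(5/8 = 0.625\), \(21/34 = 0.6176\ldots\), \(55/89 = 0.61797\ldots\). As \(n\) tends to infinity, the ratio of each term to the next approaches \(1/\phi = \frac{\sqrt{5} - 1}{2} \approx 0.618\) (I will not prove why it is \(\frac{\sqrt{5} - 1}{2}\) here; look it up if you are interested). Equivalently, the ratio of each term to the previous one approaches the golden ratio \(\phi \approx 1.618\). The golden ratio shows up everywhere: mathematics, physics, architecture, art, even music. But what does it have to do with the Fibonacci hashing we are discussing? Be patient; let's first look at a remarkable phenomenon in nature: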

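In nature, the leaves or petals of a plant all want as much sunlight as possible. How can they get it?

Right: by making the leaves on a stem, or the petals of a flower, overlap as little as possible. And that is exactly what plants do.

So how do they avoid overlapping? How should the leaves grow, one after another?

One on the left, one on the right, one on the left, one on the right... No good, now both sides overlap.

If a plant knew that a stem would grow exactly 8 leaves, it could simply space them evenly around the circle at \(360^\circ / 8\). But plants are probably not good at arithmetic, or arithmetic is too tiring, and there is no guarantee how many leaves a stem will grow. They still want all leaves to overlap as little as possible. What to do?

This is where \(\phi = 1.6180\ldots\) comes in again. What if the plant rotates by \(360^\circ/\phi \approx 222.5^\circ\) each time it grows a new leaf or petal?

[Figure: petal growth pattern]

Isn't nature's craftsmanship amazing?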

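Back to the topic. We want to spread ThreadLocal objects over \(2^N\) slots, and within the int range \(N\) is at most 32. In Java we can use a long to hold a 32-bit unsigned integer, whose range is \([0, 2^{32}-1]\); converted to a signed integer the range is \([-2^{31}, 2^{31}-1]\). If ThreadLocal ids are to live in the int range, what is \(2^{32} / \phi\), and what does it become as a signed integer?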
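The calculation can be sketched in a few lines. This is my own reconstruction; the class GoldenRatioDemo and its method names are hypothetical, and the only library call is Math.sqrt.

```java
class GoldenRatioDemo {
    /** 2^32 / phi, truncated, held in a long so it can exceed Integer.MAX_VALUE. */
    static long unsignedGolden() {
        // 1 / phi = (sqrt(5) - 1) / 2
        return (long) ((1L << 32) * (Math.sqrt(5) - 1) / 2);
    }

    /** The same 32 bits reinterpreted as a signed int. */
    static int signedGolden() {
        return (int) unsignedGolden();
    }

    public static void main(String[] args) {
        System.out.println(unsignedGolden());                      // 2654435769
        System.out.println(signedGolden());                        // -1640531527
        System.out.println(Integer.toHexString(-signedGolden()));  // 61c88647
    }
}
```

Negating the signed value gives exactly the constant used in ThreadLocal.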

Computing it gives 2654435769 for the unsigned 32-bit value, while the golden section as a signed 32-bit integer is -1640531527.

The magic number 0x61c88647 in ThreadLocal is 1640531527 in decimal. Does the sign difference between 1640531527 and -1640531527 matter?

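It does not. The ThreadLocal id keeps accumulating, and the sign only reverses the direction, just as rotating clockwise or counterclockwise by \(222.5^\circ\) both work in the petal figure above. We can also try other numbers for comparison: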

//    // Try other numbers and observe the result
//    private static final int HASH_INCREMENT = 0x61c88641;
//    private static final int HASH_INCREMENT = 0x61c88643;
//    private static final int HASH_INCREMENT = 0x61c88645;
//    private static final int HASH_INCREMENT = 0x61c88648;
//    private static final int HASH_INCREMENT = 0x61c88649;

    private static final int HASH_INCREMENT = 0x61c88647;
//    private static final int HASH_INCREMENT = -0x61c88647;

    public static void main(String[] args) {
        magicHash(16);  // initial size 16
        magicHash(128); // after 3 resizes
    }

    private static void magicHash(int size){
        int hashCode = 0;
        for(int i = 0; i < size; i++){
            hashCode = hashCode + HASH_INCREMENT;
            // Keep only the low bits selected by size and use them as the slot
            int slot = hashCode & (size-1);
            System.out.print(slot + " ");
        }
        System.out.println();
    }

With HASH_INCREMENT = 0x61c88647, the size-16 line of the output is:

7 14 5 12 3 10 1 8 15 6 13 4 11 2 9 0

With HASH_INCREMENT = -0x61c88647, the size-16 line of the output is:

9 2 11 4 13 6 15 8 1 10 3 12 5 14 7 0

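We can see that ids obtained by accumulating the magic number remain evenly distributed after truncation to the low bits, whereas random ids could not guarantee such a result. We can also see that the sign indeed only changes the direction of progression. 1640531527 mod 16 is 7, so the slots advance in steps of 7; 1640531527 mod 128 is 71, so the slots advance in steps of 71. We can also try other numbers as the magic number and look at the slot distribution. With HASH_INCREMENT = 0x61c88648, for example, slots repeat (for size 16 the output alternates 8 0 8 0 ...).

With HASH_INCREMENT = 0x61c88641 there are no repeats, but the slots are not scattered (for size 16 the output is simply 1 2 3 ... 15 0).

So compared with other numbers, 1640531527 yields slots that are both scattered and even, which is exactly what we want.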

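Online I found that an anonymous mathematics student had approached the question from another angle and proved that accumulating 1640531527 never maps two values to the same slot, which gives us a fresh perspective. The proof was handwritten and I borrowed it directly; my thanks to the blogger and to its anonymous author:

[Figure: handwritten proof]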

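Following the proof, suppose the \(i\)-th and \(j\)-th values generated by accumulating \(HASH\_INCREMENT = 1640531527\), with \(0 \le i < j < 2^N\), leave the same remainder \(r_1\) modulo \(2^N\), that is, \(i \cdot HASH\_INCREMENT = m_1 \cdot 2^N + r_1\) and \(j \cdot HASH\_INCREMENT = m_2 \cdot 2^N + r_1\). Subtracting gives \((j-i) \cdot HASH\_INCREMENT = 2^N \cdot (m_2 - m_1)\). Because \(HASH\_INCREMENT\) is odd, it shares no factor with \(2^N\), so \(2^N\) would have to divide \(j-i\); but \(0 < j-i < 2^N\), so the equation cannot hold. Hence within a range of \(2^N\) there are no two distinct \(i, j\) that satisfy the condition. If we switch to \(HASH\_INCREMENT = \mathtt{0x61c88648}\) from the program above, i.e. 1640531528, then \(j = 2, i = 0\) satisfies the equation for \(2^N = 16\): \(2 \cdot 1640531528 = 16 \cdot (m_2 - m_1)\) with \(m_2 - m_1 = 205066441\). This matches the program's output.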

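Since \(j-i\) can take any value in \((0, 2^N)\) and \(m_2 - m_1\) can be any integer, repeats exist whenever the magic number and \(2^N\) share a common divisor greater than 1. The argument extends to a table of size \(a^N\): as long as \(HASH\_INCREMENT\) and \(a^N\) are coprime, this hashing scheme produces a set of hash values that fills the entire table.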
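The claim is easy to check empirically by counting how many distinct slots the first size ids land in. A minimal sketch, with the hypothetical names IncrementCoverageDemo and distinctSlots:

```java
class IncrementCoverageDemo {
    /** Counts the distinct slots hit by the first `size` accumulated ids. */
    static int distinctSlots(int increment, int size) {
        boolean[] seen = new boolean[size]; // size is assumed to be a power of two
        int hash = 0;
        int count = 0;
        for (int k = 0; k < size; k++) {
            hash += increment;
            int slot = hash & (size - 1);
            if (!seen[slot]) {
                seen[slot] = true;
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(distinctSlots(0x61c88647, 16)); // 16: odd increment, every slot used once
        System.out.println(distinctSlots(0x61c88648, 16)); // 2: shares the factor 8 with 16
        System.out.println(distinctSlots(0x61c88641, 16)); // 16: odd, so no repeats either
    }
}
```

Note that 0x61c88641 also fills the table without repeats; being odd only guarantees the absence of collisions, while the golden-ratio value additionally scatters consecutive ids across the table.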

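That is my personal understanding of this magic number. At least it convinces me, and convincing yourself is a prerequisite for convincing others.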

3. ThreadLocalMap source code walkthrough

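With section 2 as groundwork, let's dig into the implementation of several important ThreadLocalMap methods. There are countless articles online analyzing this code, so skip ahead if you are already familiar with it.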

3.1 The getEntry method

        /**
         * Get the entry associated with key.  This method
         * itself handles only the fast path: a direct hit of existing
         * key. It otherwise relays to getEntryAfterMiss.  This is
         * designed to maximize performance for direct hits, in part
         * by making this method readily inlinable.
         *
         * @param  key the thread local object
         * @return the entry associated with key, or null if no such
         */
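        /**
         * The comment above says this method is "readily inlinable". In Java this roughly means
         * that the JVM skips the normal method call (push a stack frame, jump to the method, jump
         * back, pop the frame) and expands the method body in place of the call, which removes
         * the call overhead. Inlining a large method body, however, bloats the compiled code.
         * Assuming direct hits are likely, getEntry is called often while getEntryAfterMiss is
         * called rarely, so keeping getEntry short makes it easy for the JVM to inline. Compare
         * this with set(), which does not optimize for inlining.
         */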
        private Entry getEntry(ThreadLocal<?> key) {
            // Keep the low bits of threadLocalHashCode (equivalent to a modulo); the slot index i is still fairly evenly distributed
            int i = key.threadLocalHashCode & (table.length - 1);
            Entry e = table[i];
            // The Entry exists and is valid, and its weak reference points to the ThreadLocal we want: return it
            if (e != null && e.get() == key)
                return e;
            else
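                /* With open addressing and linear probing the search continues, and the target key
                 * may still be found. Three cases reach this point: e exists and is valid but refers
                 * to a different key; e exists but is stale, i.e. e.get() is null; or e is null.
                 */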
                return getEntryAfterMiss(key, i, e);
        }

        /**
         * Version of getEntry method for use when key is not found in
         * its direct hash slot.
         *
         * @param  key the thread local object
         * @param  i the table index for key's hash code
         * @param  e the entry at table[i]
         * @return the entry associated with key, or null if no such
         */
        private Entry getEntryAfterMiss(ThreadLocal<?> key, int i, Entry e) {
            Entry[] tab = table;
            int len = tab.length;
            // Linear probing: keep scanning forward until an empty slot is reached
            while (e != null) {
                ThreadLocal<?> k = e.get();
                // Found the target, return it
                if (k == key)
                    return e;
                if (k == null)
                    /* Stale Entry whose ThreadLocal has been collected: starting at slot i and up to
                     * the next empty slot, clean out every stale Entry in this run
                     */
                    expungeStaleEntry(i);
                else
                    i = nextIndex(i, len);
                e = tab[i];
            }
            return null;
        }

        /**
         * Expunge a stale entry by rehashing any possibly colliding entries
         * lying between staleSlot and the next null slot.  This also expunges
         * any other stale entries encountered before the trailing null.  See
         * Knuth, Section 6.4
         *
         * @param staleSlot index of slot known to have null key
         * @return the index of the next null slot after staleSlot
         * (all between staleSlot and this slot will have been checked
         * for expunging).
         */
        // Tidy the Entries between staleSlot and the first null slot: remove stale Entries and rehash valid ones
        private int expungeStaleEntry(int staleSlot) {
            Entry[] tab = table;
            int len = tab.length;

            // expunge entry at staleSlot
            // The Entry at staleSlot is stale, clear it directly
            tab[staleSlot].value = null;
            tab[staleSlot] = null;
            size--;

            // Rehash until we encounter null
            Entry e;
            int i;
            for (i = nextIndex(staleSlot, len); (e = tab[i]) != null; i = nextIndex(i, len)) {
                ThreadLocal<?> k = e.get();
                if (k == null) {
                    // Stale Entry -> clear it
                    e.value = null;
                    tab[i] = null;
                    size--;
                } else {
                    /* Earlier clearing may have set slots to null and broken the run, so a valid
                     * Entry is rehashed. If its home slot h differs from i, probe linearly from h
                     * until the first null slot and move the Entry there.
                     */
                    int h = k.threadLocalHashCode & (len - 1);
                    if (h != i) {
                        tab[i] = null;

                        // Unlike Knuth 6.4 Algorithm R, we must scan until
                        // null because multiple entries could have been stale.
                        while (tab[h] != null)
                            h = nextIndex(h, len);
                        tab[h] = e;
                    }
                }
            }
            // Return the index of the first empty slot after staleSlot
            return i;
        }

3.2 The set method

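The original Javadoc describes these methods clearly, so I quote it directly. While we are at it, a plug for reading books in the original: if a book was written by a European or American master, I recommend the original or a reprint edition, because the masters usually write well and in plain language, and translation tends to make the text obscure. Now let's continue with part of the code for set: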

        /**
         * Set the value associated with key.
         *
         * @param key the thread local object
         * @param value the value to be set
         */
        private void set(ThreadLocal<?> key, Object value) {

            // We don't use a fast path as with get() because it is at
            // least as common to use set() to create new entries as
            // it is to replace existing ones, in which case, a fast
            // path would fail more often than not.
            // No fast path for direct hits here, because creating a new Entry is considered about as likely as replacing an existing one

            Entry[] tab = table;
            int len = tab.length;
            int i = key.threadLocalHashCode & (len-1);
						
            // Linear probing until an empty slot is reached
            for (Entry e = tab[i]; e != null; e = tab[i = nextIndex(i, len)]) {
                ThreadLocal<?> k = e.get();
                // Found the matching entry, replace its value
                if (k == key) {
                    e.value = value;
                    return;
                }
								
                // Replace the stale entry
                if (k == null) {
                    replaceStaleEntry(key, value, i);
                    return;
                }
            }
						
            // Neither a matching entry nor a stale entry was found: put a new entry in the null slot
            tab[i] = new Entry(key, value);
            int sz = ++size;
            // If no stale entry was cleaned and size has reached the threshold, rehash
            if (!cleanSomeSlots(i, sz) && sz >= threshold)
                rehash();
        }

        /**
         * Replace a stale entry encountered during a set operation
         * with an entry for the specified key.  The value passed in
         * the value parameter is stored in the entry, whether or not
         * an entry already exists for the specified key.
         *
         * As a side effect, this method expunges all stale entries in the
         * "run" containing the stale entry.  (A run is a sequence of entries
         * between two null slots.)
         *
         * @param  key the key
         * @param  value the value to be associated with key
         * @param  staleSlot index of the first stale entry encountered while
         *         searching for key.
         */
        private void replaceStaleEntry(ThreadLocal<?> key, Object value,
                                       int staleSlot) {
            Entry[] tab = table;
            int len = tab.length;
            Entry e;

            // Back up to check for prior stale entry in current run.
            // We clean out whole runs at a time to avoid continual
            // incremental rehashing due to garbage collector freeing
            // up refs in bunches (i.e., whenever the collector runs).
            // Scan backward for the earliest stale slot in the run
            int slotToExpunge = staleSlot;
            for (int i = prevIndex(staleSlot, len); (e = tab[i]) != null;i = prevIndex(i, len))
                if (e.get() == null)
                    slotToExpunge = i;

            // Find either the key or trailing null slot of run, whichever
            // occurs first
            // Scan forward
            for (int i = nextIndex(staleSlot, len); 
                 (e = tab[i]) != null;
                 i = nextIndex(i, len)) {
                ThreadLocal<?> k = e.get();

                // If we find key, then we need to swap it
                // with the stale entry to maintain hash table order.
                // The newly stale slot, or any other stale slot
                // encountered above it, can then be sent to expungeStaleEntry
                // to remove or rehash all of the other entries in run.
                // Found the key, swap it with the stale slot
                if (k == key) {
                    // Update the value of the matching entry
                    e.value = value;

                    tab[i] = tab[staleSlot];
                    tab[staleSlot] = e;

                    // Start expunge at preceding stale entry if it exists
                    // Start cleaning at the earliest stale entry found so far (by the backward scan or this forward scan); otherwise start at the current i
                    if (slotToExpunge == staleSlot)
                        slotToExpunge = i;
                    // Expunge the run starting at slotToExpunge, then do a heuristic cleanup
                    cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
                    return;
                }

                // If we didn't find stale entry on backward scan, the
                // first stale entry seen while scanning for key is the
                // first still present in the run.
                // If the current entry is stale and the backward scan found no stale entry, move slotToExpunge to the current position
                if (k == null && slotToExpunge == staleSlot)
                    slotToExpunge = i;
            }

            // If key not found, put new entry in stale slot
            // If the key is not in the table, put the new entry in the stale slot
            tab[staleSlot].value = null;
            tab[staleSlot] = new Entry(key, value);

            // If there are any other stale entries in run, expunge them
            // If any other stale entry was seen while probing, clean up (run expunge plus heuristic cleanup)
            if (slotToExpunge != staleSlot)
                cleanSomeSlots(expungeStaleEntry(slotToExpunge), len);
        }

        /**
         * Heuristically scan some cells looking for stale entries.
         * This is invoked when either a new element is added, or
         * another stale one has been expunged. It performs a
         * logarithmic number of scans, as a balance between no
         * scanning (fast but retains garbage) and a number of scans
         * proportional to number of elements, that would find all
         * garbage but would cause some insertions to take O(n) time.
         *
         * @param i a position known NOT to hold a stale entry. The
         * scan starts at the element after i.
         *
         * @param n scan control: {@code log2(n)} cells are scanned,
         * unless a stale entry is found, in which case
         * {@code log2(table.length)-1} additional cells are scanned.
         * When called from insertions, this parameter is the number
         * of elements, but when from replaceStaleEntry, it is the
         * table length. (Note: all this could be changed to be either
         * more or less aggressive by weighting n instead of just
         * using straight log n. But this version is simple, fast, and
         * seems to work well.)
         *
         * @return true if any stale entries have been removed.
         */
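        /**
         * Heuristically clean slots.
         * The entry at i is known not to be stale (its ThreadLocal has not been collected, or the slot is empty).
         * n controls the number of scans.
         * Normally, if about log2(n) scans find no stale slot, the method returns.
         * If a stale slot is found, n is reset to the table length len, the run is expunged,
         * and scanning continues from the next empty slot.
         */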
        private boolean cleanSomeSlots(int i, int n) {
            boolean removed = false;
            Entry[] tab = table;
            int len = tab.length;
            do {
                // i itself is never a stale slot, so start checking from the next one
                i = nextIndex(i, len);
                Entry e = tab[i];
                if (e != null && e.get() == null) {
                    // Widen the scan budget
                    n = len;
                    removed = true;
                    // Expunge one run
                    i = expungeStaleEntry(i);
                }
            } while ( (n >>>= 1) != 0);
            return removed;
        }
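To get a feel for how many cells the heuristic scan visits when no stale entry turns up, the loop skeleton can be isolated. The sketch below is my own; ScanBudgetDemo and scanCount are hypothetical names, and the loop body is reduced to a counter so that only the do/while control from cleanSomeSlots remains.

```java
class ScanBudgetDemo {
    /** Number of iterations of cleanSomeSlots' loop when nothing stale is found. */
    static int scanCount(int n) {
        int count = 0;
        do {
            count++;                 // one cell would be inspected here
        } while ((n >>>= 1) != 0);   // halve the budget after every cell
        return count;
    }

    public static void main(String[] args) {
        System.out.println(scanCount(1));  // 1
        System.out.println(scanCount(10)); // 4
        System.out.println(scanCount(16)); // 5
    }
}
```

So the scan is logarithmic in n, which is what keeps an insertion cheap even in a large table.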

        /**
         * Re-pack and/or re-size the table. First scan the entire
         * table removing stale entries. If this doesn't sufficiently
         * shrink the size of the table, double the table size.
         */
        private void rehash() {
            // Do a full cleanup
            expungeStaleEntries();

            // Use lower threshold for doubling to avoid hysteresis
            /*
             * Use a lower threshold to decide whether to resize.
             * threshold is len * 2 / 3, so threshold - threshold / 4 is roughly len / 2.
             */
            if (size >= threshold - threshold / 4)
                // Resize: the capacity len must stay a power of two, so resizing doubles it
                resize();
        }
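The arithmetic behind the two thresholds is easy to tabulate. This is a sketch with hypothetical names (ThresholdDemo, threshold, resizeTrigger); the formulas are copied from setThreshold and rehash above, with integer division as in the original.

```java
class ThresholdDemo {
    /** Same formula as setThreshold: two thirds of the table length. */
    static int threshold(int len) {
        return len * 2 / 3;
    }

    /** The lowered bound that rehash() compares size against after the full cleanup. */
    static int resizeTrigger(int len) {
        int t = threshold(len);
        return t - t / 4;
    }

    public static void main(String[] args) {
        for (int len = 16; len <= 128; len *= 2) {
            System.out.println(len + " -> threshold " + threshold(len)
                    + ", resize at " + resizeTrigger(len));
        }
        // 16 -> threshold 10, resize at 8
        // 32 -> threshold 21, resize at 16
        // 64 -> threshold 42, resize at 32
        // 128 -> threshold 85, resize at 64
    }
}
```

For these table sizes the resize bound works out to exactly half the table length.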

3.3 The remove method

/**
 * Remove the entry for key.
 */
private void remove(ThreadLocal<?> key) {
    Entry[] tab = table;
    int len = tab.length;
    int i = key.threadLocalHashCode & (len-1);
    for (Entry e = tab[i];
         e != null;
         e = tab[i = nextIndex(i, len)]) {
        if (e.get() == key) {
          // Explicitly clear the weak reference
            e.clear();
          // Expunge the run
            expungeStaleEntry(i);
            return;
        }
    }
}

4. Best practices and memory leaks

The best practice for ThreadLocal is simple:

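When using ThreadLocal, call remove() explicitly to clear the data in the Entry. A very common scenario: on a thread-per-request server such as Tomcat, the code wraps the web API in an aspect, stores user information such as the user name before the join point method runs, and explicitly calls remove() after the join point method finishes.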
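A minimal sketch of that pattern, without any web framework: the class UserContext and its methods are hypothetical, and handle stands in for the code an aspect would run around the join point. The key point is that remove() sits in a finally block so that it runs even when the wrapped code throws.

```java
class UserContext {
    private static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Returns the user bound to the current thread, or null if none is bound. */
    static String currentUser() {
        return CURRENT_USER.get();
    }

    /** Binds the user for the duration of one request and always unbinds afterwards. */
    static String handle(String user) {
        CURRENT_USER.set(user);              // before the join point
        try {
            return "hello " + currentUser(); // the request handler reads the context
        } finally {
            CURRENT_USER.remove();           // after the join point, even on exceptions
        }
    }

    public static void main(String[] args) {
        System.out.println(handle("alice"));  // hello alice
        System.out.println(currentUser());    // null: nothing leaks to the next request
    }
}
```

With pooled threads this matters twice over: it avoids retaining the value, and it prevents the next request on the same thread from seeing the previous user's data.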

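What if remove() is not called explicitly? ThreadLocal has its own cleanup mechanism: after a GC, keys that were only weakly reachable are cleared, and later get and set calls on ThreadLocals from the same thread are quite likely to clean up the stale entries along the way, dropping the strong reference to the value so that a large object held there can be collected. If threads are not reused and are discarded after use, there is no problem either. But if threads are reused, as in a thread pool, a thread can live for a long time; if it never calls get or set again, the objects in value are retained indefinitely, which can be regarded as a memory leak. One more important point is to use ThreadLocal correctly, which of course applies to every tool. I came across the following example online [2]: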

public class ThreadLocalTest {
    // OOM fix 1
    // private static final ThreadLocal<Value> threadLocalPart = new ThreadLocal<Value>();
    ThreadLocal<Value> threadLocalPart = new ThreadLocal<Value>();

    // OOM fix 2
    // static class Value {
    class Value {
        final int i;
        Value(int i){
            this.i = i;
        }
    }
 
    ThreadLocalTest setThreadVal(int i){
        threadLocalPart.set(new Value(i));
        return this;
    }
 
    int getThreadVal(){
        return threadLocalPart.get().i;
    }
 
    public static void main(String[] args) {
        int sum = 0;
        for(int i = -500000000;i<=500000000;i++){
            sum+= new ThreadLocalTest().setThreadVal(i).getThreadVal();
        }
        System.out.println(sum);
    }
}

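This code causes an OOM, and it is reproducible. Quoting the explanation from blog [2]: a non-static inner class holds a reference to the instance of the outer class that created it. Here the entry's value is an instance of a non-static inner class, so it references its outer instance (ThreadLocalTest), and the outer instance in turn holds a strong reference to the ThreadLocal. The key therefore cannot be collected, which means the value cannot be collected either, and eventually the ThreadLocalMap fills up.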

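The fix is to break the strong reference that keeps the key reachable. The code above marks two fixes in comments: declare threadLocalPart as a static final field, or declare Value as a static nested class.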

5. Conclusion

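This post summarizes some important points about ThreadLocal. I consulted many well-written blog posts, some of which are listed under Reference, and I thank their authors again. I do not claim to have written anything better; I have merely resolved a few of my own questions and organized the material in my own way. Corrections are welcome. The idea of using thread-local variables to avoid contention between threads is very common; memory management in Netty, which borrows ideas from jemalloc, is one example, and it is something I would like to write about later.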

Reference

[1] http://www.jeepxie.net/article/49661.html

[2] https://blog.csdn.net/zys890523/article/details/84385472

[3] https://probablydance.com/2018/06/16/fibonacci-hashing-the-optimization-that-the-world-forgot-or-a-better-alternative-to-integer-modulo/

[4] https://www.gohired.in/2018/07/31/fibonacci-hashing-fastest-hashtable/

[5] https://www.cnblogs.com/micrari/p/6790229.html#

Original article: https://www.cnblogs.com/withwhom/p/13372100.html