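Note: the source code discussed below is from JDK 1.8.0_212.
1 The concept of references
Java data types fall into two groups:
Primitive types: byte, short, int, long, float, double, char and boolean, eight in total.
Reference types: the wrapper classes of the primitive types above and all other object types, such as Integer and Object.
Depending on context, "reference" may mean a reference type or a variable of a reference type.
Before JDK 1.2 the definition of a reference in Java was the traditional one: if the value stored in a reference-typed datum is the start address of another block of memory, that datum is said to be a reference to that memory (that object). Under this definition an object is either referenced or not referenced, and user code has no way to do extra work (such as releasing off-heap memory) after the GC has reclaimed an object.
In practice we would like a class of objects that stay in memory while memory is plentiful, but can be discarded if memory is still tight after a garbage collection. Many caching scenarios fit this behavior.
Starting with JDK 1.2 the concept was extended and references were divided into Strong Reference, Soft Reference, Weak Reference, Phantom Reference and Final Reference. A strong reference is the pre-1.2 kind of reference, and the vast majority of references in everyday code are strong. The other kinds were introduced in JDK 1.2; with them user code can achieve effects such as "notice that an object has been reclaimed and do some extra work".
The newly introduced references are subclasses of java.lang.ref.Reference (strong references are not). The hierarchy is: SoftReference, WeakReference, PhantomReference and FinalReference extend Reference; sun.misc.Cleaner extends PhantomReference, and Finalizer extends FinalReference.
Notes:
A strong reference has no corresponding class; strong references are simply everywhere, e.g. Object object = new Object(); .
Extending java.lang.ref.Reference directly to define a custom reference type does not work (its constructors are package-private, and the class Javadoc says it may not be subclassed directly), but the existing reference types can be extended; sun.misc.Cleaner, for example, extends java.lang.ref.PhantomReference.
FinalReference is package-private, so neither FinalReference nor Finalizer can be used directly in user code; their instances are created by the JVM.
Cleaner lives in the sun.misc package, not in java.lang.ref.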
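2 Creating references
A directly declared reference variable is a strong reference: in Integer a = new Integer(1) , a is a strong reference.
The other kinds of reference are created through a Reference subclass, e.g. WeakReference<Integer> wr = new WeakReference<>(a); . Here wr itself is still a strong reference; the weak reference is the referent member inside the object that wr points to.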
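The distinction between the strong variable and the weak referent can be illustrated with a minimal sketch. The class name WeakRefCreation and the helper weakly are hypothetical names chosen for this illustration, and clear() is called by hand only to imitate what the GC does once the target is no longer strongly reachable.

```java
import java.lang.ref.WeakReference;

class WeakRefCreation {
    // Hypothetical helper for this sketch: wraps the target in a WeakReference.
    static <T> WeakReference<T> weakly(T target) {
        return new WeakReference<>(target);
    }

    public static void main(String[] args) {
        Object a = new Object();               // "a" is a strong reference
        WeakReference<Object> wr = weakly(a);  // "wr" is a strong reference to the WeakReference object
        // The weak link is the referent field inside the WeakReference object.
        System.out.println(wr.get() == a);     // true: a is still strongly reachable here
        // Drop the weak link by hand; the GC clears it itself once the
        // target is no longer strongly (or softly) reachable.
        wr.clear();
        System.out.println(wr.get());          // null
    }
}
```

A fresh new Object() is used instead of Integer.valueOf(...), because small boxed values come from a shared cache and would stay strongly reachable.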
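3 Reachability and reclamation of referenced objects
Objects reference one another and form reference chains. The references rank in strength as strong > soft > weak > phantom; final references, which the JVM uses internally for finalization, are processed after weak references and before phantom references. If a stronger reference relationship exists, the reachability of the chain is determined by it. Reachability is therefore classified as:
Strongly reachable: a chain of strong references exists between the object and a GC root, e.g. A and B in the figure below.
Softly reachable: no strong chain exists between the object and a GC root, but a soft chain does, e.g. E in the figure below.
Weakly reachable: no strong or soft chain exists between the object and a GC root, but a weak chain does, e.g. F, C and D in the figure below.
Phantom reachable: no strong, soft or weak chain exists between the object and a GC root, but a phantom chain does, e.g. G in the figure below.
Unreachable: no reference chain of any kind exists between the object and a GC root, e.g. H and I in the figure below.
An object held by a strong reference (almost every reference-typed variable in everyday code) is not reclaimed. Strictly speaking this is not absolute: the referent field inside a soft, weak or phantom reference object is declared like an ordinary strong field, but that case is usually excluded when talking about strong references. Objects held only by the other kinds of reference may be reclaimed.
Soft reference: a softly referenced object is reclaimed before the system would run out of memory (OOM). Example: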
    // VM options: -Xmx4m -Xms4m
    import java.lang.ref.ReferenceQueue;
    import java.lang.ref.SoftReference;

    public class SoftReferenceMain {

        public static void main(String[] args) throws Exception {
            ReferenceQueue<SoftReferenceObject> queue = new ReferenceQueue<>();
            SoftReferenceObject object = new SoftReferenceObject();
            SoftReference<SoftReferenceObject> reference = new SoftReference<>(object, queue);
            // drop the only strong reference, leaving the object softly reachable
            object = null;
            System.gc();
            Thread.sleep(500);
            System.out.println(reference.get());
        }

        private static class SoftReferenceObject {

            // roughly 480 KB, large relative to the 4 MB heap
            int[] array = new int[120_000];

            @Override
            public String toString() {
                return "SoftReferenceObject";
            }
        }
    }
    // Output reported for this run: the softly referenced object was reclaimed after the GC
    null
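Weak reference: an object that is only weakly referenced survives only until the next garbage collection. In short, once a GC runs, the weakly referenced object is reclaimed whether or not memory is sufficient. When exactly it is reclaimed cannot be predicted, because that depends on when the GC runs. Example: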
    import java.lang.ref.ReferenceQueue;
    import java.lang.ref.WeakReference;

    class WeakReferenceMain {

        public static void main(String[] args) throws Exception {
            ReferenceQueue<WeakReferenceObject> queue = new ReferenceQueue<>();
            WeakReferenceObject object = new WeakReferenceObject();
            System.out.println(object);
            WeakReference<WeakReferenceObject> reference = new WeakReference<>(object, queue);
            // drop the only strong reference
            object = null;
            Thread.sleep(500);
            // no GC has been requested yet, so the referent is normally still there
            System.out.println(reference.get());
            System.gc();
            Thread.sleep(500);
            System.out.println(reference.get());
        }

        private static class WeakReferenceObject {

            @Override
            public String toString() {
                return "WeakReferenceObject";
            }
        }
    }

    // Output reported for this run
    WeakReferenceObject
    WeakReferenceObject
    null
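Phantom reference: an object held only by phantom references will certainly be reclaimed. Whether an object has a phantom reference has no effect at all on its lifetime, and the object cannot be obtained through a phantom reference (PhantomReference overrides Reference#get() and always returns null). The only purpose of attaching a phantom reference to an object is to receive a system notification when the object is garbage collected. The implementation used inside the JDK is sun.misc.Cleaner. For an example see DirectByteBuffer later in this article.
Final reference: the JVM wraps an object in this kind of reference when the object's class overrides Object#finalize; the wrapping happens when the object is created, and the JDK implementation is Finalizer. When such an object is no longer otherwise reachable and its finalize method has not been called yet, it is not reclaimed immediately: finalize is run first, and if the method makes the object reachable again (for example by assigning this to a class variable or to an instance variable of a live object), the object survives; otherwise it is reclaimed. The JVM calls finalize at most once per object.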
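Summary:
Strong reference: the object is not reclaimed while the reference exists.
Soft reference: the object is reclaimed before the JVM would throw an OOM.
Weak reference: the object is reclaimed at the next GC.
Phantom reference: no effect on the object's lifetime; it only delivers a notification when the object is reclaimed.
Final reference: created by the JVM for objects that override finalize; finalize is called at most once.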
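4 Core principles of Reference
(Note: below, "Reference type" and "reference type" mean the types of the various Reference subclasses, so strong references are not included.)
The newly introduced reference kinds are created through Reference subclasses. An instance of such a subclass is itself an object, and it holds a reference to the actual object. To avoid ambiguity, the terms are fixed as follows:
"Reference object" means the object that a variable of type Reference (or a subclass) points to directly, such as the object that wr above points to.
"Referent object" means the object a Reference variable ultimately leads to, such as a above. The object wr points to holds a reference to the actual object, which can be obtained with wr.get().
4.1 Motivation
The main purpose of the new reference types is to let user code know which objects the GC has reclaimed and to do extra processing when it learns that an object was reclaimed. This article calls it "object-reclamation awareness".
4.2 Related classes
Reference: an abstract class. The soft, weak and phantom references added in JDK 1.2 are all subclasses of it. The data structure of these reference types and the core logic that implements the motivation above live in this abstract class. Relevant source: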
    /**
     * Abstract base class for reference objects.  This class defines the
     * operations common to all reference objects.  Because reference objects are
     * implemented in close cooperation with the garbage collector, this class may
     * not be subclassed directly.
     *
     * @author   Mark Reinhold
     * @since    1.2
     */

    public abstract class Reference<T> {

        /* A Reference instance is in one of four possible internal states:
         *
         *     Active: Subject to special treatment by the garbage collector.  Some
         *     time after the collector detects that the reachability of the
         *     referent has changed to the appropriate state, it changes the
         *     instance's state to either Pending or Inactive, depending upon
         *     whether or not the instance was registered with a queue when it was
         *     created.  In the former case it also adds the instance to the
         *     pending-Reference list.  Newly-created instances are Active.
         *
         *     Pending: An element of the pending-Reference list, waiting to be
         *     enqueued by the Reference-handler thread.  Unregistered instances
         *     are never in this state.
         *
         *     Enqueued: An element of the queue with which the instance was
         *     registered when it was created.  When an instance is removed from
         *     its ReferenceQueue, it is made Inactive.  Unregistered instances are
         *     never in this state.
         *
         *     Inactive: Nothing more to do.  Once an instance becomes Inactive its
         *     state will never change again.
         *
         * The state is encoded in the queue and next fields as follows:
         *
         *     Active: queue = ReferenceQueue with which instance is registered, or
         *     ReferenceQueue.NULL if it was not registered with a queue; next =
         *     null.
         *
         *     Pending: queue = ReferenceQueue with which instance is registered;
         *     next = this
         *
         *     Enqueued: queue = ReferenceQueue.ENQUEUED; next = Following instance
         *     in queue, or this if at end of list.
         *
         *     Inactive: queue = ReferenceQueue.NULL; next = this.
         *
         * With this scheme the collector need only examine the next field in order
         * to determine whether a Reference instance requires special treatment: If
         * the next field is null then the instance is active; if it is non-null,
         * then the collector should treat the instance normally.
         *
         * To ensure that a concurrent collector can discover active Reference
         * objects without interfering with application threads that may apply
         * the enqueue() method to those objects, collectors should link
         * discovered objects through the discovered field. The discovered
         * field is also used for linking Reference objects in the pending list.
         */

        private T referent;         /* Treated specially by GC */

        volatile ReferenceQueue<? super T> queue;

        /* When active:   NULL
         *     pending:   this
         *    Enqueued:   next reference in queue (or this if last)
         *    Inactive:   this
         */
        @SuppressWarnings("rawtypes")
        volatile Reference next;

        /* When active:   next element in a discovered reference list maintained by GC (or this if last)
         *     pending:   next element in the pending list (or null if last)
         *   otherwise:   NULL
         */
        transient private Reference<T> discovered;  /* used by VM */


        /* Object used to synchronize with the garbage collector.  The collector
         * must acquire this lock at the beginning of each collection cycle.  It is
         * therefore critical that any code holding this lock complete as quickly
         * as possible, allocate no new objects, and avoid calling user code.
         */
        static private class Lock { }
        private static Lock lock = new Lock();


        /* List of References waiting to be enqueued.  The collector adds
         * References to this list, while the Reference-handler thread removes
         * them.  This list is protected by the above lock object. The
         * list uses the discovered field to link its elements.
         */
        private static Reference<Object> pending = null;

        /* High-priority thread to enqueue pending References
         */
        private static class ReferenceHandler extends Thread {

            private static void ensureClassInitialized(Class<?> clazz) {
                try {
                    Class.forName(clazz.getName(), true, clazz.getClassLoader());
                } catch (ClassNotFoundException e) {
                    throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
                }
            }

            static {
                // pre-load and initialize InterruptedException and Cleaner classes
                // so that we don't get into trouble later in the run loop if there's
                // memory shortage while loading/initializing them lazily.
                ensureClassInitialized(InterruptedException.class);
                ensureClassInitialized(Cleaner.class);
            }

            ReferenceHandler(ThreadGroup g, String name) {
                super(g, name);
            }

            public void run() {
                while (true) {
                    tryHandlePending(true);
                }
            }
        }

        /**
         * Try handle pending {@link Reference} if there is one.<p>
         * Return {@code true} as a hint that there might be another
         * {@link Reference} pending or {@code false} when there are no more pending
         * {@link Reference}s at the moment and the program can do some other
         * useful work instead of looping.
         *
         * @param waitForNotify if {@code true} and there was no pending
         *                      {@link Reference}, wait until notified from VM
         *                      or interrupted; if {@code false}, return immediately
         *                      when there is no pending {@link Reference}.
         * @return {@code true} if there was a {@link Reference} pending and it
         *         was processed, or we waited for notification and either got it
         *         or thread was interrupted before being notified;
         *         {@code false} otherwise.
         */
        static boolean tryHandlePending(boolean waitForNotify) {
            Reference<Object> r;
            Cleaner c;
            try {
                synchronized (lock) {
                    if (pending != null) {
                        r = pending;
                        // 'instanceof' might throw OutOfMemoryError sometimes
                        // so do this before un-linking 'r' from the 'pending' chain...
                        c = r instanceof Cleaner ? (Cleaner) r : null;
                        // unlink 'r' from 'pending' chain
                        pending = r.discovered;
                        r.discovered = null;
                    } else {
                        // The waiting on the lock may cause an OutOfMemoryError
                        // because it may try to allocate exception objects.
                        if (waitForNotify) {
                            lock.wait();
                        }
                        // retry if waited
                        return waitForNotify;
                    }
                }
            } catch (OutOfMemoryError x) {
                // Give other threads CPU time so they hopefully drop some live references
                // and GC reclaims some space.
                // Also prevent CPU intensive spinning in case 'r instanceof Cleaner' above
                // persistently throws OOME for some time...
                Thread.yield();
                // retry
                return true;
            } catch (InterruptedException x) {
                // retry
                return true;
            }

            // Fast path for cleaners
            if (c != null) {
                c.clean();
                return true;
            }

            ReferenceQueue<? super Object> q = r.queue;
            if (q != ReferenceQueue.NULL) q.enqueue(r);
            return true;
        }

        static {
            ThreadGroup tg = Thread.currentThread().getThreadGroup();
            for (ThreadGroup tgn = tg;
                 tgn != null;
                 tg = tgn, tgn = tg.getParent());
            Thread handler = new ReferenceHandler(tg, "Reference Handler");
            /* If there were a special system-only priority greater than
             * MAX_PRIORITY, it would be used here
             */
            handler.setPriority(Thread.MAX_PRIORITY);
            handler.setDaemon(true);
            handler.start();

            // provide access in SharedSecrets
            SharedSecrets.setJavaLangRefAccess(new JavaLangRefAccess() {
                @Override
                public boolean tryHandlePendingReference() {
                    return tryHandlePending(false);
                }
            });
        }

        /* -- Referent accessor and setters -- */

        /**
         * Returns this reference object's referent.  If this reference object has
         * been cleared, either by the program or by the garbage collector, then
         * this method returns <code>null</code>.
         *
         * @return   The object to which this reference refers, or
         *           <code>null</code> if this reference object has been cleared
         */
        public T get() {
            return this.referent;
        }

        /**
         * Clears this reference object.  Invoking this method will not cause this
         * object to be enqueued.
         *
         * <p> This method is invoked only by Java code; when the garbage collector
         * clears references it does so directly, without invoking this method.
         */
        public void clear() {
            this.referent = null;
        }


        /* -- Queue operations -- */

        /**
         * Tells whether or not this reference object has been enqueued, either by
         * the program or by the garbage collector.  If this reference object was
         * not registered with a queue when it was created, then this method will
         * always return <code>false</code>.
         *
         * @return   <code>true</code> if and only if this reference object has
         *           been enqueued
         */
        public boolean isEnqueued() {
            return (this.queue == ReferenceQueue.ENQUEUED);
        }

        /**
         * Adds this reference object to the queue with which it is registered,
         * if any.
         *
         * <p> This method is invoked only by Java code; when the garbage collector
         * enqueues references it does so directly, without invoking this method.
         *
         * @return   <code>true</code> if this reference object was successfully
         *           enqueued; <code>false</code> if it was already enqueued or if
         *           it was not registered with a queue when it was created
         */
        public boolean enqueue() {
            return this.queue.enqueue(this);
        }


        /* -- Constructors -- */

        Reference(T referent) {
            this(referent, null);
        }

        Reference(T referent, ReferenceQueue<? super T> queue) {
            this.referent = referent;
            this.queue = (queue == null) ? ReferenceQueue.NULL : queue;
        }

    }
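Reference has two key instance fields: referent and queue. The former is the object this Reference object ultimately refers to; the latter is a queue (described below). There are two constructors, one taking only referent and one that also takes queue. It follows that the queue is supplied by whoever creates the Reference object, and that one object can be associated with several different Reference objects.
Reference also has a key class variable: private static Reference<Object> pending. The GC is responsible for assigning it: when a referent is reclaimed, the GC links the corresponding Reference object into pending. The ReferenceHandler described below takes values from pending for further processing, so pending is the medium through which GC logic and Java-level code interact. (Like ReferenceHandler below, pending is static content of the Reference class and has little to do with the data structure of a Reference instance; perhaps it should not be coupled into Reference but split into a separate class?)
ReferenceHandler: a static nested class of Reference and a subclass of Thread.
A static initializer in Reference starts this thread; it has the highest priority and is a daemon thread. So in any Java program this thread exists once the Reference class has been loaded and initialized by the JVM.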
What the thread does: it takes the Reference objects whose referents the GC has reclaimed (from the static pending field of Reference, into which the JVM links them during collection) and either calls clean on them (the Cleaner case, i.e. phantom references) or puts them into their ReferenceQueue (the soft and weak reference case). The relevant source is the static initializer and the tryHandlePending method in the Reference listing above.
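ReferenceQueue: a queue that holds the Reference subclass objects (SoftReference, WeakReference and so on) whose referents have been garbage collected.
The queue itself stores only the head node of a linked list of references and provides the list operations; the links of the list are kept in a field inside each Reference instance.
Usually one queue is passed as the constructor argument of many Reference objects. The actual objects behind these Reference objects may overlap (one referent object with several Reference objects) or be disjoint.
As shown above, when the GC reclaims an object, the corresponding Reference object may be put into the queue by ReferenceHandler (if a queue was supplied at construction). User code can therefore poll the queue to learn which objects have been reclaimed and do its own processing. WeakHashMap is built on exactly this.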
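The queue mechanics can be tried out deterministically with a small sketch. It uses the public Reference#enqueue() method to stand in for the GC and ReferenceHandler, which is an assumption made only so the result does not depend on GC timing; the class name ReferenceQueueDemo and the helper drain are hypothetical.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class ReferenceQueueDemo {
    // Hypothetical helper: polls the queue until it is empty and returns
    // how many Reference objects were taken out.
    static int drain(ReferenceQueue<?> queue) {
        int count = 0;
        for (Reference<?> r; (r = queue.poll()) != null; ) {
            count++; // user code would do its "object reclaimed" processing here
        }
        return count;
    }

    public static void main(String[] args) {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object data = new Object();
        // One referent object, several Reference objects sharing one queue.
        WeakReference<Object> first = new WeakReference<>(data, queue);
        WeakReference<Object> second = new WeakReference<>(data, queue);
        WeakReference<Object> unregistered = new WeakReference<>(data);

        // Without a queue there is nothing to enqueue into.
        System.out.println(unregistered.enqueue()); // false

        // Simulate what normally happens after the referent is reclaimed.
        first.enqueue();
        second.enqueue();
        System.out.println(first.isEnqueued());     // true
        System.out.println(drain(queue));           // 2
        System.out.println(first.isEnqueued());     // false: removed from the queue
    }
}
```

In real use nobody calls enqueue() by hand; user code only polls (or blocks on remove()) and reacts to whatever the GC has delivered.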
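4.3 Core flow
During GC, if an object is referenced only by Reference objects (through the referent field, which is declared like a strong field but is treated specially by the GC), the JVM decides, based on the concrete Reference type, whether to add the Reference object belonging to that object to the pending list of Reference. If it does, the JVM also notifies the ReferenceHandler thread. On notification, the ReferenceHandler thread calls Cleaner#clean or ReferenceQueue#enqueue. User code can then poll the queue to obtain the Reference objects of the reclaimed objects. The flow is shown in the following figure:
4.4 How each reference type works
"Reference" here means the various subclasses of the JDK class Reference.
Overall, the core logic of SoftReference, WeakReference, PhantomReference, FinalReference and Finalizer is the same as that of Reference (the source shows that almost everything is delegated to Reference). Only Finalizer is more special: it also does the "notice that the object was reclaimed and do extra processing" part itself instead of leaving it to user code as WeakReference and the others do.
If the logic of the first four is almost identical to Reference, why are there so many types, and why not use Reference directly? Because the type acts as a marker or convention: the JVM treats an object differently depending on the type of Reference associated with it:
If the object is referenced only by SoftReference objects, the JVM reclaims it only when an OOM is about to occur;
If the object is referenced only by WeakReference objects, the JVM reclaims it at the next garbage collection;
If the object is referenced only by PhantomReference/Cleaner objects, the JVM will certainly reclaim it;
If the object's class overrides finalize, the JVM wraps the object in a FinalReference/Finalizer, and depending on whether finalize is overridden and whether it has already been called, the JVM gives the object one chance to escape reclamation.
SoftReference
The core logic can be regarded as the same as Reference's, with an added timestamp instance variable (plus a static clock updated by the garbage collector), as the source shows: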
    public class SoftReference<T> extends Reference<T> {

        /**
         * Timestamp clock, updated by the garbage collector
         */
        static private long clock;

        /**
         * Timestamp updated by each invocation of the get method.  The VM may use
         * this field when selecting soft references to be cleared, but it is not
         * required to do so.
         */
        private long timestamp;

        /**
         * Creates a new soft reference that refers to the given object.  The new
         * reference is not registered with any queue.
         *
         * @param referent object the new soft reference will refer to
         */
        public SoftReference(T referent) {
            super(referent);
            this.timestamp = clock;
        }

        /**
         * Creates a new soft reference that refers to the given object and is
         * registered with the given queue.
         *
         * @param referent object the new soft reference will refer to
         * @param q the queue with which the reference is to be registered,
         *          or <tt>null</tt> if registration is not required
         *
         */
        public SoftReference(T referent, ReferenceQueue<? super T> q) {
            super(referent, q);
            this.timestamp = clock;
        }

        /**
         * Returns this reference object's referent.  If this reference object has
         * been cleared, either by the program or by the garbage collector, then
         * this method returns <code>null</code>.
         *
         * @return   The object to which this reference refers, or
         *           <code>null</code> if this reference object has been cleared
         */
        public T get() {
            T o = super.get();
            if (o != null && this.timestamp != clock)
                this.timestamp = clock;
            return o;
        }

    }
WeakReference
The core logic is the same as Reference's, as the source shows:
    public class WeakReference<T> extends Reference<T> {

        /**
         * Creates a new weak reference that refers to the given object.  The new
         * reference is not registered with any queue.
         *
         * @param referent object the new weak reference will refer to
         */
        public WeakReference(T referent) {
            super(referent);
        }

        /**
         * Creates a new weak reference that refers to the given object and is
         * registered with the given queue.
         *
         * @param referent object the new weak reference will refer to
         * @param q the queue with which the reference is to be registered,
         *          or <tt>null</tt> if registration is not required
         */
        public WeakReference(T referent, ReferenceQueue<? super T> q) {
            super(referent, q);
        }

    }
PhantomReference, Cleaner
The core logic of the former is the same as Reference's, except that get() always returns null, as the source shows:
    public class PhantomReference<T> extends Reference<T> {

        /**
         * Returns this reference object's referent.  Because the referent of a
         * phantom reference is always inaccessible, this method always returns
         * <code>null</code>.
         *
         * @return  <code>null</code>
         */
        public T get() {
            return null;
        }

        /**
         * Creates a new phantom reference that refers to the given object and
         * is registered with the given queue.
         *
         * <p> It is possible to create a phantom reference with a <tt>null</tt>
         * queue, but such a reference is completely useless: Its <tt>get</tt>
         * method will always return null and, since it does not have a queue, it
         * will never be enqueued.
         *
         * @param referent the object the new phantom reference will refer to
         * @param q the queue with which the reference is to be registered,
         *          or <tt>null</tt> if registration is not required
         */
        public PhantomReference(T referent, ReferenceQueue<? super T> q) {
            super(referent, q);
        }

    }
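The latter adds a clean callback method, which is called by ReferenceHandler (see the Reference section above). Its source is in the sun.misc package.
JDK 9 introduced java.lang.ref.Cleaner and recommends it as the replacement for Object#finalize().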
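The idea of a phantom reference that carries a cleanup callback can be sketched in user code with nothing but PhantomReference and ReferenceQueue. This is a hypothetical analogue, not the sun.misc.Cleaner implementation: the names CleanupRegistry, CleanupRef, register and drain are invented for this sketch, and the callback runs when the caller drains the queue rather than on the ReferenceHandler thread.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class CleanupRegistry {
    /** Phantom reference that remembers what to do once its referent is gone. */
    static final class CleanupRef extends PhantomReference<Object> {
        private final Runnable action;

        CleanupRef(Object referent, ReferenceQueue<Object> queue, Runnable action) {
            super(referent, queue);
            this.action = action;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    // The CleanupRef objects themselves must stay strongly reachable until
    // they are processed; otherwise they could be collected before being enqueued.
    private final Set<CleanupRef> live = ConcurrentHashMap.newKeySet();

    /** Registers a cleanup action; the action must not reference the target. */
    CleanupRef register(Object target, Runnable action) {
        CleanupRef ref = new CleanupRef(target, queue, action);
        live.add(ref);
        return ref;
    }

    /** Runs the actions of all references enqueued so far; returns how many ran. */
    int drain() {
        int ran = 0;
        for (Object r; (r = queue.poll()) != null; ) {
            CleanupRef ref = (CleanupRef) r;
            if (live.remove(ref)) {   // guarantees each action runs at most once
                ref.action.run();
                ran++;
            }
        }
        return ran;
    }
}
```

A caller would register a resource together with an action that frees it, then call drain() periodically or from a dedicated thread. The action must capture only the raw handle (an address, a file descriptor) and never the target object itself, or the target would stay strongly reachable forever.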
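FinalReference, Finalizer
The core logic of the former is the same as Reference's, except that a queue argument must always be supplied when a FinalReference is created, as the source shows: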
    class FinalReference<T> extends Reference<T> {

        public FinalReference(T referent, ReferenceQueue<? super T> q) {
            super(referent, q);
        }
    }
The latter modifies and extends the core Reference flow. Source:
    final class Finalizer extends FinalReference<Object> { /* Package-private; must be in
                                                              same package as the Reference
                                                              class */

        private static ReferenceQueue<Object> queue = new ReferenceQueue<>();
        private static Finalizer unfinalized = null;
        private static final Object lock = new Object();

        private Finalizer
            next = null,
            prev = null;

        private boolean hasBeenFinalized() {
            return (next == this);
        }

        private void add() {
            synchronized (lock) {
                if (unfinalized != null) {
                    this.next = unfinalized;
                    unfinalized.prev = this;
                }
                unfinalized = this;
            }
        }

        private void remove() {
            synchronized (lock) {
                if (unfinalized == this) {
                    if (this.next != null) {
                        unfinalized = this.next;
                    } else {
                        unfinalized = this.prev;
                    }
                }
                if (this.next != null) {
                    this.next.prev = this.prev;
                }
                if (this.prev != null) {
                    this.prev.next = this.next;
                }
                this.next = this;   /* Indicates that this has been finalized */
                this.prev = this;
            }
        }

        private Finalizer(Object finalizee) {
            super(finalizee, queue);
            add();
        }

        static ReferenceQueue<Object> getQueue() {
            return queue;
        }

        /* Invoked by VM */
        static void register(Object finalizee) {
            new Finalizer(finalizee);
        }

        private void runFinalizer(JavaLangAccess jla) {
            synchronized (this) {
                if (hasBeenFinalized()) return;
                remove();
            }
            try {
                Object finalizee = this.get();
                if (finalizee != null && !(finalizee instanceof java.lang.Enum)) {
                    jla.invokeFinalize(finalizee);

                    /* Clear stack slot containing this variable, to decrease
                       the chances of false retention with a conservative GC */
                    finalizee = null;
                }
            } catch (Throwable x) { }
            super.clear();
        }

        /* Create a privileged secondary finalizer thread in the system thread
           group for the given Runnable, and wait for it to complete.

           This method is used by both runFinalization and runFinalizersOnExit.
           The former method invokes all pending finalizers, while the latter
           invokes all uninvoked finalizers if on-exit finalization has been
           enabled.

           These two methods could have been implemented by offloading their work
           to the regular finalizer thread and waiting for that thread to finish.
           The advantage of creating a fresh thread, however, is that it insulates
           invokers of these methods from a stalled or deadlocked finalizer thread.
         */
        private static void forkSecondaryFinalizer(final Runnable proc) {
            AccessController.doPrivileged(
                new PrivilegedAction<Void>() {
                    public Void run() {
                        ThreadGroup tg = Thread.currentThread().getThreadGroup();
                        for (ThreadGroup tgn = tg;
                             tgn != null;
                             tg = tgn, tgn = tg.getParent());
                        Thread sft = new Thread(tg, proc, "Secondary finalizer");
                        sft.start();
                        try {
                            sft.join();
                        } catch (InterruptedException x) {
                            Thread.currentThread().interrupt();
                        }
                        return null;
                    }});
        }

        /* Called by Runtime.runFinalization() */
        static void runFinalization() {
            if (!VM.isBooted()) {
                return;
            }

            forkSecondaryFinalizer(new Runnable() {
                private volatile boolean running;
                public void run() {
                    // in case of recursive call to run()
                    if (running)
                        return;
                    final JavaLangAccess jla = SharedSecrets.getJavaLangAccess();
                    running = true;
                    for (;;) {
                        Finalizer f = (Finalizer)queue.poll();
                        if (f == null) break;
                        f.runFinalizer(jla);
                    }
                }
            });
        }

        /* Invoked by java.lang.Shutdown */
        static void runAllFinalizers() {
            if (!VM.isBooted()) {
                return;
            }

            forkSecondaryFinalizer(new Runnable() {
                private volatile boolean running;
                public void run() {
                    // in case of recursive call to run()
                    if (running)
                        return;
                    final JavaLangAccess jla = SharedSecrets.getJavaLangAccess();
                    running = true;
                    for (;;) {
                        Finalizer f;
                        synchronized (lock) {
                            f = unfinalized;
                            if (f == null) break;
                            unfinalized = f.next;
                        }
                        f.runFinalizer(jla);
                    }}});
        }

        private static class FinalizerThread extends Thread {
            private volatile boolean running;
            FinalizerThread(ThreadGroup g) {
                super(g, "Finalizer");
            }
            public void run() {
                // in case of recursive call to run()
                if (running)
                    return;

                // Finalizer thread starts before System.initializeSystemClass
                // is called.  Wait until JavaLangAccess is available
                while (!VM.isBooted()) {
                    // delay until VM completes initialization
                    try {
                        VM.awaitBooted();
                    } catch (InterruptedException x) {
                        // ignore and continue
                    }
                }
                final JavaLangAccess jla = SharedSecrets.getJavaLangAccess();
                running = true;
                for (;;) {
                    try {
                        Finalizer f = (Finalizer)queue.remove();
                        f.runFinalizer(jla);
                    } catch (InterruptedException x) {
                        // ignore and continue
                    }
                }
            }
        }

        static {
            ThreadGroup tg = Thread.currentThread().getThreadGroup();
            for (ThreadGroup tgn = tg;
                 tgn != null;
                 tg = tgn, tgn = tg.getParent());
            Thread finalizer = new FinalizerThread(tg);
            finalizer.setPriority(Thread.MAX_PRIORITY - 2);
            finalizer.setDaemon(true);
            finalizer.start();
        }

    }
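The differences are as follows (see the Finalizer source for details):
The Finalizer class is package-private and its constructor is private, so Finalizer objects can only be created by the JVM: /* Invoked by VM */ static void register(Object finalizee) { new Finalizer(finalizee); } ;
A data object (e.g. a Person object) whose class overrides finalize is wrapped by the JVM, through the method above, in a Finalizer object that is registered with the queue. Once the data object is found to be otherwise unreachable and its finalize method has not been called yet, the Finalizer object is placed on that queue;
Finalizer contains the "notice that the object was reclaimed and process the queue" logic itself (Reference, WeakReference and the others do not; there the user has to implement it). On top of the Reference logic, Finalizer starts a thread that takes Finalizer objects from the queue and calls the object's finalize method: Finalizer#runFinalizer.
The Finalizer daemon thread is created and started by the static initializer of the Finalizer class. It processes the elements of the F-Queue list maintained inside Finalizer (elements are enqueued by the JVM-side machinery, not by user code) and calls finalize() on the associated objects.
The ReferenceHandler daemon thread and the Finalizer daemon thread have to work together for the reclamation of reference-typed objects to function properly.
4.5 Applications
Based on the "object-reclamation awareness" of the Reference types, there are three typical applications: automatic release of off-heap memory, automatic removal of unused data in WeakHashMap, and avoiding memory leaks in ThreadLocalMap.
(It seems these typical applications could also be implemented by overriding Object#finalize?)
4.5.1 Releasing off-heap memory of DirectByteBuffer
Cleaner is a PhantomReference. During GC, if the JVM finds that the object being processed is referenced only by PhantomReference objects, it adds the Reference object to the pending-Reference list as described before. The difference is that when the ReferenceHandler thread processes an entry whose actual type is Cleaner, it calls Cleaner.clean. The off-heap memory allocated by DirectByteBuffer is released through this Cleaner.clean call.
Creating a DirectByteBuffer also creates a Cleaner whose referent is the DirectByteBuffer. During GC, if the JVM finds that the DirectByteBuffer is no longer referenced (only by the Reference), it adds the corresponding Cleaner to the pending-Reference list and notifies the ReferenceHandler thread. ReferenceHandler then calls Cleaner#clean, and the clean method of a Cleaner created by DirectByteBuffer calls unsafe.freeMemory to release the off-heap memory. As a result, when a DirectByteBuffer is reclaimed by the GC, its off-heap memory is released as well. Relevant source: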
    // ByteBuffer#allocateDirect
    // Simply creates a DirectByteBuffer of the given size in bytes
    public static ByteBuffer allocateDirect(int capacity) {
        return new DirectByteBuffer(capacity);
    }

    // DirectByteBuffer constructor
    DirectByteBuffer(int cap) {
        // part of the code omitted...
        try {
            // allocate the memory through unsafe
            base = unsafe.allocateMemory(size);
        } catch (OutOfMemoryError x) {
            // part of the code omitted...
        }
        // part of the code omitted...
        // create the Cleaner discussed above; its referent is this DirectByteBuffer
        cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
        att = null;
    }

    // Deallocator#run
    public void run() {
        if (address == 0) {
            return;
        }
        // release the off-heap memory through unsafe.freeMemory
        unsafe.freeMemory(address);
        address = 0;
        Bits.unreserveMemory(size, capacity);
    }
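From the user's side none of this machinery is visible. A minimal sketch of how a direct buffer is typically used, with the class name DirectBufferDemo and the helper allocate being hypothetical names for this illustration:

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    // Hypothetical helper: allocates off-heap memory via the public API.
    static ByteBuffer allocate(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = allocate(1024);
        buffer.putInt(0, 42);                     // write at absolute index 0
        System.out.println(buffer.isDirect());    // true
        System.out.println(buffer.getInt(0));     // 42
        // There is no explicit free: once "buffer" becomes unreachable and is
        // collected, the Cleaner path described above releases the off-heap
        // memory. When that happens is up to the GC.
        buffer = null;
    }
}
```

Because the release is tied to GC of the small on-heap DirectByteBuffer object, off-heap usage can grow well before the heap itself comes under pressure.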
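4.5.2 Automatic removal of unused data in WeakHashMap
WeakHashMap is used like an ordinary HashMap. The difference is that when a key is no longer in use (i.e. no longer strongly reachable), the key and its entry and value in the WeakHashMap are reclaimed by the GC. The implementation relies on WeakReference.
The Entry of WeakHashMap extends WeakReference, and its key plays the role of the referent in the Reference class described above. A queue argument is also passed when an Entry is created, and that queue is shared by all entries. Relevant source: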
    // Entry extends WeakReference; the WeakReference refers to the key of the map
    private static class Entry<K,V> extends WeakReference<Object> implements Map.Entry<K,V> {
        V value;
        final int hash;
        Entry<K,V> next;
        /**
         * Creates an Entry. The ReferenceQueue discussed above is a field of
         * WeakHashMap, initialized when the WeakHashMap is created:
         * final ReferenceQueue<Object> queue = new ReferenceQueue<>()
         */
        Entry(Object key, V value,
              ReferenceQueue<Object> queue,
              int hash, Entry<K,V> next) {
            super(key, queue);
            this.value = value;
            this.hash  = hash;
            this.next  = next;
        }
        // part of the source omitted...
    }
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();

    public V put(K key, V value) {
        Object k = maskNull(key);
        int h = hash(k);
        Entry<K,V>[] tab = getTable();
        int i = indexFor(h, tab.length);

        for (Entry<K,V> e = tab[i]; e != null; e = e.next) {
            if (h == e.hash && eq(k, e.get())) {
                V oldValue = e.value;
                if (value != oldValue)
                    e.value = value;
                return oldValue;
            }
        }

        modCount++;
        Entry<K,V> e = tab[i];
        tab[i] = new Entry<>(k, value, queue, h, e); // the queue is shared by all entries
        if (++size >= threshold)
            resize(tab.length * 2);
        return null;
    }
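Adding an element to a WeakHashMap calls the Entry constructor, i.e. it creates a WeakReference object that refers to the key just added. All these WeakReference objects are registered with the same ReferenceQueue, so the queue reveals the Reference objects (the Entry objects) of all reclaimed keys. Where does WeakHashMap take the elements out of the queue, and what does it do with them? The key is the WeakHashMap#expungeStaleEntries method. Relevant source: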
    private void expungeStaleEntries() {
        // keep polling the ReferenceQueue for the References (entries) whose
        // keys were referenced only by the WeakReference and have been reclaimed
        for (Object x; (x = queue.poll()) != null; ) {
            synchronized (queue) {
                // cast to Entry
                Entry<K,V> e = (Entry<K,V>) x;
                // compute the index of its bucket
                int i = indexFor(e.hash, table.length);
                // take the first element of the bucket
                Entry<K,V> prev = table[i];
                Entry<K,V> p = prev;
                // the bucket is not empty: walk all elements of its chain
                while (p != null) {
                    Entry<K,V> next = p.next;
                    // if the current element is the entry taken from the queue, unlink it
                    if (p == e) {
                        if (prev == e)
                            table[i] = next;
                        else
                            prev.next = next;
                        // clear the value of the entry
                        e.value = null; // Help GC
                        size--;
                        break;
                    }
                    prev = p;
                    p = next;
                }
            }
        }
    }
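Although the key object is reclaimed by the GC automatically, the value and entry that belong to it are still there and are not reclaimed automatically, so failing to remove them in code leads to a memory leak. How are they removed? As expungeStaleEntries above shows, it loops over the queue, locates the bucket from the entry, and removes the entry and its value from the chain of that bucket. (Note that the index is computed from the hash stored in the entry, because the key is already null, whereas HashMap computes it from the key.)
expungeStaleEntries is called by getTable, size and resize, and getTable in turn is called by get, containsKey, put, remove, containsValue, replaceAll and others. In effect, almost any operation on a WeakHashMap calls expungeStaleEntries. So for any key that is no longer used (strongly referenced) outside the map, the key, entry and value are eventually removed from the map.
From this analysis, WeakHashMap suits scenarios with limited memory, because it automatically removes data that is no longer needed (strongly referenced) from the map.
Note: once a key is no longer strongly referenced it is reclaimed by the GC automatically (how many keys and which ones is not fixed and is decided by the GC); the entry and value of the key can be reclaimed only after the map's own code has unlinked them.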
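The queue-driven cleanup can be condensed into a small sketch. This is not the WeakHashMap implementation: MiniWeakCache, WeakKey and expunge are hypothetical names, keys are compared by identity for simplicity, and an ordinary HashMap does the bucket work. What it keeps from the original is the pattern of caching the hash in the weak key and removing stale entries by polling the shared queue.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

class MiniWeakCache<K, V> {
    /** Weak key that caches its hash, so it can still be located after the key is cleared. */
    static final class WeakKey<K> extends WeakReference<K> {
        final int hash;

        WeakKey(K key, ReferenceQueue<? super K> queue) {
            super(key, queue);
            this.hash = System.identityHashCode(key);
        }

        @Override public int hashCode() { return hash; }

        @Override public boolean equals(Object o) {
            if (this == o) return true;           // lets expunge() remove a cleared key
            if (!(o instanceof WeakKey)) return false;
            Object mine = get();
            return mine != null && mine == ((WeakKey<?>) o).get();
        }
    }

    private final ReferenceQueue<K> queue = new ReferenceQueue<>();
    final Map<WeakKey<K>, V> map = new HashMap<>(); // package-private for inspection

    void put(K key, V value) { expunge(); map.put(new WeakKey<>(key, queue), value); }

    V get(K key) { expunge(); return map.get(new WeakKey<>(key, null)); }

    int size() { expunge(); return map.size(); }

    // Every operation first removes the entries whose keys the GC has reclaimed.
    private void expunge() {
        for (Object ref; (ref = queue.poll()) != null; ) {
            map.remove(ref);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MiniWeakCache<Object, String> cache = new MiniWeakCache<>();
        Object key = new Object();
        cache.put(key, "payload");
        System.out.println(cache.get(key));  // payload
        key = null;                          // drop the only strong reference
        for (int i = 0; i < 20 && cache.size() > 0; i++) {
            System.gc();                     // only a hint; timing is up to the JVM
            Thread.sleep(50);
        }
        System.out.println(cache.size());    // typically 0 once the key was collected
    }
}
```

As in WeakHashMap, nothing shrinks the map unless some operation is invoked on it, because cleanup only happens inside expunge().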
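4.5.3 Avoiding memory leaks in ThreadLocalMap
If ThreadLocalMap held its keys strongly, a ThreadLocal that is no longer used elsewhere could never be reclaimed. With weak keys the key can be reclaimed, but, as in WeakHashMap, the Entry and value of a reclaimed key are not reclaimed automatically, which is a memory leak unless they are removed.
The solution is much the same as in WeakHashMap: Entry also extends WeakReference, with the key as referent. The differences are:
No queue argument is supplied when an Entry is created;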
    static class Entry extends WeakReference<ThreadLocal<?>> {
        /** The value associated with this ThreadLocal. */
        Object value;

        Entry(ThreadLocal<?> k, Object v) {
            super(k);
            value = v;
        }
    }
    private void set(ThreadLocal<?> key, Object value) {

        // We don't use a fast path as with get() because it is at
        // least as common to use set() to create new entries as
        // it is to replace existing ones, in which case, a fast
        // path would fail more often than not.

        Entry[] tab = table;
        int len = tab.length;
        int i = key.threadLocalHashCode & (len-1);

        for (Entry e = tab[i];
             e != null;
             e = tab[i = nextIndex(i, len)]) {
            ThreadLocal<?> k = e.get();

            if (k == key) {
                e.value = value;
                return;
            }

            if (k == null) {
                replaceStaleEntry(key, value, i);
                return;
            }
        }

        tab[i] = new Entry(key, value);
        int sz = ++size;
        if (!cleanSomeSlots(i, sz) && sz >= threshold)
            rehash();
    }
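Hash collisions are resolved by linear probing, so the internal storage is one large array rather than buckets with chains;
The full-table expungeStaleEntries is triggered only from set (through rehash), while the single-slot expungeStaleEntry is also reached from lookups that hit a stale slot and from remove. Stale Reference objects are not taken from a queue; instead the elements of the table, i.e. the entries, are scanned and each Entry is checked for whether the object behind its key has been reclaimed;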
    /**
     * Expunge all stale entries in the table.
     */
    private void expungeStaleEntries() {
        Entry[] tab = table;
        int len = tab.length;
        for (int j = 0; j < len; j++) {
            Entry e = tab[j];
            if (e != null && e.get() == null)
                expungeStaleEntry(j);
        }
    }


    private int expungeStaleEntry(int staleSlot) {
        Entry[] tab = table;
        int len = tab.length;

        // expunge entry at staleSlot
        tab[staleSlot].value = null;
        tab[staleSlot] = null;
        size--;

        // Rehash until we encounter null
        Entry e;
        int i;
        for (i = nextIndex(staleSlot, len);
             (e = tab[i]) != null;
             i = nextIndex(i, len)) {
            ThreadLocal<?> k = e.get();
            if (k == null) {
                e.value = null;
                tab[i] = null;
                size--;
            } else {
                int h = k.threadLocalHashCode & (len - 1);
                if (h != i) {
                    tab[i] = null;

                    // Unlike Knuth 6.4 Algorithm R, we must scan until
                    // null because multiple entries could have been stale.
                    while (tab[h] != null)
                        h = nextIndex(h, len);
                    tab[h] = e;
                }
            }
        }
        return i;
    }
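The scan-based cleanup can be reproduced in a self-contained sketch. It is a simplified model under stated assumptions, not ThreadLocalMap itself: ProbingWeakMap and findEntry are hypothetical names, the table has a fixed capacity of 16 with no resize, and identity hash codes replace threadLocalHashCode. The expunge logic follows the structure of the JDK listing above.

```java
import java.lang.ref.WeakReference;

class ProbingWeakMap {
    static final class Entry extends WeakReference<Object> {
        Object value;
        Entry(Object key, Object value) { super(key); this.value = value; } // no queue
    }

    private final Entry[] table = new Entry[16]; // power of two, never resized in this sketch
    private int size;

    private int indexOf(Object key) { return System.identityHashCode(key) & (table.length - 1); }
    private int next(int i) { return (i + 1) & (table.length - 1); }

    int size() { return size; }

    /** Linear probe from the home slot; returns the entry for key or null. */
    Entry findEntry(Object key) {
        int i = indexOf(key);
        for (Entry e = table[i]; e != null; e = table[i = next(i)]) {
            if (e.get() == key) return e;
        }
        return null;
    }

    Object get(Object key) {
        Entry e = findEntry(key);
        return e == null ? null : e.value;
    }

    void set(Object key, Object value) {
        int i = indexOf(key);
        for (Entry e = table[i]; e != null; e = table[i = next(i)]) {
            if (e.get() == key) { e.value = value; return; }
        }
        // Always keep one empty slot so that probing terminates.
        if (size >= table.length - 1) throw new IllegalStateException("sketch has fixed capacity");
        table[i] = new Entry(key, value);
        size++;
    }

    /** Scans the whole table for entries whose key was cleared; returns how many were removed. */
    int expungeStaleEntries() {
        int before = size;
        for (int j = 0; j < table.length; j++) {
            Entry e = table[j];
            if (e != null && e.get() == null) expungeAt(j);
        }
        return before - size;
    }

    private void expungeAt(int staleSlot) {
        table[staleSlot].value = null;
        table[staleSlot] = null;
        size--;
        // Re-place the entries of the following run so probing still finds them.
        for (int i = next(staleSlot); table[i] != null; i = next(i)) {
            Entry e = table[i];
            Object k = e.get();
            if (k == null) {
                e.value = null;
                table[i] = null;
                size--;
            } else {
                int h = indexOf(k);
                if (h != i) {
                    table[i] = null;
                    while (table[h] != null) h = next(h);
                    table[h] = e;
                }
            }
        }
    }
}
```

To try it without waiting for a GC, a key's entry can be cleared by hand with findEntry(key).clear() before calling expungeStaleEntries(); the re-placement loop is what keeps the remaining keys reachable by probing after a slot in the middle of a run has been emptied.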
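For more on ThreadLocalMap, see the post "Java ThreadLocal - MarchOn".
Aside:
While exploring the principles above, some differences between WeakHashMap, ThreadLocalMap and HashMap in computing the hash and resolving collisions turned up: WeakHashMap computes the index from the entry when expunging, and ThreadLocalMap resolves collisions by linear probing.
Computing the hash and index (the table length must be a power of two, 2^n, in all three):
WeakHashMap (from the entry): h = entry.hash; length = table.length; return h & (length - 1); . The key of an element taken from the queue is always null, so the index is computed from the hash stored in the entry.
ThreadLocalMap (from the key): return threadLocal.threadLocalHashCode & (len - 1);
HashMap (from the key): (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16) ; the index is then (n - 1) & hash.
Resolving collisions:
WeakHashMap: separate chaining (buckets with linked lists)
ThreadLocalMap: linear probing. This follows from the use case: the keys of this map are ThreadLocal objects, and a program does not have many of them (a few hundred at the very most?).
HashMap: separate chaining (buckets with linked lists)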
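The index arithmetic above can be checked with a few lines. The class HashIndexDemo and its method names are hypothetical; the expressions themselves are the ones quoted in the list.

```java
class HashIndexDemo {
    // HashMap-style hash: fold the high 16 bits into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Index for a power-of-two table length; equivalent to a non-negative modulo.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Linear probing step: move to the next slot, wrapping around at the end.
    static int nextIndex(int i, int len) {
        return (i + 1 < len) ? i + 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(indexFor(17, 16));                  // 1
        System.out.println(indexFor(-1, 16));                  // 15, also for negative hashes
        System.out.println(spread(Integer.valueOf(65536)));    // 65537
        System.out.println(nextIndex(15, 16));                 // 0
    }
}
```

The mask only works because the length is a power of two; for any other length, hash & (length - 1) would skip some slots entirely.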