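ThreadPoolExecutor is the most important implementation class in the Executor framework. It provides two core capabilities: thread pool management and task management. This article walks through the ThreadPoolExecutor source code to see how an executor based on the producer-consumer model is designed and implemented.
The producer-consumer model
The producer-consumer model has three roles: producers, a work queue, and consumers. For ThreadPoolExecutor:
1. The producers are the task submitters, that is, the external threads that call ThreadPoolExecutor.
2. The work queue is declared through the blocking-queue interface, so many different concrete implementations can be plugged in.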
BlockingQueue<Runnable> workQueue;
3. The consumers are a set of Worker objects, each of which wraps a thread.
HashSet<Worker> workers = new HashSet<Worker>();
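To make the three roles concrete, here is a minimal sketch of the same model built by hand: the calling thread is the producer, an ArrayBlockingQueue is the work queue, and a single thread is the consumer. The class name ProducerConsumerSketch, the runTasks helper, and the STOP_SIGNAL sentinel are inventions of this sketch, not part of ThreadPoolExecutor.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class ProducerConsumerSketch {
    // Sentinel that tells the consumer to stop (a device of this sketch only).
    private static final Runnable STOP_SIGNAL = () -> { };

    /** Pushes taskCount trivial tasks through a bounded queue; returns how many the consumer ran. */
    public static int runTasks(int taskCount) {
        BlockingQueue<Runnable> workQueue = new ArrayBlockingQueue<>(4);
        AtomicInteger executed = new AtomicInteger();

        // Consumer: takes tasks off the queue and runs them.
        Thread consumer = new Thread(() -> {
            try {
                for (Runnable task = workQueue.take(); task != STOP_SIGNAL; task = workQueue.take()) {
                    task.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Producer: the calling thread submits tasks.
        try {
            for (int i = 0; i < taskCount; i++) {
                workQueue.put(executed::incrementAndGet); // blocks while the queue is full
            }
            workQueue.put(STOP_SIGNAL);
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return executed.get();
    }

    public static void main(String[] args) {
        System.out.println("executed = " + runTasks(10));
    }
}
```

ThreadPoolExecutor follows the same shape, but manages many consumers and decides when to create or retire them.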
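Main fields
With the basic execution model in mind, here are the main fields of ThreadPoolExecutor:
1. private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0)); A 32-bit atomic integer that serves as the control word of the pool. The low 29 bits hold the number of worker threads, so there can be at most 2^29 - 1 workers. The high 3 bits hold the run state of the pool. ThreadPoolExecutor has 5 states in total:
* RUNNING: accepts new tasks and runs them
* SHUTDOWN: no longer accepts new tasks, but still runs the tasks already in the work queue
* STOP: no longer accepts new tasks, does not run the tasks in the work queue, and interrupts the tasks that are currently running
* TIDYING: all tasks have terminated and the worker count is 0; the terminated() hook method is about to run
* TERMINATED: terminated() has completed
The definitions and helper methods built around ctl are listed below: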
    private final AtomicInteger ctl = new AtomicInteger(ctlOf(RUNNING, 0));
    private static final int COUNT_BITS = Integer.SIZE - 3;
    private static final int CAPACITY   = (1 << COUNT_BITS) - 1;

    // runState is stored in the high-order bits
    private static final int RUNNING    = -1 << COUNT_BITS;
    private static final int SHUTDOWN   =  0 << COUNT_BITS;
    private static final int STOP       =  1 << COUNT_BITS;
    private static final int TIDYING    =  2 << COUNT_BITS;
    private static final int TERMINATED =  3 << COUNT_BITS;

    // Packing and unpacking ctl
    private static int runStateOf(int c)     { return c & ~CAPACITY; }
    private static int workerCountOf(int c)  { return c & CAPACITY; }
    private static int ctlOf(int rs, int wc) { return rs | wc; }

    private static boolean runStateLessThan(int c, int s) {
        return c < s;
    }

    private static boolean runStateAtLeast(int c, int s) {
        return c >= s;
    }

    private static boolean isRunning(int c) {
        return c < SHUTDOWN;
    }

    private boolean compareAndIncrementWorkerCount(int expect) {
        return ctl.compareAndSet(expect, expect + 1);
    }

    private boolean compareAndDecrementWorkerCount(int expect) {
        return ctl.compareAndSet(expect, expect - 1);
    }

    private void decrementWorkerCount() {
        do {} while (! compareAndDecrementWorkerCount(ctl.get()));
    }
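The packing scheme is easier to follow with concrete numbers. The sketch below copies the constants and the three pack/unpack helpers from the listing into a standalone class (CtlSketch is a name made up for this example) so they can be run outside the JDK.

```java
public class CtlSketch {
    static final int COUNT_BITS = Integer.SIZE - 3;       // 29 bits for the worker count
    static final int CAPACITY   = (1 << COUNT_BITS) - 1;  // 2^29 - 1

    // Run states live in the top 3 bits; RUNNING is the only negative value.
    static final int RUNNING    = -1 << COUNT_BITS;
    static final int SHUTDOWN   =  0 << COUNT_BITS;
    static final int STOP       =  1 << COUNT_BITS;
    static final int TIDYING    =  2 << COUNT_BITS;
    static final int TERMINATED =  3 << COUNT_BITS;

    static int runStateOf(int c)     { return c & ~CAPACITY; }
    static int workerCountOf(int c)  { return c & CAPACITY; }
    static int ctlOf(int rs, int wc) { return rs | wc; }

    public static void main(String[] args) {
        int c = ctlOf(RUNNING, 5);
        System.out.println("workerCount = " + workerCountOf(c));
        System.out.println("is RUNNING  = " + (runStateOf(c) == RUNNING));
        // Incrementing ctl by one only touches the low bits, so the state is preserved.
        System.out.println("after +1    = " + workerCountOf(c + 1));
        System.out.println("state bits  = " + Integer.toBinaryString(RUNNING >>> COUNT_BITS));
    }
}
```

Because the states are ordered numerically (RUNNING < SHUTDOWN < STOP < TIDYING < TERMINATED), comparisons such as runStateAtLeast(c, STOP) can be done on the whole ctl value without unpacking it first.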
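2. private final BlockingQueue<Runnable> workQueue; The work queue. It is typed as the BlockingQueue interface, so the concrete implementation can be chosen to match the desired strategy, for example the bounded ArrayBlockingQueue or the unbounded LinkedBlockingQueue.
3. private final ReentrantLock mainLock = new ReentrantLock(); The global reentrant lock of ThreadPoolExecutor. It guards the workers set and the related bookkeeping; ctl itself is updated with CAS.
4. private final Condition termination = mainLock.newCondition(); The condition queue of mainLock, used for await() and signalAll() when waiting for termination.
5. private final HashSet<Worker> workers = new HashSet<Worker>(); The set of worker threads.
6. private volatile ThreadFactory threadFactory; The factory that creates threads, which lets you customize how threads are created.
7. private volatile RejectedExecutionHandler handler; The handler for rejected tasks, which lets you customize the rejection policy.
8. private volatile long keepAliveTime; The keep-alive time of idle threads. It is used to decide whether an idle thread has waited too long, in which case the thread is reclaimed.
9. private volatile boolean allowCoreThreadTimeOut; Whether core threads may also time out and be reclaimed.
10. private volatile int corePoolSize; The minimum number of threads kept alive. If allowCoreThreadTimeOut is set, the effective minimum becomes 0.
11. private volatile int maximumPoolSize; The maximum number of threads kept alive.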
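Worker thread creation and reclamation policy
Through corePoolSize, maximumPoolSize, allowCoreThreadTimeOut, and keepAliveTime, ThreadPoolExecutor offers a flexible policy for creating and reclaiming worker threads.
Creation policy:
1. When the number of worker threads is below corePoolSize, a new worker thread is created for each newly submitted task, whether or not other workers are idle.
2. When the number of worker threads is at least corePoolSize but below maximumPoolSize, a new worker thread is created only if the work queue is full. While the queue still has room, the new task is simply enqueued.
3. When corePoolSize and maximumPoolSize are set to the same value, the pool is a fixed-size thread pool.
Reclamation policy:
1. keepAliveTime sets the idle timeout of a worker thread. Once the number of workers exceeds corePoolSize, an idle worker that has waited longer than keepAliveTime is reclaimed. How a worker is determined to be "idle" is covered later.
2. If allowCoreThreadTimeOut is set, core threads can be reclaimed too: an idle core thread may also be reclaimed, until the worker set is empty.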
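Work queue strategy
The work queue, BlockingQueue<Runnable> workQueue, stores the submitted tasks. It has 4 basic strategies, and the choice of blocking-queue implementation adds further strategies on top.
The 4 basic strategies:
1. When the number of worker threads is below corePoolSize, a newly submitted task is always run by a newly created worker thread and is not enqueued.
2. When the number of worker threads is at least corePoolSize and the work queue is not full, a newly submitted task is enqueued.
3. When the number of worker threads is at least corePoolSize but below maximumPoolSize and the work queue is full, a newly submitted task is handed to a newly created worker thread and is not enqueued.
4. When the number of worker threads has reached maximumPoolSize and the work queue is full, a newly submitted task is rejected. What happens next depends on the rejection policy in use.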
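The four strategies can be observed directly with a deliberately tiny pool. The sketch below assumes corePoolSize = 1, maximumPoolSize = 2, and a queue of capacity 1, and submits four tasks that block on a latch so that nothing completes while the pool is being inspected. SubmitSequenceSketch and submitFour are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SubmitSequenceSketch {
    /** Submits four blocking tasks and records what the pool did with each one. */
    public static List<String> submitFour() {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1));
        Runnable blocker = () -> {
            try {
                release.await(); // keeps the worker busy until the test is over
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<String> log = new ArrayList<>();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocker);
                    log.add("task" + i + ": threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    log.add("task" + i + ": rejected"); // default AbortPolicy
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return log;
    }

    public static void main(String[] args) {
        submitFour().forEach(System.out::println);
    }
}
```

The first task gets a core thread, the second is queued, the third forces a second thread because the queue is full, and the fourth is rejected.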
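Depending on the blocking-queue implementation, there are a few additional strategies:
1. Use a SynchronousQueue to hand each task directly to a thread without storing it. This approach normally needs an unbounded maximumPoolSize, so that a new worker thread can always be created for a submitted task. The advantage is that tasks start running very quickly, which suits workloads where the interval between task arrivals is longer than the task processing time. The drawback is that a large volume of tasks ties up a large number of threads.
2. Use an unbounded work queue such as LinkedBlockingQueue. Because the queue never fills up, the number of worker threads never exceeds corePoolSize: once the worker count reaches corePoolSize, new workers are created only when the queue is full. The advantage is a stable thread count, and with enough memory the pool can absorb a large number of requests. The drawback is that if tasks depend on one another a deadlock is quite possible, because once all workers are occupied no new worker thread is created and tasks are only added to the queue. The queue can also grow until memory is exhausted and an OutOfMemoryError (OOM) occurs.
3. Use a bounded work queue such as ArrayBlockingQueue. Memory usage is then under control, but maximumPoolSize and the queue length must be tuned together because they influence each other. A short queue inevitably leads to more threads being created, and more threads bring extra costs such as context switching. A long queue with a small maximumPoolSize can limit throughput, and once both the queue and the pool are saturated the rejection mechanism is triggered.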
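The effect of the queue choice can be compared side by side. The following sketch saturates three pools with blocking tasks and reports "threads/queued" for each; the pool sizes and the helper names (QueueChoiceSketch, saturate, directHandoff, unbounded, bounded) are assumptions chosen for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class QueueChoiceSketch {
    /** Submits n blocking tasks and reports "threads/queued" while all of them are pending. */
    static String saturate(ThreadPoolExecutor pool, int n) {
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < n; i++) {
                pool.execute(() -> {
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getPoolSize() + "/" + pool.getQueue().size();
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    // Direct handoff: nothing is stored, every task gets its own thread.
    public static String directHandoff() {
        return saturate(new ThreadPoolExecutor(0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>()), 3);
    }

    // Unbounded queue: the pool never grows past corePoolSize, maximumPoolSize is irrelevant.
    public static String unbounded() {
        return saturate(new ThreadPoolExecutor(1, 4, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>()), 5);
    }

    // Bounded queue of 2: extra threads appear only after the queue is full.
    public static String bounded() {
        return saturate(new ThreadPoolExecutor(1, 4, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(2)), 5);
    }

    public static void main(String[] args) {
        System.out.println("SynchronousQueue    threads/queued = " + directHandoff());
        System.out.println("LinkedBlockingQueue threads/queued = " + unbounded());
        System.out.println("ArrayBlockingQueue  threads/queued = " + bounded());
    }
}
```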
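Rejection policy
When the executor has been shut down, or when the number of worker threads has reached maximumPoolSize and the work queue is full, newly submitted tasks are rejected. The RejectedExecutionHandler interface defines what happens to a rejected task.
The built-in policies are:
CallerRunsPolicy: the rejected task is run by the calling thread, so it executes synchronously
AbortPolicy: aborts the submission by throwing RejectedExecutionException
DiscardPolicy: silently drops the task
DiscardOldestPolicy: drops the oldest task waiting in the queue, then retries the submission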
    public static class CallerRunsPolicy implements RejectedExecutionHandler {
        /**
         * Creates a {@code CallerRunsPolicy}.
         */
        public CallerRunsPolicy() { }

        /**
         * Executes task r in the caller's thread, unless the executor
         * has been shut down, in which case the task is discarded.
         *
         * @param r the runnable task requested to be executed
         * @param e the executor attempting to execute this task
         */
        public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
            if (!e.isShutdown()) {
                r.run();
            }
        }
    }

    /**
     * A handler for rejected tasks that throws a
     * {@code RejectedExecutionException}.
     */
    public static class AbortPolicy implements RejectedExecutionHandler {
        /**
         * Creates an {@code AbortPolicy}.
         */
        public AbortPolicy() { }

        /**
         * Always throws RejectedExecutionException.
         *
         * @param r the runnable task requested to be executed
         * @param e the executor attempting to execute this task
         * @throws RejectedExecutionException always.
         */
        public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
            throw new RejectedExecutionException("Task " + r.toString() +
                                                 " rejected from " +
                                                 e.toString());
        }
    }

    /**
     * A handler for rejected tasks that silently discards the
     * rejected task.
     */
    public static class DiscardPolicy implements RejectedExecutionHandler {
        /**
         * Creates a {@code DiscardPolicy}.
         */
        public DiscardPolicy() { }

        /**
         * Does nothing, which has the effect of discarding task r.
         *
         * @param r the runnable task requested to be executed
         * @param e the executor attempting to execute this task
         */
        public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
        }
    }

    /**
     * A handler for rejected tasks that discards the oldest unhandled
     * request and then retries {@code execute}, unless the executor
     * is shut down, in which case the task is discarded.
     */
    public static class DiscardOldestPolicy implements RejectedExecutionHandler {
        /**
         * Creates a {@code DiscardOldestPolicy} for the given executor.
         */
        public DiscardOldestPolicy() { }

        /**
         * Obtains and ignores the next task that the executor
         * would otherwise execute, if one is immediately available,
         * and then retries execution of task r, unless the executor
         * is shut down, in which case task r is instead discarded.
         *
         * @param r the runnable task requested to be executed
         * @param e the executor attempting to execute this task
         */
        public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
            if (!e.isShutdown()) {
                e.getQueue().poll();
                e.execute(r);
            }
        }
    }
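As a quick check of what "run by the calling thread" means for CallerRunsPolicy, the sketch below saturates a one-thread pool with a one-slot queue and then submits a third task that records which thread executes it. CallerRunsSketch and threadThatRanOverflowTask are names made up for this example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class CallerRunsSketch {
    /** Returns the name of the thread that ran a task submitted to a saturated pool. */
    public static String threadThatRanOverflowTask() {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        AtomicReference<String> ranOn = new AtomicReference<>();
        try {
            pool.execute(blocker); // occupies the only worker
            pool.execute(blocker); // fills the single queue slot
            // Pool and queue are both full, so this task is rejected and
            // CallerRunsPolicy runs it synchronously on the submitting thread.
            pool.execute(() -> ranOn.set(Thread.currentThread().getName()));
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return ranOn.get();
    }

    public static void main(String[] args) {
        System.out.println("caller thread : " + Thread.currentThread().getName());
        System.out.println("overflow ran on: " + threadThatRanOverflowTask());
    }
}
```

A side effect worth knowing is that the submitter is busy running the task and cannot submit more work in the meantime, which acts as a simple form of back-pressure.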
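Design of the Worker class
Worker threads are not plain Thread objects. Instead, the Worker class wraps a Thread, mainly to get finer control over interruption.
Worker extends AbstractQueuedSynchronizer directly for its synchronization. It implements a non-reentrant mutual-exclusion lock: state 0 means unlocked and state 1 means locked (the initial state is -1, which suppresses interrupts until runWorker starts). A task must run under the protection of the lock to avoid synchronization problems. Consequently, a locked Worker is running a task and an unlocked Worker is "idle". A Worker that stays idle longer than keepAliveTime may be reclaimed.
Worker also implements Runnable, and the thread that runs it is the Thread object held by the Worker. The Worker constructor shows that the Worker passes itself to the Thread when the Thread is created.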
    private final class Worker
        extends AbstractQueuedSynchronizer
        implements Runnable
    {
        /** Thread this worker is running in.  Null if factory fails. */
        final Thread thread;
        /** Initial task to run.  Possibly null. */
        Runnable firstTask;
        /** Per-thread task counter */
        volatile long completedTasks;

        Worker(Runnable firstTask) {
            setState(-1); // inhibit interrupts until runWorker
            this.firstTask = firstTask;
            // Pass this Worker as the Runnable of the newly created Thread
            this.thread = getThreadFactory().newThread(this);
        }

        public void run() {
            runWorker(this);
        }

        // Lock methods
        //
        // The value 0 represents the unlocked state.
        // The value 1 represents the locked state.

        protected boolean isHeldExclusively() {
            return getState() != 0;
        }

        protected boolean tryAcquire(int unused) {
            if (compareAndSetState(0, 1)) {
                setExclusiveOwnerThread(Thread.currentThread());
                return true;
            }
            return false;
        }

        protected boolean tryRelease(int unused) {
            setExclusiveOwnerThread(null);
            setState(0);
            return true;
        }

        public void lock()        { acquire(1); }
        public boolean tryLock()  { return tryAcquire(1); }
        public void unlock()      { release(1); }
        public boolean isLocked() { return isHeldExclusively(); }

        void interruptIfStarted() {
            Thread t;
            if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
                try {
                    t.interrupt();
                } catch (SecurityException ignore) {
                }
            }
        }
    }
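The non-reentrancy is the important part: it is what lets tryLock() fail while a task is running, even if called from the worker's own thread. The sketch below isolates the lock methods of Worker in a standalone class (NonReentrantLockSketch is a name invented here) and contrasts it with ReentrantLock.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
import java.util.concurrent.locks.ReentrantLock;

public class NonReentrantLockSketch extends AbstractQueuedSynchronizer {
    // State 0 = unlocked, state 1 = locked, mirroring Worker.
    @Override
    protected boolean tryAcquire(int unused) {
        if (compareAndSetState(0, 1)) {
            setExclusiveOwnerThread(Thread.currentThread());
            return true;
        }
        return false; // fails even if the current thread already holds the lock
    }

    @Override
    protected boolean tryRelease(int unused) {
        setExclusiveOwnerThread(null);
        setState(0);
        return true;
    }

    @Override
    protected boolean isHeldExclusively() {
        return getState() != 0;
    }

    public boolean tryLock()  { return tryAcquire(1); }
    public void unlock()      { release(1); }
    public boolean isLocked() { return isHeldExclusively(); }

    public static void main(String[] args) {
        NonReentrantLockSketch lock = new NonReentrantLockSketch();
        System.out.println("first tryLock : " + lock.tryLock());
        System.out.println("second tryLock: " + lock.tryLock()); // same thread, still refused
        lock.unlock();

        ReentrantLock reentrant = new ReentrantLock();
        System.out.println("reentrant 1st : " + reentrant.tryLock());
        System.out.println("reentrant 2nd : " + reentrant.tryLock()); // same thread may re-enter
    }
}
```

With this lock, "can I acquire it?" is equivalent to "is the worker idle?", which is exactly the test that interruptIdleWorkers relies on.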
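When a Worker is run by its thread, its run method calls runWorker on ThreadPoolExecutor.
1. wt refers to the thread currently executing the Worker's run method, which is the worker thread held by the Worker.
2. task refers to the Worker's firstTask, the task to run first.
3. When task is not null, or a new task has been taken from the work queue, the worker first acquires its lock with w.lock() to mark itself as busy. Before actually calling task.run(), it checks whether the pool has reached the STOP state. If so, it makes sure the worker thread is interrupted; if not, it clears any pending interrupt.
4. Once the pool is known not to be in STOP and the worker thread is not interrupted, task.run() is called. Worker.interruptIfStarted can interrupt the worker thread, and thereby the running task, provided the task responds to interruption.
5. beforeExecute(wt, task) and afterExecute(task, thrown) are two hook methods that allow extension right before and right after a task runs.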
    final void runWorker(Worker w) {
        Thread wt = Thread.currentThread();
        Runnable task = w.firstTask;
        w.firstTask = null;
        w.unlock(); // allow interrupts
        boolean completedAbruptly = true;
        try {
            while (task != null || (task = getTask()) != null) {
                w.lock();
                // If pool is stopping, ensure thread is interrupted;
                // if not, ensure thread is not interrupted.  This
                // requires a recheck in second case to deal with
                // shutdownNow race while clearing interrupt
                if ((runStateAtLeast(ctl.get(), STOP) ||
                     (Thread.interrupted() &&
                      runStateAtLeast(ctl.get(), STOP))) &&
                    !wt.isInterrupted())
                    wt.interrupt();
                try {
                    beforeExecute(wt, task);
                    Throwable thrown = null;
                    try {
                        task.run();
                    } catch (RuntimeException x) {
                        thrown = x; throw x;
                    } catch (Error x) {
                        thrown = x; throw x;
                    } catch (Throwable x) {
                        thrown = x; throw new Error(x);
                    } finally {
                        afterExecute(task, thrown);
                    }
                } finally {
                    task = null;
                    w.completedTasks++;
                    w.unlock();
                }
            }
            completedAbruptly = false;
        } finally {
            processWorkerExit(w, completedAbruptly);
        }
    }
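The two hooks are protected methods, so using them means subclassing. The sketch below records the order in which beforeExecute, the task body, and afterExecute run on a single-thread pool; HookedPoolSketch, its events list, and runOne are names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class HookedPoolSketch extends ThreadPoolExecutor {
    final List<String> events = Collections.synchronizedList(new ArrayList<String>());

    public HookedPoolSketch() {
        super(1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        events.add("before"); // runs on the worker thread, just before task.run()
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        // t is the exception thrown by task.run(), or null on normal completion
        events.add(t == null ? "after" : "after:" + t.getMessage());
    }

    /** Runs one task and returns the recorded event order as a string. */
    public static String runOne() {
        HookedPoolSketch pool = new HookedPoolSketch();
        pool.execute(() -> pool.events.add("run"));
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return pool.events.toString();
    }

    public static void main(String[] args) {
        System.out.println(runOne());
    }
}
```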
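Source code for creating and reclaiming Workers
Start with the execute method of ThreadPoolExecutor, which is the entry point for task submission. Its logic matches the basic creation policy described earlier:
1. When the number of worker threads is below corePoolSize, addWorker(command, true) creates a new worker thread to handle the new task, and the task is not enqueued.
2. When the number of worker threads is at least corePoolSize, the task is enqueued first, using the offer method of BlockingQueue. If the worker count turns out to be 0 at that point, addWorker(null, false) adds a new worker thread.
3. When the work queue is full and the number of worker threads is between corePoolSize and maximumPoolSize, a new worker thread is created to run the new task. When the number of worker threads has reached maximumPoolSize, the task is rejected.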
    public void execute(Runnable command) {
        if (command == null)
            throw new NullPointerException();
        int c = ctl.get();
        if (workerCountOf(c) < corePoolSize) {
            if (addWorker(command, true))
                return;
            c = ctl.get();
        }
        if (isRunning(c) && workQueue.offer(command)) {
            int recheck = ctl.get();
            if (! isRunning(recheck) && remove(command))
                reject(command);
            else if (workerCountOf(recheck) == 0)
                addWorker(null, false);
        }
        else if (!addWorker(command, false))
            reject(command);
    }
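As the code shows, addWorker is where Worker threads are created.
1. The retry loop checks the pool state and the bounds on the current worker count. If creating a worker thread is allowed, it first updates the worker count stored in ctl.
2. Adding the worker to the workers set must happen while holding mainLock. Access to the workers set and its related bookkeeping is always guarded by mainLock.
3. w = new Worker(firstTask); creates the Worker instance with firstTask as its current task. A null firstTask means only the Worker thread is created, and it then fetches tasks from the work queue.
4. The newly created Worker instance is added to the workers set and the related statistics are updated.
5. Once the Worker has been added to the set, it is started by calling start() on the Thread that the Worker wraps. As noted earlier, the Runnable of that Thread is the Worker itself, so the thread calls the Worker's run method, which calls runWorker on ThreadPoolExecutor, and runWorker is where tasks are actually executed.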
    private boolean addWorker(Runnable firstTask, boolean core) {
        retry:
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN &&
                ! (rs == SHUTDOWN &&
                   firstTask == null &&
                   ! workQueue.isEmpty()))
                return false;

            for (;;) {
                int wc = workerCountOf(c);
                if (wc >= CAPACITY ||
                    wc >= (core ? corePoolSize : maximumPoolSize))
                    return false;
                if (compareAndIncrementWorkerCount(c))
                    break retry;
                c = ctl.get();  // Re-read ctl
                if (runStateOf(c) != rs)
                    continue retry;
                // else CAS failed due to workerCount change; retry inner loop
            }
        }

        boolean workerStarted = false;
        boolean workerAdded = false;
        Worker w = null;
        try {
            final ReentrantLock mainLock = this.mainLock;
            w = new Worker(firstTask);
            final Thread t = w.thread;
            if (t != null) {
                mainLock.lock();
                try {
                    // Recheck while holding lock.
                    // Back out on ThreadFactory failure or if
                    // shut down before lock acquired.
                    int c = ctl.get();
                    int rs = runStateOf(c);

                    if (rs < SHUTDOWN ||
                        (rs == SHUTDOWN && firstTask == null)) {
                        if (t.isAlive()) // precheck that t is startable
                            throw new IllegalThreadStateException();
                        workers.add(w);
                        int s = workers.size();
                        if (s > largestPoolSize)
                            largestPoolSize = s;
                        workerAdded = true;
                    }
                } finally {
                    mainLock.unlock();
                }
                if (workerAdded) {
                    t.start();
                    workerStarted = true;
                }
            }
        } finally {
            if (! workerStarted)
                addWorkerFailed(w);
        }
        return workerStarted;
    }
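The first half of addWorker reserves a slot with a CAS loop before any thread is created, which keeps the worker count within its bound even when many threads submit tasks at once. The following is a simplified sketch of only that idea, using a plain AtomicInteger counter instead of the packed ctl value; SlotReservationSketch, tryReserve, and race are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SlotReservationSketch {
    private final AtomicInteger workerCount = new AtomicInteger();
    private final int limit;

    SlotReservationSketch(int limit) {
        this.limit = limit;
    }

    /** Reserves one slot if the limit has not been reached, without taking a lock. */
    boolean tryReserve() {
        for (;;) {
            int wc = workerCount.get();
            if (wc >= limit)
                return false;                          // bound reached: give up
            if (workerCount.compareAndSet(wc, wc + 1))
                return true;                           // slot reserved
            // CAS lost a race with another thread: re-read and retry
        }
    }

    /** Lets several threads compete for slots and returns how many reservations succeeded. */
    public static int race(int limit, int threads, int attemptsPerThread) {
        SlotReservationSketch slots = new SlotReservationSketch(limit);
        AtomicInteger granted = new AtomicInteger();
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int k = 0; k < attemptsPerThread; k++) {
                    if (slots.tryReserve())
                        granted.incrementAndGet();
                }
            });
            ts[i].start();
        }
        for (Thread t : ts) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return granted.get();
    }

    public static void main(String[] args) {
        System.out.println("granted = " + race(100, 8, 1000));
    }
}
```

The real method additionally re-checks the run state after a failed CAS, which is why it needs the labeled outer loop.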
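Worker threads are reclaimed in processWorkerExit(), which is called when runWorker finishes. As mentioned earlier, an idle worker thread may be reclaimed after keepAliveTime has elapsed. That logic is implicit in runWorker and getTask and is explained below, in the section on fetching tasks from the work queue. processWorkerExit itself only handles the cleanup of a worker thread that is exiting.
1. Read together with runWorker: if task.run() throws an exception, completedAbruptly is true, and the worker count has not yet been adjusted, so it is decremented here. In every case the worker is removed from the workers set.
2. If completedAbruptly is true and the pool state is below STOP (RUNNING or SHUTDOWN), a new Worker thread is created as a replacement.
3. If completedAbruptly is false and the pool state is below STOP, the method first checks allowCoreThreadTimeOut. If it is set, the minimum number of threads can be 0; otherwise it is corePoolSize. If the minimum is 0 but the work queue is not empty, the minimum becomes 1. Finally the current worker count is checked, and a new worker thread is created if the count is below the minimum.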
    private void processWorkerExit(Worker w, boolean completedAbruptly) {
        if (completedAbruptly) // If abrupt, then workerCount wasn't adjusted
            decrementWorkerCount();

        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            completedTaskCount += w.completedTasks;
            workers.remove(w);
        } finally {
            mainLock.unlock();
        }

        tryTerminate();

        int c = ctl.get();
        if (runStateLessThan(c, STOP)) {
            if (!completedAbruptly) {
                int min = allowCoreThreadTimeOut ? 0 : corePoolSize;
                if (min == 0 && ! workQueue.isEmpty())
                    min = 1;
                if (workerCountOf(c) >= min)
                    return; // replacement not needed
            }
            addWorker(null, false);
        }
    }
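The replacement behaviour can be observed from outside: a task submitted with execute() that throws takes its worker thread down with it, and the next task runs on a different thread. The sketch below assumes a single-thread pool and a thread factory that silences the uncaught-exception output; WorkerReplacementSketch and replacedAfterFailure are names invented for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

public class WorkerReplacementSketch {
    /** Returns true if the task after a failing task ran on a different thread. */
    public static boolean replacedAfterFailure() {
        // Swallow the uncaught exception so the demo does not print a stack trace.
        ThreadFactory quiet = r -> {
            Thread t = new Thread(r);
            t.setUncaughtExceptionHandler((thread, ex) -> { });
            return t;
        };
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>(), quiet);
        AtomicReference<Thread> first = new AtomicReference<>();
        AtomicReference<Thread> second = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        pool.execute(() -> {
            first.set(Thread.currentThread());
            throw new IllegalStateException("boom"); // the worker exits abruptly
        });
        pool.execute(() -> {
            second.set(Thread.currentThread());
            done.countDown();
        });
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return first.get() != null && second.get() != null && first.get() != second.get();
    }

    public static void main(String[] args) {
        System.out.println("worker replaced: " + replacedAfterFailure());
    }
}
```

Note that this applies to execute(); tasks passed to submit() are wrapped in a FutureTask that captures the exception, so the worker survives.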
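Fetching tasks
The code with which a worker thread takes a task from the work queue is in getTask.
1. The timed variable says whether the wait should be timed. If no task arrives within keepAliveTime, getTask returns null. From runWorker we know that when getTask returns null the Worker thread exits and is reclaimed, and this is how idle worker threads are reclaimed. timed is true when allowCoreThreadTimeOut is true or when the number of worker threads is greater than corePoolSize.
2. If timed is true, the task is taken from the head of the queue with the timed BlockingQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS); otherwise it is taken with take(), which blocks until a task is available.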
    private Runnable getTask() {
        boolean timedOut = false; // Did the last poll() time out?

        retry:
        for (;;) {
            int c = ctl.get();
            int rs = runStateOf(c);

            // Check if queue empty only if necessary.
            if (rs >= SHUTDOWN && (rs >= STOP || workQueue.isEmpty())) {
                decrementWorkerCount();
                return null;
            }

            boolean timed;      // Are workers subject to culling?

            for (;;) {
                int wc = workerCountOf(c);
                timed = allowCoreThreadTimeOut || wc > corePoolSize;

                if (wc <= maximumPoolSize && ! (timedOut && timed))
                    break;
                if (compareAndDecrementWorkerCount(c))
                    return null;
                c = ctl.get();  // Re-read ctl
                if (runStateOf(c) != rs)
                    continue retry;
                // else CAS failed due to workerCount change; retry inner loop
            }

            try {
                Runnable r = timed ?
                    workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS) :
                    workQueue.take();
                if (r != null)
                    return r;
                timedOut = true;
            } catch (InterruptedException retry) {
                timedOut = false;
            }
        }
    }
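The timed poll is what makes idle threads disappear. A rough way to see it from the outside is to enable allowCoreThreadTimeOut with a short keepAliveTime and watch getPoolSize() drop to 0 once the work is done. The 50 ms keep-alive and the 5 second polling deadline below are arbitrary values picked for this sketch, and IdleTimeoutSketch is a made-up name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class IdleTimeoutSketch {
    /** Returns {pool size while busy, pool size after the workers have idled out}. */
    public static int[] poolSizeBusyAndIdle() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 2, 50L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true); // core threads now use the timed poll as well

        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < 2; i++) {
            pool.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int busy = pool.getPoolSize();
        release.countDown();

        // Wait (with a generous deadline) for both idle workers to time out in getTask().
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
        try {
            while (pool.getPoolSize() > 0 && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int idle = pool.getPoolSize();
        pool.shutdown();
        return new int[] { busy, idle };
    }

    public static void main(String[] args) {
        int[] sizes = poolSizeBusyAndIdle();
        System.out.println("busy = " + sizes[0] + ", after idling = " + sizes[1]);
    }
}
```

Without allowCoreThreadTimeOut(true), the two core threads would block in take() indefinitely and the pool size would stay at 2.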
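Shutting down the pool
The states SHUTDOWN, STOP, TIDYING, and TERMINATED are all related to shutting the pool down.
Shutdown usually comes in two flavours: graceful shutdown and forced immediate shutdown.
Graceful shutdown means calling shutdown(). The pool enters the SHUTDOWN state, stops accepting new tasks, and finishes the tasks already in the work queue before it ends.
Forced immediate shutdown means calling shutdownNow(). The pool goes straight to the STOP state, interrupts the worker threads that are running, and empties the work queue.
1. shutdown first sets the pool state to SHUTDOWN, then interrupts the idle worker threads, then calls the onShutdown hook, and finally calls tryTerminate().
2. shutdownNow first sets the pool state to STOP, then interrupts all worker threads, then drains the work queue, and finally calls tryTerminate(). The tasks that were still in the work queue are returned to the caller.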
    public void shutdown() {
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            checkShutdownAccess();
            advanceRunState(SHUTDOWN);
            interruptIdleWorkers();
            onShutdown(); // hook for ScheduledThreadPoolExecutor
        } finally {
            mainLock.unlock();
        }
        tryTerminate();
    }

    public List<Runnable> shutdownNow() {
        List<Runnable> tasks;
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            checkShutdownAccess();
            advanceRunState(STOP);
            interruptWorkers();
            tasks = drainQueue();
        } finally {
            mainLock.unlock();
        }
        tryTerminate();
        return tasks;
    }
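interruptIdleWorkers interrupts the idle worker threads. An idle worker is one whose lock is not held, which is why the method uses w.tryLock() as the test. interruptWorkers, in contrast, interrupts every Worker by calling Worker.interruptIfStarted().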
    private void interruptIdleWorkers(boolean onlyOne) {
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            for (Worker w : workers) {
                Thread t = w.thread;
                if (!t.isInterrupted() && w.tryLock()) {
                    try {
                        t.interrupt();
                    } catch (SecurityException ignore) {
                    } finally {
                        w.unlock();
                    }
                }
                if (onlyOne)
                    break;
            }
        } finally {
            mainLock.unlock();
        }
    }

    private void interruptWorkers() {
        final ReentrantLock mainLock = this.mainLock;
        mainLock.lock();
        try {
            for (Worker w : workers)
                w.interruptIfStarted();
        } finally {
            mainLock.unlock();
        }
    }

    // Method of Worker, repeated from the Worker listing
    void interruptIfStarted() {
        Thread t;
        if (getState() >= 0 && (t = thread) != null && !t.isInterrupted()) {
            try {
                t.interrupt();
            } catch (SecurityException ignore) {
            }
        }
    }
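The difference between the two shutdown styles shows up clearly on a single-thread pool. In the sketch below, graceful() queues three tasks and calls shutdown(), while forced() starts one task that blocks, queues two more, and calls shutdownNow(). ShutdownSketch and its helper names are assumptions of this example.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class ShutdownSketch {
    private static ThreadPoolExecutor singleThreadPool() {
        return new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    private static void awaitEnd(ThreadPoolExecutor pool) {
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** shutdown(): tasks submitted before the call still run. Returns how many completed. */
    public static int graceful() {
        ThreadPoolExecutor pool = singleThreadPool();
        AtomicInteger ran = new AtomicInteger();
        for (int i = 0; i < 3; i++) {
            pool.execute(ran::incrementAndGet);
        }
        pool.shutdown();
        awaitEnd(pool);
        return ran.get();
    }

    /** shutdownNow(): returns {queued tasks handed back, 1 if the running task was interrupted}. */
    public static int[] forced() {
        ThreadPoolExecutor pool = singleThreadPool();
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean interrupted = new AtomicBoolean();
        pool.execute(() -> {
            started.countDown();
            try {
                new CountDownLatch(1).await(); // blocks until the thread is interrupted
            } catch (InterruptedException e) {
                interrupted.set(true);
            }
        });
        pool.execute(() -> { }); // these two stay in the queue
        pool.execute(() -> { });
        try {
            started.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        List<Runnable> pending = pool.shutdownNow();
        awaitEnd(pool);
        return new int[] { pending.size(), interrupted.get() ? 1 : 0 };
    }

    public static void main(String[] args) {
        System.out.println("graceful: completed = " + graceful());
        int[] r = forced();
        System.out.println("forced: handed back = " + r[0] + ", interrupted = " + (r[1] == 1));
    }
}
```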
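tryTerminate attempts to terminate the pool. It returns immediately if the pool is still RUNNING, has already reached TIDYING, or is in SHUTDOWN with a non-empty work queue. If workers remain, it interrupts one idle worker so that the shutdown signal propagates, and returns. Otherwise it moves the state to TIDYING, calls the terminated() hook, sets the state to TERMINATED, and wakes up the threads waiting on the termination condition.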
    final void tryTerminate() {
        for (;;) {
            int c = ctl.get();
            if (isRunning(c) ||
                runStateAtLeast(c, TIDYING) ||
                (runStateOf(c) == SHUTDOWN && ! workQueue.isEmpty()))
                return;
            if (workerCountOf(c) != 0) { // Eligible to terminate
                interruptIdleWorkers(ONLY_ONE);
                return;
            }

            final ReentrantLock mainLock = this.mainLock;
            mainLock.lock();
            try {
                if (ctl.compareAndSet(c, ctlOf(TIDYING, 0))) {
                    try {
                        terminated();
                    } finally {
                        ctl.set(ctlOf(TERMINATED, 0));
                        termination.signalAll();
                    }
                    return;
                }
            } finally {
                mainLock.unlock();
            }
            // else retry on failed CAS
        }
    }
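The ordering in tryTerminate (TIDYING, then terminated(), then TERMINATED) can be checked by overriding the hook and asking the pool about its own state from inside it. The sketch below is a minimal illustration; TerminatedHookSketch and lifecycle are names made up here.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TerminatedHookSketch {
    /** Records isTerminated() inside the terminated() hook and after awaitTermination(). */
    public static String lifecycle() {
        StringBuffer log = new StringBuffer(); // thread-safe: written by worker and caller
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>()) {
            @Override
            protected void terminated() {
                // Called while the pool is TIDYING, so it is not yet reported as terminated.
                log.append("hook:isTerminated=").append(isTerminated());
                super.terminated();
            }
        };
        pool.execute(() -> { });
        pool.shutdown();
        try {
            // Returns once termination.signalAll() has fired and the state is TERMINATED.
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        log.append(" after:isTerminated=").append(pool.isTerminated());
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(lifecycle());
    }
}
```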
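One last note: the JVM exits only after all non-daemon threads have finished, and the threads created by the default thread factory are non-daemon. When using ThreadPoolExecutor, if some task runs forever and does not respond to interruption, it keeps occupying a thread, so the JVM keeps running and does not exit.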
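If pool threads should not keep the JVM alive, one option is to supply a ThreadFactory that marks its threads as daemon. This is a sketch of that option rather than a general recommendation, since daemon threads are abandoned mid-task when the JVM exits; DaemonPoolSketch and its helper methods are hypothetical names.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class DaemonPoolSketch {
    /** A thread factory whose threads do not prevent JVM exit. */
    static ThreadFactory daemonFactory() {
        return r -> {
            Thread t = new Thread(r, "daemon-worker");
            t.setDaemon(true); // must be set before the thread is started
            return t;
        };
    }

    /** Runs one task on the pool and reports whether its worker thread is a daemon. */
    static boolean workerIsDaemon(ThreadPoolExecutor pool) {
        AtomicBoolean daemon = new AtomicBoolean();
        pool.execute(() -> daemon.set(Thread.currentThread().isDaemon()));
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return daemon.get();
    }

    public static boolean defaultFactoryIsDaemon() {
        return workerIsDaemon(new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>()));
    }

    public static boolean daemonFactoryIsDaemon() {
        return workerIsDaemon(new ThreadPoolExecutor(1, 1, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(), daemonFactory()));
    }

    public static void main(String[] args) {
        System.out.println("default factory daemon: " + defaultFactoryIsDaemon());
        System.out.println("custom factory daemon : " + daemonFactoryIsDaemon());
    }
}
```

In most applications the cleaner fix is still to call shutdown() or shutdownNow() and to write tasks that respond to interruption.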