• 记一次CountDownLatch引发的问题


    背景:使用countDownLatch,10个线程计算结果,最后通过主线程来汇总结果;

    有个spring scheduler的程序,运行一段时间后,不运行了,查看日志发现CountDownLatch无法归零,导致整个主线程Hang在那里,此时如果调用其外部接口还能响应:

    怀疑是不是死锁了? 查一下jstack发现没有出现死锁:

    "Attach Listener" daemon prio=10 tid=0x00007fc1b4001000 nid=0x162e4 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "Abandoned connection cleanup thread" daemon prio=10 tid=0x00007fc1801e5800 nid=0x1e632 in Object.wait() [0x00007fc1ec79a000]
       java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x0000000703f97058> (a java.lang.ref.ReferenceQueue$Lock)
        at com.mysql.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:43)
    
    "Tomcat JDBC Pool Cleaner[1485406391:1509516002697]" daemon prio=10 tid=0x00007fc1801b8000 nid=0x1e631 in Object.wait() [0x00007fc1ec89b000]
       java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.util.TimerThread.mainLoop(Timer.java:552)
        - locked <0x000000070406b8d8> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:505)
    
    "http-nio-8119-Acceptor-0" daemon prio=10 tid=0x00007fc19ec27800 nid=0x2d11 runnable [0x00007fc1ba16b000]
       java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
        - locked <0x000000070408f7d8> (a java.lang.Object)
        at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:686)
        at java.lang.Thread.run(Thread.java:745)
    
    "http-nio-8119-ClientPoller-1" daemon prio=10 tid=0x00007fc19f9a2800 nid=0x2d10 runnable [0x00007fc1cc1af000]
       java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x000000070408fbe0> (a sun.nio.ch.Util$2)
        - locked <0x000000070408fbf0> (a java.util.Collections$UnmodifiableSet)
        - locked <0x000000070408fb98> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1049)
        at java.lang.Thread.run(Thread.java:745)
    
    "http-nio-8119-ClientPoller-0" daemon prio=10 tid=0x00007fc19d143000 nid=0x2d0f runnable [0x00007fc1cc2b0000]
       java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x000000070416a268> (a sun.nio.ch.Util$2)
        - locked <0x000000070416a278> (a java.util.Collections$UnmodifiableSet)
        - locked <0x000000070416a220> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1049)
        at java.lang.Thread.run(Thread.java:745)
    
    "NioBlockingSelector.BlockPoller-1" daemon prio=10 tid=0x00007fc19e89b000 nid=0x2d0e runnable [0x00007fc1cc3b1000]
       java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
        - locked <0x0000000703d3c3a0> (a sun.nio.ch.Util$2)
        - locked <0x0000000703d3c390> (a java.util.Collections$UnmodifiableSet)
        - locked <0x0000000703d3c258> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
        at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:342)
    
    "pool-1-thread-1" prio=10 tid=0x00007fc1a0749800 nid=0x2d0b waiting on condition [0x00007fc1cc4b1000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007047385d8> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:236)
        at com.didi.bigdataqa.tools.HiveBaseTool.doHealthyScore(HiveBaseTool.java:139)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
        at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
        at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:81)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    
    "pool-2-thread-1" daemon prio=10 tid=0x00007fc1a0381000 nid=0x2ceb waiting on condition [0x00007fc1cc5b3000]
       java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007037d9338> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
        at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
    
    "cluster-ClusterId{value='59f823ec4d62e12bc9d4a556', description='null'}-localhost:27017" daemon prio=10 tid=0x00007fc1a1734800 nid=0x2cea waiting on condition [0x00007fc1ec497000]
       java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007037c1d30> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
        at com.mongodb.connection.DefaultServerMonitor$ServerMonitorRunnable.waitForSignalOrTimeout(DefaultServerMonitor.java:224)
        at com.mongodb.connection.DefaultServerMonitor$ServerMonitorRunnable.waitForNext(DefaultServerMonitor.java:205)
        at com.mongodb.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:153)
        - locked <0x00000007037c19e8> (a com.mongodb.connection.DefaultServerMonitor$ServerMonitorRunnable)
        at java.lang.Thread.run(Thread.java:745)
    
    "commons-pool-EvictionTimer" daemon prio=10 tid=0x00007fc19ecc5800 nid=0x2cbb in Object.wait() [0x00007fc1ec396000]
       java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.util.TimerThread.mainLoop(Timer.java:552)
        - locked <0x0000000703b48458> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:505)
    
    "container-0" prio=10 tid=0x00007fc19dfd7800 nid=0x2c95 waiting on condition [0x00007fc1ec598000]
       java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.catalina.core.StandardServer.await(StandardServer.java:407)
        at org.springframework.boot.context.embedded.tomcat.TomcatEmbeddedServletContainer$1.run(TomcatEmbeddedServletContainer.java:139)
    
    "ContainerBackgroundProcessor[StandardEngine[Tomcat].StandardHost[localhost].StandardContext[/healthy_score]]" daemon prio=10 tid=0x00007fc199247800 nid=0x2c94 waiting on condition [0x00007fc1ec699000]
       java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1345)
        at java.lang.Thread.run(Thread.java:745)
    
    "DestroyJavaVM" prio=10 tid=0x00007fc208009000 nid=0x2bca waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "Service Thread" daemon prio=10 tid=0x00007fc208244800 nid=0x2bdc runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "C2 CompilerThread1" daemon prio=10 tid=0x00007fc208242000 nid=0x2bdb waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "C2 CompilerThread0" daemon prio=10 tid=0x00007fc20823f000 nid=0x2bda waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "Signal Dispatcher" daemon prio=10 tid=0x00007fc20823d000 nid=0x2bd9 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
    
    "Finalizer" daemon prio=10 tid=0x00007fc208210000 nid=0x2bd7 in Object.wait() [0x00007fc1edb29000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x000000070001e8f0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
    
    "Reference Handler" daemon prio=10 tid=0x00007fc20820e000 nid=0x2bd6 in Object.wait() [0x00007fc1edc2a000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x00000006fff96908> (a java.lang.ref.Reference$Lock)
    
    "VM Thread" prio=10 tid=0x00007fc208209800 nid=0x2bd5 runnable 
    
    "GC task thread#0 (ParallelGC)" prio=10 tid=0x00007fc20801f000 nid=0x2bcb runnable 
    
    "GC task thread#1 (ParallelGC)" prio=10 tid=0x00007fc208021000 nid=0x2bcc runnable 
    
    "GC task thread#2 (ParallelGC)" prio=10 tid=0x00007fc208023000 nid=0x2bcd runnable 
    
    "GC task thread#3 (ParallelGC)" prio=10 tid=0x00007fc208025000 nid=0x2bce runnable 
    
    "GC task thread#4 (ParallelGC)" prio=10 tid=0x00007fc208026800 nid=0x2bcf runnable 
    
    "GC task thread#5 (ParallelGC)" prio=10 tid=0x00007fc208028800 nid=0x2bd0 runnable 
    
    "GC task thread#6 (ParallelGC)" prio=10 tid=0x00007fc20802a800 nid=0x2bd1 runnable 
    
    "GC task thread#7 (ParallelGC)" prio=10 tid=0x00007fc20802c800 nid=0x2bd2 runnable 
    
    "GC task thread#8 (ParallelGC)" prio=10 tid=0x00007fc20802e800 nid=0x2bd3 runnable 
    
    "GC task thread#9 (ParallelGC)" prio=10 tid=0x00007fc208030800 nid=0x2bd4 runnable 
    
    "VM Periodic Task Thread" prio=10 tid=0x00007fc20824f800 nid=0x2bdd waiting on condition 
    
    JNI global references: 151
    View Code

    多次输入jstack观察:

    jstack 输出 nid=0x2d0f  查看这个线程ID,转换为十进制为11535

    查看进程下的线程,下面这几个都是没有释放的线程

     ps -T -p 11209

     

     11209  11534 ?        00:00:15 java

     11209  11535 ?        00:00:17 java

     11209  11536 ?        00:00:17 java

     11209  11537 ?        00:00:00 java

     

    在使用countDownLatch的await时,加入一个超时等待时间,这样就可以避免这个问题,但是问题的根源是什么呢?

    cnt.await(1, TimeUnit.HOURS);

    继续定位,怀疑是哪个地方没有设置超时机制,导致一直等待,或许哪里出现了死循环? jstack打印任务的线程都已经看不到了,只能看到进程
    代码里面没看到死循环的地方,怀疑是GC的问题,导致停机了
    最后再定位GC,发现有个List的超级大,最大的时候有30多w,堆设置的有点小。。。
    果断加大堆内存空间得到解决
  • 相关阅读:
    vuex 数据持久化
    vue中通过第三方代理解决跨域问题
    谷歌浏览器格式化插件
    mongodb安装配置
    Nodejs express中创建ejs项目
    elementui tree 组件实现鼠标移入节点,节点后面显示添加删除按钮
    iframe页面无法跳转问题
    elementui table组件,根据数据的不同,显示不同的内容
    elementui tree组件自定义图标
    Aborting a running program
  • 原文地址:https://www.cnblogs.com/dpains/p/7808864.html
Copyright © 2020-2023  润新知