• 转 JVM jstack命令用法及相关知识小结


    https://www.jianshu.com/p/99562e9391b5

    前言

    很久没有写过与JVM相关的文章了。今天搬砖有点累,不太想啃源码,写一篇实用性比较强的吧。

    在日常工作中,我们如果遇到JVM方面的问题,一般是采用各种现成的工具辅助定位解决,如VisualVM、JProfiler、Eclipse MAT、Arthas等。但是,我们也有必要了解JVM原生提供的那些命令行工具,本文就针对常用的jstack命令做个简单的总结,顺便聊一些与Java线程与并发相关的周边知识。今后有时间的话,也会将jmap、jhat、jstat等命令的用法总结出来。

    jstack简介

    jstack命令用来生成JVM中的线程快照(thread dump),其中包含有每个线程的方法调用栈以及其状态、锁信息等。其用法说明如下所示。

    ~ jstack -h
    Usage:
        jstack [-l] <pid>
            (to connect to running process)
        jstack -F [-m] [-l] <pid>
            (to connect to a hung process)
        jstack [-m] [-l] <executable> <core>
            (to connect to a core file)
        jstack [-m] [-l] [server_id@]<remote server IP or hostname>
            (to connect to a remote debug server)
    
    Options:
        -F  to force a thread dump. Use when jstack <pid> does not respond (process is hung)
        -m  to print both java and native frames (mixed mode)
        -l  long listing. Prints additional information about locks
        -h or -help to print this help message
    

    说明一下三个参数的含义:

    • -F:如果正常执行jstack命令没有响应(比如进程hung住了),可以加上此参数强制执行thread dump。
    • -m:除了打印Java的方法调用栈之外,还会输出native方法的栈帧。
    • -l:打印与锁有关的附加信息。使用此参数会导致JVM停止时间变长,在生产环境需慎用。

    jstack是在线程级别定位JVM问题的利器,但前提是得读懂thread dump,我们举例说明。

    线程快照

    下面是对一个正常运行的Spark Streaming作业执行jstack命令产生的线程快照的一部分。

    ~ jstack 18747
    Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.172-b11 mixed mode):
    
    "SparkUI-1510250" #1510250 daemon prio=5 os_prio=0 tid=0x00007f9a6c00c800 nid=0x45cc waiting on condition [0x00007f9ce86e7000]
       java.lang.Thread.State: TIMED_WAITING (parking)
            at sun.misc.Unsafe.park(Native Method)
            - parking to wait for  <0x00000000c0420db8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
            at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
            at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
            at org.spark_project.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:392)
            at org.spark_project.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:563)
            at org.spark_project.jetty.util.thread.QueuedThreadPool.access$800(QueuedThreadPool.java:48)
            at org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
            at java.lang.Thread.run(Thread.java:748)
    
    "shuffle-server-6-7" #190 daemon prio=5 os_prio=0 tid=0x00007f9b44009000 nid=0x4d80 runnable [0x00007f9ce99f8000]
       java.lang.Thread.State: RUNNABLE
            at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
            at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
            at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
            at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
            - locked <0x00000000c58e6498> (a io.netty.channel.nio.SelectedSelectionKeySet)
            - locked <0x00000000c59c1528> (a java.util.Collections$UnmodifiableSet)
            - locked <0x00000000c59c1450> (a sun.nio.ch.EPollSelectorImpl)
            at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
            at io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:62)
            at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:753)
            at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:409)
            at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
            at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
            at java.lang.Thread.run(Thread.java:748)
    
    "SparkUI-84-acceptor-3@3b331d23-ServerConnector@9826a7d{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}" #84 daemon prio=3 os_prio=0 tid=0x00007f9d7decc800 nid=0x4500 waiting for monitor entry [0x00007f9d112c8000]
       java.lang.Thread.State: BLOCKED (on object monitor)
            at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:234)
            - waiting to lock <0x00000000c045e868> (a java.lang.Object)
            at org.spark_project.jetty.server.ServerConnector.accept(ServerConnector.java:373)
            at org.spark_project.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:593)
            at org.spark_project.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
            at org.spark_project.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
            at java.lang.Thread.run(Thread.java:748)
    
    "org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" #25 daemon prio=5 os_prio=0 tid=0x00007f9d7d8da000 nid=0x44c2 in Object.wait() [0x00007f9d19dc0000]
       java.lang.Thread.State: WAITING (on object monitor)
            at java.lang.Object.wait(Native Method)
            at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
            - locked <0x00000000c0031c98> (a java.lang.ref.ReferenceQueue$Lock)
            at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
            at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3213)
            at java.lang.Thread.run(Thread.java:748)
    
    "RecurringTimer - JobGenerator" #120 daemon prio=5 os_prio=0 tid=0x00007f9b04045000 nid=0x4cc3 waiting on condition [0x00007f9d10cd4000]
       java.lang.Thread.State: TIMED_WAITING (sleeping)
            at java.lang.Thread.sleep(Native Method)
            at org.apache.spark.util.SystemClock.waitTillTime(Clock.scala:63)
            at org.apache.spark.streaming.util.RecurringTimer.triggerActionForNextInterval(RecurringTimer.scala:93)
            at org.apache.spark.streaming.util.RecurringTimer.org$apache$spark$streaming$util$RecurringTimer$$loop(RecurringTimer.scala:106)
            at org.apache.spark.streaming.util.RecurringTimer$$anon$1.run(RecurringTimer.scala:29)
    

    在每个线程的快照的第一行,包含有线程名、是否为守护线程、优先级、线程ID等信息,第二行则是线程状态,下面就是方法调用栈了。下图是Java线程状态转换的示意,老生常谈。

     
     

    jstack线程快照中的状态与图示相同,只是没有NEW状态而已。我们逐一进行分析,在分析之前,先放出Java管程对象ObjectMonitor的简图。看官也可以通过我之前写的这篇文章来了解管程。

     
     
    RUNNABLE

    线程正在运行。如果在其调用栈中看到locked <地址>的提示,表示该方法正持有锁,即该线程位于Owner区内。

    BLOCKED

    线程处于阻塞状态,即正在等待锁被其他线程释放。在其调用栈的栈顶方法会看到waiting to lock <地址>的提示,表示该方法试图持有锁,线程正在Entry Set区等待。

    WAITING

    线程处于无限等待的状态。又分为两种情况:

    • on object monitor:线程已经获得锁,调用了不带超时参数的Object.wait()/Thread.join()方法,线程进入管程的Wait Set区。在其调用栈中会看到locked <地址>的提示。
    • parking:调用了LockSupport.park()方法,线程直接进入挂起状态(park是Unsafe提供的低级原语)。在其调用栈的栈顶方法会看到parking to wait for <地址>的提示。
    TIMED_WAITING

    线程处于有限等待的状态。它分为三种情况,除了与WAITING相同的on object monitor(获得锁并调用带超时的Object.wait()/Thread.join()方法)和parking(调用带超时的LockSupport.parkNanos()/parkUntil()方法)之外,还有一种sleep,即通过Thread.sleep()使线程睡眠。

    通过分析线程快照的状态和调用栈,可以让我们快速地定位造成Java程序表现异常的症结,如死锁、热锁(很多线程竞争同一块临界区造成大量BLOCKED)、高CPU占用、I/O长时间阻塞(注意此时线程状态可能是RUNNABLE)等。下面举两个具体的例子。

    用jstack诊断死锁

    死锁(deadlock)是操作系统理论中的基础概念,即在并发环境下,一个或多个线程在等待资源,但该资源又被其他进程所占用的困局。死锁的四个必要条件是:

    • 互斥(mutual exclusion)
    • 不可抢占(no preemption)
    • 持有并等待(hold and wait)
    • 循环等待(circular wait)

    下面我们用Java造一个死锁,并用jstack来诊断它。

    public class DeadlockExample {
      private static final Object lock1 = new Object();
      private static final Object lock2 = new Object();
    
      public static void main(String[] args) throws Exception {
        new Thread(() -> {
          for (int i = 0; i < 100; i++) {
            synchronized (lock1) {
              System.out.println("thread1 synchronized lock1");
              synchronized (lock2) {
                System.out.println("thread1 synchronized lock2");
              }
            }
          }
        }, "thread1").start();
    
        new Thread(() -> {
          for (int i = 0; i < 100; i++) {
            synchronized (lock2) {
              System.out.println("thread2 synchronized lock2");
              synchronized (lock1) {
                System.out.println("thread2 synchronized lock1");
              }
            }
          }
        }, "thread2").start();
      }
    }
    

    运行之,发现只输出了几句就停止了。

    thread1 synchronized lock1
    thread1 synchronized lock2
    thread1 synchronized lock1
    thread1 synchronized lock2
    thread1 synchronized lock1
    thread2 synchronized lock2
    

    用jstack打印线程快照,节选部分如下。

    "thread2" #20 prio=5 os_prio=31 tid=0x00007fad74020800 nid=0x6203 waiting for monitor entry [0x0000700006364000]
       java.lang.Thread.State: BLOCKED (on object monitor)
        at me.lmagics.DeadlockExample.lambda$main$1(DeadlockExample.java:28)
        - waiting to lock <0x00000007157d2a58> (a java.lang.Object)
        - locked <0x00000007157d2a68> (a java.lang.Object)
        at me.lmagics.DeadlockExample$$Lambda$2/501263526.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:748)
    
    "thread1" #19 prio=5 os_prio=31 tid=0x00007fad7401c000 nid=0x9903 waiting for monitor entry [0x0000700006261000]
       java.lang.Thread.State: BLOCKED (on object monitor)
        at me.lmagics.DeadlockExample.lambda$main$0(DeadlockExample.java:17)
        - waiting to lock <0x00000007157d2a68> (a java.lang.Object)
        - locked <0x00000007157d2a58> (a java.lang.Object)
        at me.lmagics.DeadlockExample$$Lambda$1/1394438858.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:748)
    
    Found one Java-level deadlock:
    =============================
    "thread2":
      waiting to lock monitor 0x00007fad65004168 (object 0x00000007157d2a58, a java.lang.Object),
      which is held by "thread1"
    "thread1":
      waiting to lock monitor 0x00007fad650056b8 (object 0x00000007157d2a68, a java.lang.Object),
      which is held by "thread2"
    
    Java stack information for the threads listed above:
    ===================================================
    "thread2":
        at me.lmagics.DeadlockExample.lambda$main$1(DeadlockExample.java:28)
        - waiting to lock <0x00000007157d2a58> (a java.lang.Object)
        - locked <0x00000007157d2a68> (a java.lang.Object)
        at me.lmagics.DeadlockExample$$Lambda$2/501263526.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:748)
    "thread1":
        at me.lmagics.DeadlockExample.lambda$main$0(DeadlockExample.java:17)
        - waiting to lock <0x00000007157d2a68> (a java.lang.Object)
        - locked <0x00000007157d2a58> (a java.lang.Object)
        at me.lmagics.DeadlockExample$$Lambda$1/1394438858.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:748)
    
    Found 1 deadlock.
    

    可见,我们不仅能够发现两个线程都处于BLOCKED状态,并且jstack还直接给出了死锁的详细信息,方便我们修改代码消除死锁。

    用jstack诊断高CPU占用

    接下来造一个有死循环的程序,模拟异常的CPU占用。

    public class InfiniteLoopExample {
      private static final Object lock = new Object();
    
      static class InfiniteLoopRunnable implements Runnable {
        @Override
        public void run() {
          synchronized (lock) {
            long l = 0;
            while (true) {
              l++;
            }
          }
        }
      }
    
      public static void main(String[] args) throws Exception {
        new Thread(new InfiniteLoopRunnable(), "thread1").start();
        new Thread(new InfiniteLoopRunnable(), "thread2").start();
      }
    }
    

    运行该程序,用jps命令找出该Java进程的PID,然后利用top -Hp <PID>命令找出吃CPU最多的那个线程。

     
     

    使用jstack导出线程快照到文件中。由于线程ID是十六进制表示的,所以我们要将线程ID转换成十六进制再grep。

     
     

    这下我们就可以定位到异常代码的位置,并进行修改了。

    The End

    晚安晚安。



    作者:LittleMagic
    链接:https://www.jianshu.com/p/99562e9391b5
    来源:简书
    著作权归作者所有。商业转载请联系作者获得授权,非商业转载请注明出处。
  • 相关阅读:
    PHP字符串中的变量解析(+教你如何在PHP字符串中加入变量)
    **【ci框架】PHP的CI框架集成Smarty的最佳方式
    六步实现Rest风格的API
    jfinal想用到中大型项目中的项目经验分享
    ***iOS 项目的目录结构能看出你的开发经验
    非常好!!!Linux源代码阅读——环境准备【转】
    非常好!!!Linux源代码阅读——中断【转】
    linux中断申请之request_threaded_irq【转】
    linux设备驱动归纳总结(六):2.分享中断号【转】
    一些不错的文档网址--笔记【原创】
  • 原文地址:https://www.cnblogs.com/python-xiakaibi/p/13411662.html
Copyright © 2020-2023  润新知