• Ftrace的部分使用方法


    ftrace主要是用于调试linux kernel调度相关的一个工具,也可用于分析部分kernel性能问题。

    相关ftrace的介绍可以参考:kernel/msm-4.9/Documentation/trace/ftrace.txt 以及同级目录下的其他介绍文件。

    kernel.org的ftrace文档也可以参考(信息较多):https://www.kernel.org/doc/html/latest/trace/ftrace.html

    谷歌官网的ftrace使用教程(请自带梯子):https://source.android.google.cn/devices/tech/debug/ftrace

     IBM developer的blog(介绍每个跟踪器,并且写了如何看ftrace的log,很有用,很有用,很有用):https://www.ibm.com/developerworks/cn/linux/l-cn-ftrace2/index.html

    一、trace_event

    trace event作为ftrace的一部分,它主要通过已有函数插桩的tracepoint,来跟踪相关事件。

    例如,在进程调度关键函数:__schedule()中:

    static void __sched notrace __schedule(bool preempt)
    {
    ...
        if (likely(prev != next)) {
            if (!prev->on_rq)
                prev->last_sleep_ts = wallclock;
    
            update_task_ravg(prev, rq, PUT_PREV_TASK, wallclock, 0);
            update_task_ravg(next, rq, PICK_NEXT_TASK, wallclock, 0);
            rq->nr_switches++;
            rq->curr = next;
            ++*switch_count;
    
            trace_sched_switch(preempt, prev, next);
            rq = context_switch(rq, prev, next, cookie); /* unlocks the rq */
        } else {
            update_task_ravg(prev, rq, TASK_UPDATE, wallclock, 0);
            lockdep_unpin_lock(&rq->lock, cookie);
            raw_spin_unlock_irq(&rq->lock);
        }
    ...
    }

    利用trace event,我们能够监控某些事件,是否发生;或者在某段时间内,发生了什么。

    案例:在dim idle下,耗流偏大约10mA。

    dim idle是屏幕亮着,但亮度最低,系统进入idle状态,而未进入深度休眠的状态。

    1、在抓取了power waveform对比正常的机器,发生耗流波形的bottom与正常的相似。但是耗流bottom很不稳定。

    (异常)

     (正常)

    2、在检查所有外设后,并未发现有明显漏电。

    3、此时,就利用ftrace的trace event,抓取在此情况下,系统到底在运行什么。

    抓取方法:

    Step 1
    
    Connect USB and execute below commands one by one , some times all commands in batch may not work due to adb file system issues
    Verify Buffer size is reflected as the one that is set
    
    adb shell "echo 0 > /sys/kernel/debug/tracing/tracing_on"
    adb shell "cat /sys/kernel/debug/tracing/tracing_on"
    adb shell "echo 150000 > /sys/kernel/debug/tracing/buffer_size_kb"
    adb shell "cat /sys/kernel/debug/tracing/buffer_size_kb"
    
    adb shell "echo  > /sys/kernel/debug/tracing/set_event"
    adb shell cat /sys/kernel/debug/tracing/set_event
    adb shell "echo  > /sys/kernel/debug/tracing/trace"
    adb shell cat /sys/kernel/debug/tracing/trace
    adb shell sync
    
    
    Step 2
    
    Enable below trace events
    
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/cpu_idle/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/cpu_frequency/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/cpu_frequency_switch_start/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/*/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/sched/sched_switch/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/sched/sched_enq_deq_task/enable"
    
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_bus/bus_update_request/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/clk/clk_set_rate/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/clk/clk_enable/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/clk/clk_disable/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/bw_hwmon_update/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/bw_hwmon_meas/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/memlat_dev_meas/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/memlat_dev_update/enable"
    
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/cpu_idle/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/cpu_frequency/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/power/cpu_frequency_switch_start/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/*/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/sched/sched_switch/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/sched/sched_wakeup_new/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/sched/sched_enq_deq_task/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/cluster_enter/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/cluster_exit/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/cluster_pred_hist/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/cluster_pred_select/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/cpu_idle_enter/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/cpu_idle_exit/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/cpu_power_select/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/cpu_pred_hist/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/msm_low_power/cpu_pred_select/enable"
    
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/adreno_cmdbatch_queued/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/adreno_cmdbatch_submitted/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/adreno_cmdbatch_retired/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_gpubusy/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_pwr_request_state/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_pwr_set_state/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_pwrstats/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_buslevel/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_pwrlevel/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_clk/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_bus/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_rail/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/dispatch_queue_context/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_user_pwrlevel_constraint/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_clock_throttling/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_constraint/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/adreno_gpu_fault/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/adreno_cmdbatch_fault/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/adreno_cmdbatch_sync/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_issueibcmds/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_context_create/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_context_destroy/enable"
    adb shell "echo 1 > /sys/kernel/debug/tracing/events/kgsl/kgsl_context_detach/enable"
    adb shell "echo 1 > /d/tracing/events/msm_bus/bus_update_request/enable"
    adb shell "echo 1 > /d/tracing/events/msm_bus/bus_agg_bw/enable"
    
    Step 3 Verify the trace events and also remove any previous trace file
    adb shell cat /sys/kernel/debug/tracing/set_event
    adb shell rm /data/local/trace.txt
    
    
    Step 4:  Start Usecase, Execute below commands with usb and disconnect usb within 5 seconds
    
    adb shell
    cd /d/tracing
    sleep 10 && echo 0 > tracing_on && echo "" > trace && echo 1 > tracing_on && sleep 10 && echo 0 > tracing_on && cat trace > /data/local/trace.txt &
    
    Step 5 : After 20 seconds, re-connect usb and pull the trace file
    
    adb pull /data/local/trace.txt 

    4、我们找到了qsee_logger在每60ms,就会运行一次。这个是一个debug qsee的log工具,应该在版本中剔除。

             <idle>-0     [000] dns3 17167.350289: sched_wakeup: comm=qsee_logger pid=6004 prio=120 target_cpu=000
              <idle>-0     [000] dns3 17167.410357: sched_wakeup: comm=qsee_logger pid=6004 prio=120 target_cpu=001
              <idle>-0     [001] dns3 17167.470370: sched_wakeup: comm=qsee_logger pid=6004 prio=120 target_cpu=001
              <idle>-0     [001] dns3 17167.538874: sched_wakeup: comm=qsee_logger pid=6004 prio=120 target_cpu=001
              <idle>-0     [001] dns3 17167.608888: sched_wakeup: comm=qsee_logger pid=6004 prio=120 target_cpu=001
              <idle>-0     [001] dns3 17167.671269: sched_wakeup: comm=qsee_logger pid=6004 prio=120 target_cpu=001

    5、最后使用performance ROM,验证pass。再安装qsee_logger,又出现bottom耗流不稳定的waveform。但是平均耗流仍然会比之前差的时候好。

    所以,得出结论:1. qsee_logger导致bottom耗流不稳定,影响耗流

            2. 非performance ROM也影响了耗流

    在此案例中,我们利用了trace event查看了系统非休眠情况下,cpu调度的相关事件,并且找出了qsee_logger影响了底电流的稳定。

    二、trace function 和function_graph

    需要使用这2个功能,内核配置中需要添加以下内容:CONFIG_DYNAMIC_FTRACE=y、CONFIG_FUNCTION_TRACER=y、CONFIG_IRQSOFF_TRACER=y、CONFIG_FUNCTION_PROFILER=y 和 CONFIG_PREEMPT_TRACER=y

    1)trace function可以设定自己需要监测的函数,并且可以打印出其调用栈。

    查看tracer,验证是否生效。

    cat /d/tracing/available_tracers
    blk function_graph preemptirqsoff preemptoff irqsoff function nop

    1. 由于编译器有时会对函数进行重命名,因此通过使用以下命令确认(这里用例为:htc_batt_schedule_batt_info_update,看到确实后缀加了.part.8):

    cat /d/tracing/available_filter_functions | grep htc_batt_schedule_batt_info_update
    htc_batt_schedule_batt_info_update.part.8

    2. 确认存在之后,将其用作 ftrace 过滤器:

    echo htc_batt_schedule_batt_info_update.part.8 > /d/tracing/set_ftrace_filter

    3. 开启函数分析器:

    echo function > /d/tracing/current_tracer

    4. 开启堆栈跟踪:

    echo func_stack_trace > /d/tracing/trace_options

    5. 增加缓冲区大小:

    echo 64000 > /d/tracing/buffer_size_kb

    6. 最后开启抓取,抓完后关闭,并获取记录文件:

    echo 1 > /d/tracing/trace_on
    执行复现问题的动作
    echo 0 > /d/tracing/trace_on
    cat /d/tracing/trace > /data/local/tmp/trace

    我们可以从抓取的文件中可以看到什么时刻会调用到监测的函数,并会打印出调用栈:

               <...>-7363  [001] ...1  6384.885600: htc_batt_schedule_batt_info_update.part.8 <-htc_battery_info_update
               <...>-7363  [001] ...1  6384.885612: <stack trace>
     => smb2_batt_get_prop
     => power_supply_get_property
     => status_change_work
     => process_one_work
     => worker_thread
     => kthread
     => ret_from_fork
    ...
    ...
         kworker/1:2-7034  [001] ...1  6395.535869: htc_batt_schedule_batt_info_update.part.8 <-htc_battery_info_update
         kworker/1:2-7034  [001] ...1  6395.535880: <stack trace>
     => smb2_usb_set_prop
     => power_supply_set_property
     => dwc3_msm_gadget_vbus_draw
     => dwc3_msm_vbus_draw_work
     => process_one_work
     => worker_thread
     => kthread
     => ret_from_fork

    2)function_graph 是将调用栈打印出来,并且会显示每个阶段的运行的用时。这对于分析一些内核造成的卡顿问题有所帮助。

    在上面那些config基础上,还需要打开配置:CONFIG_FUNCTION_GRAPH_TRACER=y

    1. 设置tracer:

    echo function_graph > /d/tracing/current_tracer

    2. 最后开启抓取,抓完后关闭,并获取记录文件(设置buffer等和上面类似,不赘述):

    echo 1 > /d/tracing/trace_on
    执行复现问题的动作
    echo 0 > /d/tracing/trace_on
    cat /d/tracing/trace > /data/local/tmp/trace

    如下,我们可以看到每天函数的执行用时,这个时间是包含了sleep time的,所以可以用来定位系统延迟发生在哪个函数中。:

    # tracer: function_graph
    #
    # CPU  DURATION                  FUNCTION CALLS
    # |     |   |                     |   |   |   |
     0) + 76.615 us   |          } /* vfs_open */
     0)   0.156 us    |          open_check_o_direct();
     0)               |          terminate_walk() {
     0)   0.104 us    |            drop_links.isra.37();
     0)               |            dput() {
     0)               |              __might_sleep() {
     0)   0.156 us    |                ___might_sleep();
     0)   3.750 us    |              }
     0)   0.105 us    |              __rcu_read_lock();
     0)   0.156 us    |              __rcu_read_unlock();
     0) + 17.396 us   |            }
     0)               |            mntput() {
     0)               |              mntput_no_expire() {
     0)   0.156 us    |                __rcu_read_lock();
     0)   0.833 us    |                preempt_count_add();
     0)   0.208 us    |                preempt_count_sub();
     0)   0.157 us    |                __rcu_read_unlock();
     0) + 18.229 us   |              }
     0) + 21.875 us   |            }
     0)               |            dput() {
     0)               |              __might_sleep() {
     0)   0.157 us    |                ___might_sleep();
     0)   3.854 us    |              }
     0)   0.104 us    |              __rcu_read_lock();
     0)   0.156 us    |              __rcu_read_unlock();
     0) + 15.990 us   |            }

    其中计时前面的符号是一种提醒,让我们能注意到哪些地方的用时比较长,可能存在问题:

      + means that the function exceeded 10 usecs.
      ! means that the function exceeded 100 usecs.
      # means that the function exceeded 1000 usecs.
      * means that the function exceeded 10 msecs.
      @ means that the function exceeded 100 msecs.
      $ means that the function exceeded 1 sec.

    3)动态ftrace

    由于上面两种方法都会监测很多其他函数,会对系统造成较大负担。所以,需要使用动态过滤监测函数的方式。

    如果我们想只跟踪特定函数,可以把需要配置跟踪的函数写入到节点:

    echo htc_batt_schedule_batt_info_update.part.8 > /d/tracing/set_ftrace_filter ------- function tracer
    echo htc_batt_schedule_batt_info_update.part.8 > /d/tracing/set_graph_function ------ function_graph tracer

    或者想把一些函数排除在外,可以写入如下节点:

    /sys/kernel/debug/tracing/set_ftrace_notrace ------- function tracer
    /sys/kernel/debug/tracing/set_graph_notrace ------ function_graph tracer

    BTW,trace抓取之后,想再次抓取,默认是会把上次抓取的也包含在内的。所以,如果不需要上一次的信息,需要清空trace之后,重新再开始抓取:

    echo '' > /d/tracing/trace 
  • 相关阅读:
    ZZNU 正约数之和 2094
    ZZNUOJ 2022 摩斯密码
    POJ
    NYOJ 1277Decimal integer conversion (第九届河南省省赛)
    hrbust 2080链表 【贪心】
    hdu-5707-Combine String
    POJ 2442-Sequence(优先队列)
    Reversion Count
    python 07篇 内置函数和匿名函数
    python 06篇 常用模块
  • 原文地址:https://www.cnblogs.com/lingjiajun/p/11913376.html
Copyright © 2020-2023  润新知