• How to track down a bug


    Set the sender-side sleep time to 0, i.e. sleep(0).

    The event rate again held at 700 Hz for 3 minutes and then dropped to 0. Check the log files:

    1. SFI log file:

    WARNING 2017-Feb-08 15:39:55 [DC::StatusWord EventAssembly::EventCompleted(...) at SFI/src/EventAssembly.cxx:480] Problem with the flow of data: Event with LVL1ID 130745 misses 1 data fragment(s) from: ROS-Eth-07,
    WARNING 2017-Feb-08 15:39:55 [DC::StatusWord EventAssembly::EventCompleted(...) at SFI/src/EventAssembly.cxx:480] Problem with the flow of data: Event with LVL1ID 130749 misses 1 data fragment(s) from: ROS-Eth-07,
    WARNING 2017-Feb-08 15:39:55 [DC::StatusWord EventAssembly::EventCompleted(...) at SFI/src/EventAssembly.cxx:480] Problem with the flow of data: Event with LVL1ID 130753 misses 1 data fragment(s) from: ROS-Eth-07,

    2. ROS log file:

    Warning could not load libEthClientSequentialReadoutModule.so, trying libROSEthClientSequentialReadoutModule.so
    ERROR 2017-Feb-08 15:44:10 [virtual void ROS::RequestHandler::run(...) at ROSCore/src/RequestHandler.cpp:274] Unclassified error: RequestHandler caught exception, deleting Request
    was caused by: ERROR 2017-Feb-08 15:44:10 [unknown at unknown:0] No pages available

    Go to ROSCore/src/RequestHandler.cpp:274:

      catch (std::exception& issue) {
    274          ENCAPSULATE_ROS_EXCEPTION(newIssue, CoreException, UNCLASSIFIED, issue, "RequestHandler caught exception, deleting Request    ");  
    275          ers::error(newIssue);
    276          deleteRequest();
    277       }

    First look up the definition of ENCAPSULATE_ROS_EXCEPTION. It does not reveal much, only that the exception type is CoreException and the errorCode is UNCLASSIFIED.

    /** def ENCAPSULATE_ROS_EXCEPTION(instanceName, exceptionClass, errorCode, oldException, messageContent)
        Macro to encapsulate an exception inside a ROSException along with
        some explanatory text.
    */
    #define ENCAPSULATE_ROS_EXCEPTION(instanceName, exceptionClass, errorCode, oldException, messageContent) \
        std::ostringstream instanceName##_tStream; \
        instanceName##_tStream << messageContent; \
        exceptionClass instanceName(oldException, exceptionClass::errorCode, instanceName##_tStream.str(), ERS_HERE)
    
    #endif
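
    To make the macro easier to read, here is a minimal sketch of the same wrapping pattern using only the standard library. WrappedError, WRAP_EXCEPTION and wrapExample are hypothetical names introduced for illustration; they are not part of ROSCore or ERS, and the real CoreException also carries an error code and the ERS_HERE location.

```cpp
#include <exception>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an exception class that wraps an older exception.
class WrappedError : public std::runtime_error {
public:
    WrappedError(const std::exception& cause, const std::string& message)
        : std::runtime_error(message + "\nwas caused by: " + cause.what()) {}
};

// Hypothetical macro with the same shape as ENCAPSULATE_ROS_EXCEPTION:
// stream the message into a string, then declare the wrapping exception.
#define WRAP_EXCEPTION(instanceName, oldException, messageContent) \
    std::ostringstream instanceName##_tStream;                     \
    instanceName##_tStream << messageContent;                      \
    WrappedError instanceName(oldException, instanceName##_tStream.str())

// Throws an inner error, wraps it as the catch block does, and returns the combined text.
std::string wrapExample() {
    std::string text;
    try {
        throw std::runtime_error("No pages available");
    } catch (std::exception& issue) {
        WRAP_EXCEPTION(newIssue, issue, "RequestHandler caught exception, deleting Request");
        text = newIssue.what();
    }
    return text;
}
```

    In this sketch the outer message and the inner cause end up in one string, which matches the two-part shape of the ROS log entry above ("... was caused by: ...").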

    Next, look at the statements inside the try block:

    243       try {
    244          TS_RECORD(TS_H1,stamp_offset + 2100);
    245          int requestOk;
    246          //unsigned int tsdata = m_handlerId * 100 + 3000000000;
    247          //TS_RECORD(TS_H5,tsdata);
    248 
    249          requestOk = m_request->execute();
    250          TS_RECORD(TS_H1,stamp_offset + 2200);
    251 
    252          if (requestOk == Request::REQUEST_OK) {
    253             m_requestsHandled++;
    254             deleteRequest();
    255             TS_RECORD(TS_H1, stamp_offset + 2300);
    256          }
    257          else if (requestOk == Request::REQUEST_TIMEOUT) {
    258             m_requestsTimedOut++;
    259             deleteRequest();
    260             TS_RECORD(TS_H1, stamp_offset + 2300);
    261          }
    262          else {
    263             TS_RECORD(TS_H1, stamp_offset + 2400);
    264             DEBUG_TEXT(DFDB_ROSCORE, 15, "RequestHandler(" << m_handlerId <<")::run: Request non completed: will be requeued.");
    265             m_requestsFailed++;
    266             m_request = m_requestQueue->swap(m_request);
    267             DFThread::yieldOrCancel();
    268             TS_RECORD(TS_H1, stamp_offset + 2500);
    269          }
    270 
    271          TS_RECORD(TS_H1, stamp_offset + 2990);
    272       }

    The whole try...catch statement sits inside the run function of RequestHandler. run is the RequestHandler worker thread: it takes a request from the request queue, executes it, and then deletes it.
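
    The take, execute, delete cycle can be sketched as follows. This is a single-threaded simplification under stated assumptions: SketchRequest, HandlerStats and drainQueue are hypothetical names, the requeue branch is left out, and std::unique_ptr stands in for the explicit deleteRequest() call.

```cpp
#include <memory>
#include <queue>
#include <utility>

// Hypothetical minimal request type; the real Request class has more states.
struct SketchRequest {
    enum Status { REQUEST_OK, REQUEST_TIMEOUT };
    Status result;
    Status execute() const { return result; }
};

// Counters comparable to m_requestsHandled and m_requestsTimedOut.
struct HandlerStats {
    int handled = 0;
    int timedOut = 0;
};

// Drain the queue: take one request, execute it, then delete it.
HandlerStats drainQueue(std::queue<std::unique_ptr<SketchRequest>>& requests) {
    HandlerStats stats;
    while (!requests.empty()) {
        std::unique_ptr<SketchRequest> request = std::move(requests.front());
        requests.pop();
        if (request->execute() == SketchRequest::REQUEST_OK) {
            ++stats.handled;
        } else {
            ++stats.timedOut;
        }
        // request leaves scope here, so the object is deleted exactly once
    }
    return stats;
}
```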

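    Adding print statements showed that the first if branch in the try block is taken, which means the request returned REQUEST_OK after being executed. The exception is raised while running deleteRequest(); inside that branch.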

    Now look at what deleteRequest() contains:

    193 void RequestHandler::deleteRequest()
    194 {
    195    if (m_request != 0) {
    197       m_mutex->lock();      // Memory deallocation has to be protected as
    198       delete(m_request);    // it is thread-unsafe!
    199       m_request=0;
    200       m_mutex->unlock();
    201    }
    202 }
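
    A side observation on this function: if anything between lock() and unlock() throws, the unlock() call is skipped. The sketch below shows the same steps with a scope guard. It assumes a plain std::mutex, since the type behind m_mutex is not shown here, and SketchHolder with its failMidway switch is a hypothetical test harness, not ROSCore code. It is not claimed to be the cause of the problem being investigated.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical holder standing in for the handler that owns m_request and a mutex.
class SketchHolder {
public:
    explicit SketchHolder(int* request) : m_request(request) {}
    ~SketchHolder() { delete m_request; }

    // Same steps as deleteRequest(), but the guard unlocks even when something throws.
    void deleteRequest(bool failMidway = false) {
        if (m_request != nullptr) {
            std::lock_guard<std::mutex> guard(m_mutex);
            delete m_request;
            m_request = nullptr;
            if (failMidway) {
                throw std::runtime_error("simulated failure");  // illustration only
            }
        }
    }

    bool hasRequest() const { return m_request != nullptr; }

    // True if the mutex can be taken, i.e. nobody left it locked.
    bool mutexIsFree() {
        if (m_mutex.try_lock()) {
            m_mutex.unlock();
            return true;
        }
        return false;
    }

private:
    int* m_request;
    std::mutex m_mutex;
};
```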

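    Add a print after every statement inside the if, add one in the destructor as well (it appears as "~Request" in the grep output below), and check the log file: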

    void RequestHandler::deleteRequest()
    194 {
    195   if (m_request != 0) {
    196     std::cout << "m_request is not 0:1!" << std::endl;
    197     m_mutex->lock() ;   // Memory deallocation has to be protected as
    198     std::cout << "m_request is not 0:2!" << std::endl;
    199     delete(m_request);    // it is thread-unsafe!
    200     std::cout << "m_request is not 0:3!" << std::endl;
    201     m_request=0;
    202     std::cout << "m_request is not 0:4!" << std::endl;
    203     m_mutex->unlock();
    204     std::cout << "m_request is not 0:5!" << std::endl;
    205   }
    206 }

    Checking the ROS-Eth-04 log file shows:

    [lhaaso@cmm03node01 part_dk_ef]$ grep "m_request is not 0:1" ROS-Eth-04_cmm03node01_1486545277.out |wc -l
    261133
    [lhaaso@cmm03node01 part_dk_ef]$ grep "m_request is not 0:2" ROS-Eth-04_cmm03node01_1486545277.out |wc -l
    261239

    [lhaaso@cmm03node01 part_dk_ef]$ grep "~Request" ROS-Eth-04_cmm03node01_1486545277.out |wc -l
    130624
    [lhaaso@cmm03node01 part_dk_ef]$ grep "m_request is not 0:3" ROS-Eth-04_cmm03node01_1486545277.out |wc -l
    261239
    [lhaaso@cmm03node01 part_dk_ef]$ grep "m_request is not 0:4" ROS-Eth-04_cmm03node01_1486545277.out |wc -l
    261239
    [lhaaso@cmm03node01 part_dk_ef]$ grep "m_request is not 0:5" ROS-Eth-04_cmm03node01_1486545277.out |wc -l
    261238

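    Comparing how often each statement ran: the delete operation itself (130624 "~Request" prints) happened far less often than the prints immediately before and after it (261239 each). This suggests that many deleteRequest calls did not succeed, threw an exception, and caused the program to report the error.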
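
    The per-marker tallies from grep ... | wc -l can also be collected in a single pass over the log. The following is a small sketch; countMarkers is a hypothetical helper written for this note, and it counts matching lines the same way grep piped into wc -l does.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// For each marker, count how many lines of the log contain it
// (the same number that `grep MARKER file | wc -l` reports).
std::map<std::string, long> countMarkers(std::istream& log,
                                         const std::vector<std::string>& markers) {
    std::map<std::string, long> counts;
    for (const std::string& marker : markers) {
        counts[marker] = 0;
    }
    std::string line;
    while (std::getline(log, line)) {
        for (const std::string& marker : markers) {
            if (line.find(marker) != std::string::npos) {
                ++counts[marker];
            }
        }
    }
    return counts;
}
```

    Passing a std::ifstream opened on the .out file and the markers "m_request is not 0:2", "~Request" and "m_request is not 0:3" gives the numbers to compare side by side.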

     (Note: in fact Request has two subclasses, FragmentRequest and ReleaseFragment. When RequestHandler runs, it decides which kind of Request to execute based on the type of message received from the SFI.)
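
    The structure described in the note can be sketched like this. The class names follow the note, but the bodies are simplified stand-ins; MessageKind and makeRequest are hypothetical, since the actual SFI message types and the dispatch code are not shown here.

```cpp
#include <memory>
#include <string>

// Hypothetical message kinds; the real SFI message types are not shown here.
enum class MessageKind { DataRequest, Clear };

// Simplified stand-in for the Request base class.
class Request {
public:
    // Virtual, so that delete through a Request* also runs the subclass destructor.
    virtual ~Request() = default;
    virtual std::string name() const = 0;
};

class FragmentRequest : public Request {
public:
    std::string name() const override { return "FragmentRequest"; }
};

class ReleaseFragment : public Request {
public:
    std::string name() const override { return "ReleaseFragment"; }
};

// Pick the Request subclass from the kind of message that arrived.
std::unique_ptr<Request> makeRequest(MessageKind kind) {
    if (kind == MessageKind::DataRequest) {
        return std::unique_ptr<Request>(new FragmentRequest());
    }
    return std::unique_ptr<Request>(new ReleaseFragment());
}
```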

  • Original article: https://www.cnblogs.com/zengtx/p/6378547.html