• Check failed: status == CUBLAS_STATUS_SUCCESS (13 vs. 0) CUBLAS_STATUS_EXECUTION_FAILED


    [ RUN      ] PowerLayerTest/3.TestPowerOneGradient
    F0319 15:50:19.414253 22426 math_functions.cu:92] Check failed: status == CUBLAS_STATUS_SUCCESS (13 vs. 0)  CUBLAS_STATUS_EXECUTION_FAILED
    *** Check failure stack trace: ***
        @     0x7f574fe7e78d  google::LogMessage::Fail()
        @     0x7f574fe80d43  google::LogMessage::SendToLog()
        @     0x7f574fe7e31b  google::LogMessage::Flush()
        @     0x7f574fe7fc8e  google::LogMessageFatal::~LogMessageFatal()
        @     0x7f574e348e4e  caffe::caffe_gpu_scal<>()
        @     0x7f574e336906  caffe::PowerLayer<>::Forward_gpu()
        @           0x455772  caffe::Layer<>::Forward()
        @           0x496300  caffe::GradientChecker<>::CheckGradientSingle()
        @           0x50ea94  caffe::GradientChecker<>::CheckGradientEltwise()
        @           0x6b69b5  caffe::PowerLayerTest<>::TestBackward()
        @           0x7ac6b3  testing::internal::HandleExceptionsInMethodIfSupported<>()
        @           0x7a5cca  testing::Test::Run()
        @           0x7a5e18  testing::TestInfo::Run()
        @           0x7a5ef5  testing::TestCase::Run()
        @           0x7a71cf  testing::internal::UnitTestImpl::RunAllTests()
        @           0x7a74f3  testing::UnitTest::Run()
        @           0x44bfa9  main
        @     0x7f574d5a8830  __libc_start_main
        @           0x451c39  _start
    Makefile:478: recipe for target 'runtest' failed
    make: *** [runtest] 已放弃 (core dumped)

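    First, to be clear: this error comes from make runtest, so it cannot be a problem in my own code. It must be my configuration. The Caffe here is the version modified by the SegNet authors, but that should not be the issue. Even so, I plan to run the official Caffe as a check.

    My earlier installation tutorial, http://www.cnblogs.com/SweetBeens/p/8525131.html, mentioned that installing CUDA 9 on Ubuntu 16.04 might cause problems, and here the problem is. The official documentation describes this error status: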

    The GPU program failed to execute. This is often caused by a launch failure of the kernel on the GPU, which can be caused by multiple reasons.

    To correct: check that the hardware, an appropriate version of the driver, and the cuBLAS library are correctly installed.
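To make the "(13 vs. 0)" in the log concrete, here is a minimal Python sketch that pulls the status numbers out of a glog "Check failed" line and maps them to cuBLAS status names. The numeric values are those of the cublasStatus_t enum in the cuBLAS headers; the helper name decode_check_failure and the regular expression are my own assumptions for illustration, not part of Caffe or cuBLAS.

```python
import re

# Values of the cublasStatus_t enum from the cuBLAS headers.
CUBLAS_STATUS = {
    0: "CUBLAS_STATUS_SUCCESS",
    1: "CUBLAS_STATUS_NOT_INITIALIZED",
    3: "CUBLAS_STATUS_ALLOC_FAILED",
    7: "CUBLAS_STATUS_INVALID_VALUE",
    8: "CUBLAS_STATUS_ARCH_MISMATCH",
    11: "CUBLAS_STATUS_MAPPING_ERROR",
    13: "CUBLAS_STATUS_EXECUTION_FAILED",
    14: "CUBLAS_STATUS_INTERNAL_ERROR",
    15: "CUBLAS_STATUS_NOT_SUPPORTED",
    16: "CUBLAS_STATUS_LICENSE_ERROR",
}

# Matches e.g. "math_functions.cu:92] Check failed: ... (13 vs. 0)".
CHECK_FAILED = re.compile(r"(\S+):(\d+)\] Check failed: .*?\((\d+) vs\. (\d+)\)")


def decode_check_failure(line):
    """Hypothetical helper: return source file, line and status names, or None."""
    match = CHECK_FAILED.search(line)
    if match is None:
        return None
    source, lineno, actual, expected = match.groups()
    return {
        "source": source,
        "line": int(lineno),
        "actual": CUBLAS_STATUS.get(int(actual), "unknown"),
        "expected": CUBLAS_STATUS.get(int(expected), "unknown"),
    }


if __name__ == "__main__":
    log_line = (
        "F0319 15:50:19.414253 22426 math_functions.cu:92] Check failed: "
        "status == CUBLAS_STATUS_SUCCESS (13 vs. 0)  CUBLAS_STATUS_EXECUTION_FAILED"
    )
    print(decode_check_failure(log_line))
```

The first number is the status cuBLAS actually returned and the second is the status the check expected, so the log says the call returned code 13 where code 0 was required.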

    There is a fairly popular discussion thread on GitHub:

    https://github.com/BVLC/caffe/issues/2417

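    The fix most people there report is installing CUDA 8.0; some also say to build without cuDNN.

    But I am stubborn and do not give up until I hit a wall, so I did not want to use CUDA 8.

    So I had the following ideas:

    1. Run the official Caffe to see whether the cause is that the segnet_caffe version is too old, or something like that.

    2. Change the gcc version.

    3. Change the driver and CUDA versions.

    Let me try them. It is also odd that some other people can use 16.04 + 9.1.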
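Before changing gcc, the driver, or CUDA, it helps to record which versions are actually installed. The following is a rough Python sketch under the assumption that gcc, nvcc, and nvidia-smi are on PATH (nvcc often is not unless the CUDA bin directory has been added); the function names tool_output and collect_versions are made up for this example.

```python
import shutil
import subprocess

# Commands assumed to report the versions of the tools named in the ideas above.
VERSION_COMMANDS = {
    "gcc": ["gcc", "-dumpversion"],
    "nvcc": ["nvcc", "--version"],
    "driver": ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
}


def tool_output(cmd):
    """Return the command's output, or None if the tool is missing or fails."""
    if shutil.which(cmd[0]) is None:
        return None
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() if result.returncode == 0 else None


def collect_versions(commands=VERSION_COMMANDS):
    """Run every version command and collect the raw output per tool."""
    return {name: tool_output(cmd) for name, cmd in commands.items()}


if __name__ == "__main__":
    for name, output in collect_versions().items():
        print(name, "->", output if output is not None else "not found")
```

Saving this output before and after each change makes it easier to tell which of the three ideas actually altered the environment.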

    Attempted solution

    1. I ran the official Caffe and got the following error:

    [  FAILED  ] EmbedLayerTest/3.TestForward, where TypeParam = caffe::GPUDevice<double> (1 ms)
    [ RUN      ] EmbedLayerTest/3.TestGradient
    [       OK ] EmbedLayerTest/3.TestGradient (101 ms)
    [ RUN      ] EmbedLayerTest/3.TestSetUp
    [       OK ] EmbedLayerTest/3.TestSetUp (0 ms)
    [ RUN      ] EmbedLayerTest/3.TestForwardWithBias
    F0319 17:26:25.959848 30839 math_functions.cu:42] Check failed: status == CUBLAS_STATUS_SUCCESS (13 vs. 0)  CUBLAS_STATUS_EXECUTION_FAILED
    *** Check failure stack trace: ***
        @     0x7f787028d78d  google::LogMessage::Fail()
        @     0x7f787028fd43  google::LogMessage::SendToLog()
        @     0x7f787028d31b  google::LogMessage::Flush()
        @     0x7f787028ec8e  google::LogMessageFatal::~LogMessageFatal()
        @     0x7f786e105672  caffe::caffe_gpu_gemm<>()
        @     0x7f786e13d75a  caffe::EmbedLayer<>::Forward_gpu()
        @           0x476522  caffe::Layer<>::Forward()
        @           0x4f5ef6  caffe::EmbedLayerTest_TestForwardWithBias_Test<>::TestBody()
        @           0x90b393  testing::internal::HandleExceptionsInMethodIfSupported<>()
        @           0x9049aa  testing::Test::Run()
        @           0x904af8  testing::TestInfo::Run()
        @           0x904bd5  testing::TestCase::Run()
        @           0x905eaf  testing::internal::UnitTestImpl::RunAllTests()
        @           0x9061d3  testing::UnitTest::Run()
        @           0x469fed  main
        @     0x7f786d2ec830  __libc_start_main
        @           0x471a69  _start
    Makefile:532: recipe for target 'runtest' failed
    make: *** [runtest] 已放弃 (core dumped)

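    Problem solved

    It was a CUDA version problem. With my machine's configuration it has to be Ubuntu 16.04 + CUDA 8; 9.1 cannot be used.

    Following the same reasoning, please check whether any of your own versions deviate from the requirements. The main constraints are CUDA and Python.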
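As one way to run that check, the sketch below parses the release number out of nvcc --version output and compares it with the CUDA major version that worked in this post. KNOWN_GOOD_CUDA_MAJOR = 8 reflects only my own setup (Ubuntu 16.04 + CUDA 8) and is an assumption to adjust for other machines; the function names are hypothetical.

```python
import re

# Assumption: the combination that worked in this post was Ubuntu 16.04 + CUDA 8.
KNOWN_GOOD_CUDA_MAJOR = 8


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_matches_known_good(nvcc_output, wanted_major=KNOWN_GOOD_CUDA_MAJOR):
    """True only when a release was found and its major version is the wanted one."""
    release = parse_nvcc_release(nvcc_output)
    return release is not None and release[0] == wanted_major


if __name__ == "__main__":
    # Illustrative last line of `nvcc --version` output for a CUDA 9.1 toolkit.
    sample = "Cuda compilation tools, release 9.1, V9.1.85"
    print(parse_nvcc_release(sample), cuda_matches_known_good(sample))
```

A mismatch here does not prove the toolkit is the cause of CUBLAS_STATUS_EXECUTION_FAILED; it only flags a difference from the configuration that passed make runtest for me.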

    My simplified installation steps, which make it easier to spot what is missing:

    http://www.cnblogs.com/SweetBeens/p/8652083.html

    Detailed version constraints for the installation:

    http://www.cnblogs.com/SweetBeens/p/8525131.html

    And using anaconda2 to freely configure multiple versions:

    http://www.cnblogs.com/SweetBeens/p/8650460.html

    Uninstalling CUDA 9.1 and installing CUDA 8:

    http://www.cnblogs.com/SweetBeens/p/8616797.html

    This blog focuses on collecting errors, probing the edge of self-inflicted disaster.
  • Original post: https://www.cnblogs.com/SweetBeens/p/8603306.html