[CUDA Development] Check failed: error == cudaSuccess (8 vs. 0) invalid device function

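Recently, while reproducing the R-CNN family of experiments, I spent quite a lot of time setting up the code environment. Since I'm not familiar with MATLAB, I used the Python versions that rbg published on GitHub throughout. When setting up Faster R-CNN, compilation went fine, but as soon as I ran ./tools/demo.py --net zf I got the following error: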

Loaded network ./data/faster_rcnn_models/ZF_faster_rcnn_final.caffemodel
F1008 roi_pooling_layer.cu:91] Check failed: error == cudaSuccess (8 vs. 0) invalid device function
*** Check failure stack trace: ***

Running in CPU mode, however, works fine.

I eventually found the answer I was looking for at https://github.com/rbgirshick/py-faster-rcnn/issues/2, which is worth reading through if you're interested.

If you'd rather not read it, just make the changes described below.

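In most cases the cause is a compute capability mismatch: the CUDA code was built for a different GPU architecture than the one in your machine. Edit line 135 of py-faster-rcnn/lib/setup.py and change arch to the value that matches your GPU (for example, my GTX 760 has compute capability 3.0, so I changed sm_35 to sm_30). Then delete utils/bbox.c, nms/cpu_nms.c and nms/gpu_nms.cpp, and recompile.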
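The mapping from compute capability to the arch name is mechanical, so it can be sketched in a few lines of Python. This is only an illustration: the helper names `arch_flag` and `retarget` are hypothetical, and the sample nvcc line is an assumed stand-in for whatever line 135 of your lib/setup.py actually contains.

```python
import re


def arch_flag(compute_capability: str) -> str:
    """Turn a compute capability such as "3.0" into an arch name such as "sm_30"."""
    match = re.fullmatch(r"(\d+)\.(\d+)", compute_capability.strip())
    if match is None:
        raise ValueError(f"expected 'major.minor', got {compute_capability!r}")
    major, minor = match.groups()
    return f"sm_{major}{minor}"


def retarget(text: str, compute_capability: str) -> str:
    """Replace every sm_NN token in the text with the one for the given GPU."""
    return re.sub(r"\bsm_\d+\b", arch_flag(compute_capability), text)


# Assumed example line; check your own setup.py for the real one.
sample = "'nvcc': ['-arch=sm_35', '--ptxas-options=-v', '-c']"
print(retarget(sample, "3.0"))
```

For a GTX 760 (compute capability 3.0) this turns sm_35 into sm_30, which is the same edit made by hand above.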

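I've seen some people report other problems as well. In that case you can start by making the change in the Makefile.config file right at the beginning, although I haven't tried this myself. The steps are as follows: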
As below, there is my solution (three steps):
1. If you're using the GPU instance on AWS, then please change the architecture setting into:
   # CUDA architecture setting: going with all of them.
   # For CUDA < 6.0, comment the *_50 lines for compatibility.
   CUDA_ARCH := -gencode arch=compute_30,code=sm_30 \
                -gencode arch=compute_50,code=sm_50 \
                -gencode arch=compute_50,code=compute_50
   Because the GPU in AWS does not support compute_35.
2. I changed sm_35 into sm_30 in the lib/setup.py file.
3. cd lib, remove these files: utils/bbox.c nms/cpu_nms.c nms/gpu_nms.cpp, if they exist.
   And then make && cd ../caffe/ && make clean && make -j8 && make pycaffe -j8
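The "remove these files, if they exist" step is easy to get wrong by hand, so here is a minimal Python sketch of it. The function name `remove_generated` is hypothetical, and the file list is copied from the steps above. It assumes you pass the path to the py-faster-rcnn lib directory.

```python
from pathlib import Path

# Generated sources named in the steps above; they are rebuilt by the next make.
GENERATED = ("utils/bbox.c", "nms/cpu_nms.c", "nms/gpu_nms.cpp")


def remove_generated(lib_dir, names=GENERATED):
    """Delete the listed files under lib_dir if present; return the ones removed."""
    removed = []
    for name in names:
        path = Path(lib_dir) / name
        if path.is_file():  # missing files are skipped silently
            path.unlink()
            removed.append(name)
    return removed
```

Calling `remove_generated("py-faster-rcnn/lib")` before rebuilding returns the list of files it actually deleted, so an empty list means there was nothing stale to clean up.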
Original post: https://www.cnblogs.com/huty/p/8517107.html