Cannot find cublas....:
In the /etc/ld.so.conf.d directory, create a file named cuda.conf containing the line /usr/local/cuda/lib64, then run sudo /sbin/ldconfig -v.
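A quick way to confirm that the change took effect is to ask the linker cache directly. The sketch below assumes a glibc-based system where /sbin/ldconfig -p prints the current cache; lib_in_cache is a hypothetical helper name, not part of any tool.

```shell
# Hypothetical helper: succeeds if the dynamic linker cache lists a library
# whose name contains the given string. Assumes glibc's ldconfig, where the
# -p flag prints the current cache.
lib_in_cache() {
    /sbin/ldconfig -p 2>/dev/null | grep -- "$1" >/dev/null
}

if lib_in_cache libcublas; then
    echo "libcublas is visible to the linker"
else
    echo "libcublas not in the cache; recheck cuda.conf and rerun ldconfig"
fi
```

The same helper can be pointed at other names, such as libopencv, after editing the matching .conf file.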
cannot find -lopencv_xxxx:
apt-cache search opencv
sudo apt-get install yyy
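Problem solved in one go? If you are using OpenCV 3+, there is more pain ahead.
The likely cause here is that OpenCV was built with make but never installed. Run sudo gedit /etc/ld.so.conf.d/opencv.conf, add /usr/local/lib and /usr/local/lib/x86_64-linux-gnu to it, then run sudo /sbin/ldconfig -v.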
Check failed: fd != -1 (-1 vs. -1):
A file path is wrong. Generally, treat [caffe]/ as the project root and give all other files as paths relative to it.
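One way to catch this before launching Caffe is to check that each relative path actually resolves from the project root. This is a minimal sketch; check_paths is a hypothetical helper, and any paths passed to it are your own.

```shell
# Hypothetical helper: check that every relative path exists under a root
# directory. Prints each missing path and returns non-zero if any is missing.
check_paths() {
    root="$1"
    shift
    missing=0
    for rel in "$@"; do
        if [ ! -e "$root/$rel" ]; then
            echo "missing: $root/$rel"
            missing=1
        fi
    done
    return "$missing"
}
```

Call it with the [caffe] directory as the first argument, followed by the relative paths referenced in your prototxt files.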
Check failed: net_->num_inputs() == 1 (0 vs. 1) Network should have exactly one input:
The wrong parameter file was referenced.
Check failed: error == cudaSuccess (2 vs. 0) out of memory:
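Reduce batch_size (the number of samples fed in per iteration), preferably to a multiple of 8.
Set test_iter (the number of batches loaded during testing) = total number of TEST samples / batch_size (of the TEST phase), rounded up.
Increase snapshot (a model is saved every xx iterations).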
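The rounding-up step for test_iter can be done with integer arithmetic. This is a sketch with made-up numbers; ceil_div is a hypothetical helper, and num_test and batch_size stand in for your own values.

```shell
# Integer ceiling division: adding (divisor - 1) before dividing rounds up.
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Hypothetical numbers, for illustration only.
num_test=1000     # total number of TEST samples (assumed)
batch_size=64     # batch_size of the TEST phase (assumed)
test_iter=$(ceil_div "$num_test" "$batch_size")
echo "test_iter: $test_iter"   # prints "test_iter: 16"
```

Here 1000 / 64 is 15.625, so 16 batches are needed to cover every test sample.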
Check failed: labels_.size() == output_layer->channels() (4 vs. 5) Number of labels is different from the output layer dimension.
The label data in train_val.prototxt and deploy.prototxt is inconsistent.
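Since the message compares the number of labels with the output layer's channel count, a rough check is to compare the label file's line count with the final num_output in the prototxt. The sketch below makes two assumptions: the label file holds one label per line, and the output layer is the last layer that sets num_output. Both helper names are hypothetical.

```shell
# Hypothetical helper: print the last num_output value found in a prototxt.
# Assumes the output layer is the last layer that sets num_output.
last_num_output() {
    grep -o 'num_output:[[:space:]]*[0-9][0-9]*' "$1" | tail -n 1 | grep -o '[0-9][0-9]*$'
}

# Hypothetical helper: count the non-empty lines (one label each) in a label file.
count_labels() {
    grep -c '[^[:space:]]' "$1"
}
```

If the two helpers return different numbers for your label file and deploy.prototxt, that difference is the "4 vs. 5" in the message.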
Check failed: status == CUDNN_STATUS_SUCCESS (8 vs. 0) CUDNN_STATUS_EXECUTION_FAILED
Refreshing the NVIDIA state (or something along those lines) and then training again fixed it; I do not really understand why.
Check failed: (11 vs. 0)
The configured CUDA settings do not match the GPU's actual compute capability.
Check failed: mdb_status == 0 (13 vs. 0) Permission denied
Run with sudo.
Check failed: error == cudaSuccess (30 vs. 0) unknown error, or Cannot create Cublas handle:
Run with sudo.
Check failed: error == cudaSuccess (73 vs. 0):
Re-running a few times fixed it; the cause is unknown.
Check failed: error == cudaSuccess (74 vs. 0):
Increase max_iter; keeping it a multiple of test_interval is recommended.
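To keep max_iter a multiple of test_interval when raising it, the value can be rounded up to the next multiple. This is a sketch with made-up solver values; round_up_to_multiple is a hypothetical helper.

```shell
# Round a value up to the next multiple of a step (integer arithmetic).
round_up_to_multiple() {
    echo $(( ( ($1 + $2 - 1) / $2 ) * $2 ))
}

# Hypothetical solver values, for illustration only.
test_interval=1000
wanted_max_iter=10500
max_iter=$(round_up_to_multiple "$wanted_max_iter" "$test_interval")
echo "max_iter: $max_iter"   # prints "max_iter: 11000"
```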
Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered:
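Check the running processes with nvidia-smi, run sudo kill -9 [PID], then run a CUDA program once without the sudo prefix, and then run it again with the sudo prefix.
Also check that the disk space and directories required by the prototxt actually exist.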
Segmentation fault (core dumped), or malloc: memory corruption:
Edit the source file, track down the faulty line, and rewrite it using a safe approach. (This is usually where the tragedy begins.)
corrupted size vs. prev_size:
??? (The tragedy reaches its climax.)
caffe/proto/caffe.pb.h: No such file or directory:
In the Qt project's .pro file, add INCLUDEPATH += [caffe]/build/src
Or copy the file to [caffe]/src/caffe/proto
Error parsing text-format caffe.SolverParameter:
Look at the line number given in the error message, then fix it with reference to http://www.cnblogs.com/denny402/p/5074212.html and http://www.cnblogs.com/denny402/p/5074049.html