Training Caffe on Your Own Dataset


    This assumes Caffe has already been built, including pycaffe.

    1 Data Preparation

    First, prepare the training and validation datasets. Here we use two classes of data, placed in folders 0 and 1 (the classes are named 0 and 1 because it makes labeling convenient: the folder name itself serves as the class label). That is, training set: /data/train/0 and /data/train/1; validation set: /data/val/0 and /data/val/1.
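    Laid out on disk, the expected layout looks like this (the image file names themselves are arbitrary; those implied below are placeholders):

    data/
    ├── train/
    │   ├── 0/    # class-0 training images
    │   └── 1/    # class-1 training images
    └── val/
        ├── 0/    # class-0 validation images
        └── 1/    # class-1 validation images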

    Once the data is ready, create txt files that record each image file and its corresponding label.

    (1) Create train.txt for the training set

    import os

    f = open('train.txt', 'w')
    path = os.getcwd() + '/data/train/'
    for filename in os.listdir(path):            # each subfolder name is the class label
        count = 0
        for img in os.listdir(path + filename):
            count += 1
            f.write('/' + filename + '/' + img + ' ' + filename + '\n')  # "/class/file label"
        print('{} class: {}'.format(filename, count))
    f.close()

    (2) Create val.txt for the validation set

    import os

    f = open('val.txt', 'w')
    path = os.getcwd() + '/data/val/'
    for filename in os.listdir(path):            # each subfolder name is the class label
        count = 0
        for img in os.listdir(path + filename):
            count += 1
            f.write('/' + filename + '/' + img + ' ' + filename + '\n')  # "/class/file label"
        print('{} class: {}'.format(filename, count))
    f.close()

    Note: each line in the txt files has the form /class_folder/filename<space>label, and the separator must be a space, not a tab.
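    For example, with placeholder file names, train.txt should contain lines like:

    /0/img_0001.jpg 0
    /0/img_0002.jpg 0
    /1/img_0001.jpg 1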

    2 Creating the LMDB Data Files

    Create createlmdb.sh, which uses the convert_imageset tool that ships with Caffe (under build/tools) to build the LMDB files. The main things to get right are the locations of the image data and of the txt files generated in the previous step, and the RESIZE settings for the images, since the same size is needed again later for training and testing; the rest is just a matter of paths.

    #!/usr/bin/env sh

    CAFFE_ROOT=/home/caf/object/caffe
    TOOLS=$CAFFE_ROOT/build/tools
    TRAIN_DATA_ROOT=/home/caf/wk/learn/data/train
    VAL_DATA_ROOT=/home/caf/wk/learn/data/val
    DATA=/home/caf/wk/learn/data
    EXAMPLE=/home/caf/wk/learn/data/lmdb
    # Set RESIZE=true to resize the images to 227 x 227. Leave as false if images
    # have already been resized using another tool.
    RESIZE=true
    if $RESIZE; then
      RESIZE_HEIGHT=227
      RESIZE_WIDTH=227
    else
      RESIZE_HEIGHT=0
      RESIZE_WIDTH=0
    fi

    if [ ! -d "$TRAIN_DATA_ROOT" ]; then
      echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
      echo "Set the TRAIN_DATA_ROOT variable in createlmdb.sh to the path" \
           "where the training data is stored."
      exit 1
    fi

    if [ ! -d "$VAL_DATA_ROOT" ]; then
      echo "Error: VAL_DATA_ROOT is not a path to a directory: $VAL_DATA_ROOT"
      echo "Set the VAL_DATA_ROOT variable in createlmdb.sh to the path" \
           "where the validation data is stored."
      exit 1
    fi

    echo "Creating train lmdb..."

    GLOG_logtostderr=1 $TOOLS/convert_imageset \
        --resize_height=$RESIZE_HEIGHT \
        --resize_width=$RESIZE_WIDTH \
        --shuffle \
        $TRAIN_DATA_ROOT \
        $DATA/train.txt \
        $EXAMPLE/face_train_lmdb

    echo "Creating val lmdb..."

    GLOG_logtostderr=1 $TOOLS/convert_imageset \
        --resize_height=$RESIZE_HEIGHT \
        --resize_width=$RESIZE_WIDTH \
        --shuffle \
        $VAL_DATA_ROOT \
        $DATA/val.txt \
        $EXAMPLE/face_val_lmdb

    echo "Done."
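    Before moving on, it is worth checking that the LMDBs were actually written. Below is a minimal sketch (assuming the Python lmdb package is installed; the path matches the EXAMPLE directory above) that counts the entries and decodes the first record:

    import sys
    import lmdb
    sys.path.insert(0, '/home/caf/object/caffe/python')
    import caffe

    # Open the training LMDB read-only and inspect it.
    env = lmdb.open('/home/caf/wk/learn/data/lmdb/face_train_lmdb', readonly=True)
    with env.begin() as txn:
        print('entries: {}'.format(txn.stat()['entries']))  # should equal the image count in train.txt
        key, value = next(txn.cursor().iternext())
        datum = caffe.proto.caffe_pb2.Datum()
        datum.ParseFromString(value)                        # each record is a serialized Datum
        print('label: {}, shape: {}x{}x{}'.format(
            datum.label, datum.channels, datum.height, datum.width))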

    3 Defining the Network

    Caffe takes its network model as a prototxt file, and the syntax for defining Caffe networks is documented in detail. This experiment uses AlexNet, saved as train_val.prototxt:

    name: "AlexNet"
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      data_param {
        source: "/home/caf/wk/learn/data/lmdb/face_train_lmdb"
        batch_size: 256
        backend: LMDB
      }
    }
    layer {
      name: "data"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TEST
      }
      data_param {
        source: "/home/caf/wk/learn/data/lmdb/face_val_lmdb"
        batch_size: 50
        backend: LMDB
      }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 96
        kernel_size: 11
        stride: 4
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "conv1"
      top: "conv1"
    }
    layer {
      name: "norm1"
      type: "LRN"
      bottom: "conv1"
      top: "norm1"
      lrn_param {
        local_size: 5
        alpha: 0.0001
        beta: 0.75
      }
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "norm1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layer {
      name: "conv2"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 256
        pad: 2
        kernel_size: 5
        group: 2
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu2"
      type: "ReLU"
      bottom: "conv2"
      top: "conv2"
    }
    layer {
      name: "norm2"
      type: "LRN"
      bottom: "conv2"
      top: "norm2"
      lrn_param {
        local_size: 5
        alpha: 0.0001
        beta: 0.75
      }
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "norm2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layer {
      name: "conv3"
      type: "Convolution"
      bottom: "pool2"
      top: "conv3"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 384
        pad: 1
        kernel_size: 3
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu3"
      type: "ReLU"
      bottom: "conv3"
      top: "conv3"
    }
    layer {
      name: "conv4"
      type: "Convolution"
      bottom: "conv3"
      top: "conv4"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 384
        pad: 1
        kernel_size: 3
        group: 2
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu4"
      type: "ReLU"
      bottom: "conv4"
      top: "conv4"
    }
    layer {
      name: "conv5"
      type: "Convolution"
      bottom: "conv4"
      top: "conv5"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        group: 2
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu5"
      type: "ReLU"
      bottom: "conv5"
      top: "conv5"
    }
    layer {
      name: "pool5"
      type: "Pooling"
      bottom: "conv5"
      top: "pool5"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layer {
      name: "fc6"
      type: "InnerProduct"
      bottom: "pool5"
      top: "fc6"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 4096
        weight_filler {
          type: "gaussian"
          std: 0.005
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu6"
      type: "ReLU"
      bottom: "fc6"
      top: "fc6"
    }
    layer {
      name: "drop6"
      type: "Dropout"
      bottom: "fc6"
      top: "fc6"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      name: "fc7"
      type: "InnerProduct"
      bottom: "fc6"
      top: "fc7"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 4096
        weight_filler {
          type: "gaussian"
          std: 0.005
        }
        bias_filler {
          type: "constant"
          value: 0.1
        }
      }
    }
    layer {
      name: "relu7"
      type: "ReLU"
      bottom: "fc7"
      top: "fc7"
    }
    layer {
      name: "drop7"
      type: "Dropout"
      bottom: "fc7"
      top: "fc7"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      name: "fc8"
      type: "InnerProduct"
      bottom: "fc7"
      top: "fc8"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 2
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "accuracy"
      type: "Accuracy"
      bottom: "fc8"
      bottom: "label"
      top: "accuracy"
      include {
        phase: TEST
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "fc8"
      bottom: "label"
      top: "loss"
    }
    layer {
      name: "prob"
      type: "Softmax"
      bottom: "fc8"
      top: "prob"
    }
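    As a quick sanity check that the prototxt parses (optional; a sketch only, and the exact Net constructor may vary with pycaffe version), the net can be loaded through pycaffe and its blob shapes printed. Note the Data layers open their LMDB sources on load, so the files from step 2 must already exist:

    import sys
    sys.path.insert(0, '/home/caf/object/caffe/python')
    import caffe

    # Load the TRAIN-phase net and print the output shape of every blob.
    net = caffe.Net('train_val.prototxt', caffe.TRAIN)
    for name, blob in net.blobs.items():
        print('{:<10} {}'.format(name, blob.data.shape))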

    Create the hyperparameter file solver.prototxt, which defines the training settings: the number of iterations, how often a model snapshot is saved, the learning rate, and so on. net points to the network just defined; here training and testing use the same network.

    net: "train_val.prototxt"
    test_iter: 2
    test_interval: 10
    base_lr: 0.001
    lr_policy: "step"
    gamma: 0.1
    stepsize: 100
    display: 20
    max_iter: 100
    momentum: 0.9
    weight_decay: 0.005
    solver_mode: GPU
    snapshot: 20
    snapshot_prefix: "model/"
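    With lr_policy: "step", Caffe computes the rate as lr = base_lr * gamma^floor(iter / stepsize), i.e. the rate drops by a factor of gamma every stepsize iterations. A small sketch of the schedule these settings produce:

    # Reproduce the "step" learning-rate schedule defined in solver.prototxt.
    base_lr, gamma, stepsize, max_iter = 0.001, 0.1, 100, 100

    for it in range(0, max_iter + 1, 20):
        lr = base_lr * gamma ** (it // stepsize)
        print('iter {:>3}: lr = {}'.format(it, lr))
    # Since max_iter equals stepsize here, the rate only steps once, at iteration 100.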

    4 Training the Model

    Create train.sh and train on the GPU; on the CPU it is far too slow!

    #!/usr/bin/env sh
    CAFFE_ROOT=/home/caf/object/caffe
    SOLVER_ROOT=/home/caf/wk/learn
    $CAFFE_ROOT/build/tools/caffe train --solver=$SOLVER_ROOT/solver.prototxt --gpu=0

    The caffemodel files are written to the model folder; these are the files used later for image classification and other tasks.
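    If training is interrupted, it can be resumed from the matching solverstate snapshot written alongside each caffemodel; a sketch (the iteration number below is illustrative):

    # Resume training from a snapshot; _iter_80.solverstate is a hypothetical file name.
    $CAFFE_ROOT/build/tools/caffe train \
        --solver=$SOLVER_ROOT/solver.prototxt \
        --snapshot=$SOLVER_ROOT/model/_iter_80.solverstate \
        --gpu=0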

    5 Testing

    Create deploy.prototxt for testing. It is the same network as for training, except that a network used for actual classification no longer needs the training-only parameters, so a separate model file is defined; test images are run through this model.

    deploy.prototxt differs from train_val.prototxt in the following ways:

    (1) The input no longer comes from LMDB and is no longer split into training and test sets. The input layer type is Input, and the dimensions it defines must match those of the training data, 227*227, or an error is raised;

    (2) weight_filler and bias_filler are removed; those parameters already live in the caffemodel, which handles initialization.

    (3) The final Accuracy and loss layers are removed and replaced with a Softmax layer, which outputs the probability of each class.

    name: "AlexNet"
    layer {
      name: "data"
      type: "Input"
      top: "data"
      input_param { shape: { dim: 1 dim: 3 dim: 227 dim: 227 } }
    }
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 96
        kernel_size: 11
        stride: 4
      }
    }
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "conv1"
      top: "conv1"
    }
    layer {
      name: "norm1"
      type: "LRN"
      bottom: "conv1"
      top: "norm1"
      lrn_param {
        local_size: 5
        alpha: 0.0001
        beta: 0.75
      }
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "norm1"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layer {
      name: "conv2"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 256
        pad: 2
        kernel_size: 5
        group: 2
      }
    }
    layer {
      name: "relu2"
      type: "ReLU"
      bottom: "conv2"
      top: "conv2"
    }
    layer {
      name: "norm2"
      type: "LRN"
      bottom: "conv2"
      top: "norm2"
      lrn_param {
        local_size: 5
        alpha: 0.0001
        beta: 0.75
      }
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "norm2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layer {
      name: "conv3"
      type: "Convolution"
      bottom: "pool2"
      top: "conv3"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 384
        pad: 1
        kernel_size: 3
      }
    }
    layer {
      name: "relu3"
      type: "ReLU"
      bottom: "conv3"
      top: "conv3"
    }
    layer {
      name: "conv4"
      type: "Convolution"
      bottom: "conv3"
      top: "conv4"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 384
        pad: 1
        kernel_size: 3
        group: 2
      }
    }
    layer {
      name: "relu4"
      type: "ReLU"
      bottom: "conv4"
      top: "conv4"
    }
    layer {
      name: "conv5"
      type: "Convolution"
      bottom: "conv4"
      top: "conv5"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 256
        pad: 1
        kernel_size: 3
        group: 2
      }
    }
    layer {
      name: "relu5"
      type: "ReLU"
      bottom: "conv5"
      top: "conv5"
    }
    layer {
      name: "pool5"
      type: "Pooling"
      bottom: "conv5"
      top: "pool5"
      pooling_param {
        pool: MAX
        kernel_size: 3
        stride: 2
      }
    }
    layer {
      name: "fc6"
      type: "InnerProduct"
      bottom: "pool5"
      top: "fc6"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 4096
      }
    }
    layer {
      name: "relu6"
      type: "ReLU"
      bottom: "fc6"
      top: "fc6"
    }
    layer {
      name: "drop6"
      type: "Dropout"
      bottom: "fc6"
      top: "fc6"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      name: "fc7"
      type: "InnerProduct"
      bottom: "fc6"
      top: "fc7"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 4096
      }
    }
    layer {
      name: "relu7"
      type: "ReLU"
      bottom: "fc7"
      top: "fc7"
    }
    layer {
      name: "drop7"
      type: "Dropout"
      bottom: "fc7"
      top: "fc7"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      name: "fc8"
      type: "InnerProduct"
      bottom: "fc7"
      top: "fc8"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 2
      }
    }
    layer {
      name: "prob"
      type: "Softmax"
      bottom: "fc8"
      top: "prob"
    }

    The Python code for running classification uses Caffe's Python interface. The key points are the locations of the trained weights, the model definition file, and the mean file.

    import numpy as np
    import matplotlib.pyplot as plt

    import sys
    caffe_root = "/home/caf/object/caffe/"
    sys.path.insert(0, caffe_root + 'python')
    import caffe

    caffe.set_device(0)
    caffe.set_mode_gpu()
    model_def = 'deploy.prototxt'
    model_weights = 'model/_iter_100.caffemodel'
    net = caffe.Net(model_def,      # deployment architecture
                    model_weights,  # trained weights
                    caffe.TEST)     # test mode (no dropout)
    # Per-channel mean from the ImageNet mean file shipped with Caffe.
    mu = np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
    mu = mu.mean(1).mean(1)
    # print('mean-subtracted values:', zip('BGR', mu))
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))     # HxWxC -> CxHxW
    transformer.set_mean('data', mu)                 # subtract the channel means
    transformer.set_raw_scale('data', 255)           # [0,1] -> [0,255]
    transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR
    net.blobs['data'].reshape(1, 3, 227, 227)        # batch of one 227x227 image
    image = caffe.io.load_image('test.jpg')
    transformed_image = transformer.preprocess('data', image)
    # plt.imshow(image)
    # plt.show()
    net.blobs['data'].data[...] = transformed_image
    output = net.forward()
    output_prob = output['prob']
    print(output_prob)
    print('predicted class is: {}'.format(output_prob.argmax()))
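    The same net and transformer can be reused to label a whole folder of images; a sketch (testdir is a hypothetical directory of jpg files):

    import os

    # Classify every image in a directory, reusing the net and transformer above.
    for fname in sorted(os.listdir('testdir')):
        image = caffe.io.load_image(os.path.join('testdir', fname))
        net.blobs['data'].data[...] = transformer.preprocess('data', image)
        prob = net.forward()['prob']
        print('{}: class {} (p={:.3f})'.format(fname, prob.argmax(), prob.max()))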

    Problems Encountered

    (1) The label files must use spaces, not tabs; otherwise the data files will not be found.

    (2) CUDA problems: an error whose message mentions something like cudaSuccess means the GPU has run out of memory and space must be freed. Use the nvidia-smi command to see which process is using too much of the GPU, then end it with kill -9 PID.

    (3) Because of Caffe version differences, layers may be defined with either layer or layers. With layer, the type must be a double-quoted string; with layers, the type is unquoted and written in all capital letters, as shown below.
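    For example, the same layer under the two syntaxes (sketches; a given Caffe version accepts only one of them):

    # Newer syntax: "layer", with the type as a quoted string.
    layer {
      name: "relu1"
      type: "ReLU"
      bottom: "conv1"
      top: "conv1"
    }

    # Legacy syntax: "layers", with the type as an unquoted, all-caps enum.
    layers {
      name: "relu1"
      type: RELU
      bottom: "conv1"
      top: "conv1"
    }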
