

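    Converting a Caffe model to PyTorch: LSTM
    Original post: https://www.cnblogs.com/yanghailin/p/15599428.html. Please do not repost without permission.
    Summary first:
    The details are in the blog post:
    https://www.cnblogs.com/yanghailin/p/15599428.html
    The goal is to extract the weights from a Caffe model, build the equivalent PyTorch network, and get the LSTM layer converted.
    Environment: pytorch1.0, cuda8.0, libtorch1.0
    Under PyTorch 1.0 the conversion works and the outputs match Caffe. Exporting the .pt model for libtorch also completes without any error or warning.
    The libtorch output, however, is wrong. I traced the mismatch to the LSTM layer; the layer right before it still matches. I found no fix.
    I then repeated the experiment in a PyTorch 1.1.0 environment, and there it works.
    The issue I filed on GitHub:
    https://github.com/pytorch/pytorch/issues/68864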

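    I had previously converted several networks from Caffe to PyTorch:
    refinenet https://www.cnblogs.com/yanghailin/p/13096258.html
    refinedet https://www.cnblogs.com/yanghailin/p/12965695.html
    The first extracts the Caffe weights and then converts to libtorch. The second converts the corresponding PyTorch version directly to libtorch, with most of the post-processing done in libtorch. A colleague later also managed to go from Caffe weights straight to libtorch.
    All of these, without exception, required building Caffe's Python interface. In a typical engineering setup, though, we only use Caffe from C++ and sometimes there is no matching Python project, so building and calling the Python interface is a hassle.
    Later I wondered why we take this detour at all. Couldn't we just use the C++ project that already runs Caffe forward inference?
    We can. It is just that the Caffe source is complex and I could not follow it at first.
    This series of posts extracts the weights directly from the Caffe C++ project, builds the same network in PyTorch, and fills in the Caffe weights so that forward inference runs right away.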

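    Here is how I set it up. I first built the CPU version of Caffe with LSTM support so that I could debug it in CLion. In /caffe_ocr/tools/caffe.cpp I deleted the original contents and replaced them with the LSTM forward-inference code, then built Caffe from that source.
    After that I could set breakpoints and step through the code.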

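    The Caffe source is a highly abstract project. Layer is the base class, and every other algorithm module derives from it.
    The Net class is very important: it manages and coordinates the whole network. Through Net you can get every intermediate feature map and the weights of every layer.
    Since my goal was to port the LSTM to PyTorch, understanding how this operator is implemented was essential. At first glance it was unclear, and a closer look left me stunned: the LSTM implementation is really complex. It builds a Net of its own internally! A bidirectional LSTM is simply two such nets. The layer derives from the RecurrentLayer class.
    As for the theory, an LSTM comes down to those six equations; these posts cover them:
    https://www.jianshu.com/p/9dc9f41f0b29
    https://colah.github.io/posts/2015-08-Understanding-LSTMs/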





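    This post does not go into the Caffe source or the LSTM implementation in detail. I may write a separate post on that later.
    What it does cover is how to extract the weights of each layer from a caffemodel. A weight is usually a large array, for example of shape [64,3,7,7], and it has to be saved in a form Python can read.
    At first I looked for a way to handle arrays in C++ as conveniently as with numpy in Python. I considered JSON, XML, or Caffe's own Blob class, but I did not know how to use them for this. Caffe's proto format should work too, but again I did not know how.
    So I took the most direct route: write the weights to a local txt file, one value per line. The file is named after the layer, so for a layer named conv1 the files are conv1_weight_0.txt and conv1_weight_1.txt. The first line holds the shape, for example 64,3,7,7.
    Weights are stored as blobs too, so I added a function to the Blob source that saves the blob's data to a local txt file. It only needs the output path: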

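    // Added inside the Blob class (needs <fstream> and <iostream>).
    // Writes the comma-separated shape on the first line (optional), then one value per line.
    void save_data_to_txt(const string& path_txt,bool b_save_shape = true)
      {
        std::ofstream fOut(path_txt);
        if (!fOut)
        {
          std::cout << "Open output file failed." << std::endl;
          return;  // nothing can be written without an open file
        }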
        if(b_save_shape)
        {
          for(int i=0;i<shape_.size();i++)
          {
            fOut << shape_[i];
            if(i == shape_.size()-1)
            {
              fOut<<std::endl;
            }else
            {
              fOut<<",";
            }
          }
        }
    
        const Dtype* data_vec = cpu_data();
        for (int i = 0; i < count_; ++i) {
          fOut << data_vec[i] << std::endl;
        }
        fOut.close();
      }
    

    Here is my code that saves the weights of every layer to txt:

     std::cout<<"\n\n\n\n============2021-11-18======================================="<<std::endl;
          shared_ptr<Net<float> > net_ = classifier.get_net(); // get the Net from the class that runs forward inference
          vector<shared_ptr<Layer<float> > >  layers = net_->layers(); // pointers to every Layer
          vector<shared_ptr<Blob<float> > > params = net_->params();// pointers to all weights
          vector<vector<Blob<float>*> > bottom_vecs_ = net_->bottom_vecs();// all bottom feature maps
          vector<vector<Blob<float>*> > top_vecs_ = net_->top_vecs();// all top feature maps; note that layers, bottom_vecs_ and top_vecs_ are index-aligned
          std::cout<<"size layer=" << layers.size()<<std::endl;
          std::cout<<"size params=" << params.size()<<std::endl;
          string path_save_dir = "/data_1/Yang/project/save_weight/";
    
          for(int i=0;i<layers.size();i++)
          {
              shared_ptr<Layer<float> > layer = layers[i];
              string name_layer = layer->layer_param().name();//当前层层名
              std::cout<<i<<"   layer_name="<<name_layer<<"    type="<<layer->layer_param().type()<<std::endl;
              int bottom_name_size = layer->layer_param().bottom().size();
              std::cout<<"=================bottom================"<<std::endl;
              if(bottom_name_size>0)
              {
                  for(int ii=0;ii<bottom_name_size;ii++)
                  {
                      std::cout<<ii<<" ::bottom name="<<layer->layer_param().bottom(ii)<<std::endl;
                      Blob<float>* ptr_blob = bottom_vecs_[i][ii];
                      std::cout<<"bottom shape="<<ptr_blob->shape_string()<<std::endl;
                  }
              } else{
                  std::cout<<"no bottom"<<std::endl;
              }
              std::cout<<"=================top================"<<std::endl;
              int top_name_size = layer->layer_param().top().size();
              if(top_name_size>0)
              {
                  for(int ii=0;ii<top_name_size;ii++)
                  {
                      std::cout<<ii<<" ::top name="<<layer->layer_param().top(ii)<<std::endl;
                      Blob<float>* ptr_blob = top_vecs_[i][ii];
                      std::cout<<"top shape="<<ptr_blob->shape_string()<<std::endl;
                  }
              } else{
                  std::cout<<"no top"<<std::endl;
              }
    
    
              vector<shared_ptr<Blob<float> > > params = layer->blobs();
              std::cout<<"=================params ================"<<std::endl;
              std::cout<<"params size= "<<params.size()<<std::endl;
              if(0 == params.size())
              {
                  std::cout<<"has no params"<<std::endl;
              } else
              {
                  for(int j=0;j<params.size();j++)
                  {
                      std::cout<<"params_"<<j<<" shape="<<params[j]->shape_string()<<std::endl;
    
                      params[j]->save_data_to_txt(path_save_dir + name_layer + "_weight_" + std::to_string(j)+".txt");
                  }
              }
              std::cout<<std::endl;
          }
    
    
          // To compare one layer's output between Caffe and PyTorch, first save that layer's feature map from Caffe.
          string name_aim_top = "premuted_fc";
          const shared_ptr<Blob<float>> feature_map = net_->blob_by_name(name_aim_top);
          bool b_save_shape = false;
          std::cout<<"featuremap shape="<<std::endl;
          std::cout<<feature_map->shape_string()<<std::endl;
          feature_map->save_data_to_txt("/data_1/Yang/project/myfile/blob_val/"+name_aim_top+".txt",b_save_shape);
    

    To inspect a Caffe network, you can paste the prototxt file into this web page:
    http://ethereon.github.io/netscope/quickstart.html
    The graph view is much easier to read.

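    One thing needs special attention here: in-place operations. Take conv1, conv1_bn, conv1_scale and conv1_relu, which are chained together in the network graph. Their bottom and top have the same name, so each layer's result directly overwrites its bottom; they all share one block of memory.
    This is a trap. A colleague doing similar work was comparing precision across frameworks and found that the first few layers already disagreed. He searched for a week without finding the cause and finally asked me to take a look. It took me most of a day to realize that in-place operations were responsible: you cannot get the feature map of conv1 on its own, because what you read back has already gone through all four steps conv1, conv1_bn, conv1_scale and conv1_relu!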
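
    The in-place pitfall can be reproduced with a toy model. The sketch below is not Caffe's API: run_net, the layer tuples and the arithmetic are made up for illustration. It mimics only the one relevant rule, namely that blobs are looked up by name, so a layer whose top has the same name as its bottom overwrites that blob.

```python
# Toy model of blob-by-name storage (hypothetical, not Caffe's API).
# Each layer is (layer_name, bottom_name, top_name, function).

def run_net(layers, input_value):
    """Run layers in order, storing every output under its top name."""
    blobs = {"data": input_value}
    for _name, bottom, top, fn in layers:
        blobs[top] = fn(blobs[bottom])  # same top name -> old value is overwritten
    return blobs


# conv1_bn, conv1_scale and conv1_relu all write back into the blob "conv1".
in_place_net = [
    ("conv1",       "data",  "conv1", lambda x: x - 5.0),        # stand-in for a convolution
    ("conv1_bn",    "conv1", "conv1", lambda x: x / 2.0),        # stand-in for BatchNorm
    ("conv1_scale", "conv1", "conv1", lambda x: 3.0 * x + 1.0),  # stand-in for Scale
    ("conv1_relu",  "conv1", "conv1", lambda x: max(x, 0.0)),    # ReLU
]

# Same functions, but every top gets its own name, so nothing is overwritten.
separate_net = [
    ("conv1",       "data",        "conv1",       in_place_net[0][3]),
    ("conv1_bn",    "conv1",       "conv1_bn",    in_place_net[1][3]),
    ("conv1_scale", "conv1_bn",    "conv1_scale", in_place_net[2][3]),
    ("conv1_relu",  "conv1_scale", "conv1_relu",  in_place_net[3][3]),
]

in_place_blobs = run_net(in_place_net, 1.0)
separate_blobs = run_net(separate_net, 1.0)

print(in_place_blobs["conv1"])   # value after all four layers
print(separate_blobs["conv1"])   # value right after conv1 only
```

    When comparing against PyTorch, the Caffe blob named conv1 should therefore be matched with the PyTorch output after the ReLU, not with the raw convolution output.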

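    The code above produces the weight files for every layer. If a layer has several weights, a counter 0, 1, 2 at the end of the file name tells them apart; the naming scheme is layerName + _weight_ + cnt + .txt. The first line of each txt file is the shape of the weight, for example 64,64,1,1.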
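
    The file layout can be exercised without Caffe. The following sketch writes a tiny blob in the same layout and reads it back. write_blob_txt, read_blob_txt and parse_weight_filename are hypothetical helper names, not part of the original code, and the values are made up.

```python
import os
import tempfile


def write_blob_txt(path, shape, values):
    """Write the layout used in this post: shape line, then one value per line."""
    with open(path, "w") as f:
        f.write(",".join(str(d) for d in shape) + "\n")
        for v in values:
            f.write(repr(float(v)) + "\n")


def read_blob_txt(path):
    """Return (shape tuple, flat list of floats) from such a file."""
    with open(path) as f:
        shape = tuple(int(d) for d in f.readline().strip().split(","))
        values = [float(line) for line in f if line.strip()]
    count = 1
    for d in shape:
        count *= d
    if count != len(values):
        raise ValueError("shape %s does not match %d values" % (shape, len(values)))
    return shape, values


def parse_weight_filename(filename):
    """Split 'conv1_weight_0.txt' into ('conv1', 0)."""
    stem = filename[:-len(".txt")] if filename.endswith(".txt") else filename
    layer_name, idx = stem.rsplit("_weight_", 1)
    return layer_name, int(idx)


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "conv1_weight_0.txt")
write_blob_txt(path, (2, 3), [0.5, -1.0, 2.0, 3.5, 4.0, -0.25])
shape, values = read_blob_txt(path)
layer_name, idx = parse_weight_filename(os.path.basename(path))
print(layer_name, idx, shape, values)
```

    Checking the value count against the shape line catches truncated files early, before a reshape fails further down the pipeline.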

    With that done, on the Python side I first wrote a script that reads the txt files and stores the weights in a dictionary.

    import os
    import numpy as np
    
    # This class allows assignment into nested dictionaries
    class AutoVivification(dict):
        """Implementation of perl's autovivification feature."""
        def __getitem__(self, item):
            try:
                return dict.__getitem__(self, item)
            except KeyError:
                value = self[item] = type(self)()
                return value
    
    
    def get_weight_numpy(path_dir):
        out = AutoVivification()
        list_txt = os.listdir(path_dir)
        for cnt,txt in enumerate(list_txt):
            print(cnt, "  ", txt)
            txt_ = txt.replace(".txt","")
            layer_name, idx = txt_.split("_weight_")
            path_txt = os.path.join(path_dir, txt)
            with open(path_txt, 'r') as fr:
                lines = fr.readlines()
                data = []
                shape_line = []
                for cnt_1, line in enumerate(lines):
                    if(0 == cnt_1):
                        # first line: comma-separated shape, e.g. 64,3,7,7
                        shape_line = line.strip().split(",")
                    else:
                        data.append(float(line))
    
                # int() instead of eval(); a tuple also works under Python 3,
                # where map() returns an iterator that reshape() rejects
                shape = tuple(int(s) for s in shape_line)
                data = np.array(data).reshape(shape)
                out[layer_name][int(idx)] = data
    
        return out
    
    if __name__ == "__main__":
        path_dir = "/data_1/Yang/project/save_weight/"
        out = get_weight_numpy(path_dir)
        conv1_weight = out['conv1'][0]
        conv1_bias = out['conv1'][1]
    

    Here is the code that loads the weights saved from Caffe into the PyTorch layers I built:

    # coding=utf-8
    import torch
    import torchvision
    from torch import nn
    import torch.nn.functional as F
    
    import cv2
    import numpy as np
    from weight_numpy import get_weight_numpy
    
    
    
    class lstm_general(nn.Module):  # SfSNet = PS-Net in SfSNet_deploy.prototxt
        def __init__(self):
            super(lstm_general, self).__init__()
            # self.conv1_1 = nn.Conv2d(3, 64, 3, 1, 1)
            self.data_bn = nn.BatchNorm2d(3)
            self.conv1 = nn.Conv2d(3, 64, 7, 2, 3)
            self.conv1_bn = nn.BatchNorm2d(64)
    
            self.conv1_pool = nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True)
    
            self.layer_64_1_conv1 = nn.Conv2d(64, 64, 1, 1, 0, bias = False)
            self.layer_64_1_bn2 = nn.BatchNorm2d(64)
    
            self.layer_64_1_conv2 = nn.Conv2d(64, 64, 3, 1, 1, bias=False)
            self.layer_64_1_bn3 = nn.BatchNorm2d(64)
    
            self.layer_64_1_conv3 = nn.Conv2d(64, 256, 1, 1, 0, bias=False)
            self.layer_64_1_conv_expand = nn.Conv2d(64, 256, 1, 1, 0, bias=False)
    
            self.layer_128_1_bn1 = nn.BatchNorm2d(256)
    
            self.layer_128_1_conv1 = nn.Conv2d(256, 128, 1, 1, 0, bias=False)
            self.layer_128_1_bn2 = nn.BatchNorm2d(128)
    
            self.layer_128_1_conv2 = nn.Conv2d(128, 128, 3, 1, 1, bias=False)
            self.layer_128_1_bn3 = nn.BatchNorm2d(128)
    
            self.layer_128_1_conv3 = nn.Conv2d(128, 512, 1, 1, 0, bias=False)
            self.layer_128_1_conv_expand = nn.Conv2d(256, 512, 1, 1, 0, bias=False)
    
            self.last_bn = nn.BatchNorm2d(512)
    
    
            # self.lstm_1 = nn.LSTM(512 * 8, 100, 1, bidirectional=False)
            self.lstm_lr = nn.LSTM(512 * 8, 100, 1, bidirectional=True)
    
            self.fc1x1_r2_v2_a = nn.Linear(200,7118)
    
    
        def forward(self, inputs):
            # x = F.relu(self.bn1_1(self.conv1_1(inputs)))
            x = self.data_bn(inputs)
            x = F.relu(self.conv1_bn(self.conv1(x)))
            x = self.conv1_pool(x) #[1,64,8,80]
    
            x = F.relu(self.layer_64_1_bn2(self.layer_64_1_conv1(x)))  # 1 64 8 80
            layer_64_1_conv1 = x
    
            x = F.relu(self.layer_64_1_bn3(self.layer_64_1_conv2(x)))
    
            x = self.layer_64_1_conv3(x)
    
            layer_64_1_conv_expand = self.layer_64_1_conv_expand(layer_64_1_conv1)
            layer_64_3_sum = x + layer_64_1_conv_expand  #1 256 8 80
    
            x = F.relu(self.layer_128_1_bn1(layer_64_3_sum))
            layer_128_1_bn1 = x
    
            x = F.relu(self.layer_128_1_bn2(self.layer_128_1_conv1(x)))
            x = F.relu(self.layer_128_1_bn3(self.layer_128_1_conv2(x)))
            x = self.layer_128_1_conv3(x) #1, 512, 8, 80
            layer_128_1_conv_expand = self.layer_128_1_conv_expand(layer_128_1_bn1)  #1, 512, 8, 80
            layer_128_4_sum = x + layer_128_1_conv_expand
    
            x = F.relu(self.last_bn(layer_128_4_sum))
            x = F.dropout(x, p=0.7, training=False) #1 512 8 80
            x = x.permute(3,0,1,2) # 80 1 512 8
            x = x.reshape(80,1,512*8)
            #
            # merge_lstm_rlstmx, (hn, cn) = self.lstm_r(x)
    
            lstm_out,(_,_) = self.lstm_lr(x) #(80,1,200)
            out = self.fc1x1_r2_v2_a(lstm_out) #(80,1,7118)
    
            return out
    
    
    
    def save_tensor(tensor_in,path_save):
        tensor_in = tensor_in.contiguous().view(-1,1)
        np_tensor = tensor_in.cpu().detach().numpy()
        # np_tensor = np_tensor.view()
        np.savetxt(path_save,np_tensor,fmt='%.12e')
    
    
    
    def access_pixels(frame):
        print(frame.shape)  # shape has three elements, in order: height, width, channels
        height = frame.shape[0]
        width = frame.shape[1]
        channels = frame.shape[2]
        print("width : %s, height : %s, channel : %s" % (width, height, channels))
    
        with open("/data_1/Yang/project/myfile/blob_val/img_stand_python.txt", "w") as fw:
            for row in range(height):  # iterate over rows
                for col in range(width):  # iterate over columns
                    for c in range(channels):  # iterate over channels
                        pv = frame[row, col, c]
                        fw.write(str(int(pv)))
                        fw.write("\n")
    
    
    
    
    def LstmImgStandardization(img, ratio=10.0, stand_w=320, stand_h=32):
        img_h, img_w, _ = img.shape
        if img_h < 2 or img_w < 2:
            return
        # if 32 == img_h and 320 == img_w:
        #     return img
    
        ratio_now = img_w * 1.0 / img_h
        if ratio_now <= ratio:
            mask = np.ones((img_h, int(img_h * ratio), 3), dtype=np.uint8) * 255
            mask[0:img_h,0:img_w,:] = img
        else:
            mask = np.ones((int(img_w*1.0/ratio), img_w, 3), dtype=np.uint8) * 255
            mask[0:img_h, 0:img_w, :] = img
    
        mask_stand = cv2.resize(mask,(stand_w, stand_h),interpolation=cv2.INTER_LINEAR)
    
        # access_pixels(mask_stand)
        return mask_stand
    
    
    
    
    if __name__ == '__main__':
    
        device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
    
        net = lstm_general()
        # net.eval()
    
        index = 0
        print("*" * 50)
        for name, param in list(net.named_parameters()):
            print(str(index) + ':', name, param.size())
            index += 1
        print("*" * 50)
    
        ## once the network is built, this shows the parameter names it expects
        for k, v in net.state_dict().items():
            print(k)
            print(v.shape)
    
            # print(k,v)
        print("@" * 50)
    
        # aaa = np.zeros((400,1))
    
    
    
    
    
    
        path_dir = "/data_1/Yang/project/OCR/3rdlib/caffe_ocr_2021/myfile/save_weight/"
        weight_numpy_dict = get_weight_numpy(path_dir)
        from torch import from_numpy
        state_dict = {}
        state_dict['data_bn.running_mean'] = from_numpy(weight_numpy_dict["data_bn"][0] / weight_numpy_dict["data_bn"][2])
        state_dict['data_bn.running_var'] = from_numpy(weight_numpy_dict["data_bn"][1] / weight_numpy_dict["data_bn"][2])
        state_dict['data_bn.weight'] = from_numpy(weight_numpy_dict['data_scale'][0])
        state_dict['data_bn.bias'] = from_numpy(weight_numpy_dict['data_scale'][1])
    
        state_dict['conv1.weight'] = from_numpy(weight_numpy_dict['conv1'][0])
        state_dict['conv1.bias'] = from_numpy(weight_numpy_dict['conv1'][1])
        state_dict['conv1_bn.running_mean'] = from_numpy(weight_numpy_dict["conv1_bn"][0] / weight_numpy_dict["conv1_bn"][2])
        state_dict['conv1_bn.running_var'] = from_numpy(weight_numpy_dict["conv1_bn"][1] / weight_numpy_dict["conv1_bn"][2])
        state_dict['conv1_bn.weight'] = from_numpy(weight_numpy_dict['conv1_scale'][0])
        state_dict['conv1_bn.bias'] = from_numpy(weight_numpy_dict['conv1_scale'][1])
    
        state_dict['layer_64_1_conv1.weight'] = from_numpy(weight_numpy_dict['layer_64_1_conv1'][0])
        state_dict['layer_64_1_bn2.running_mean'] = from_numpy(weight_numpy_dict["layer_64_1_bn2"][0] / weight_numpy_dict["layer_64_1_bn2"][2])
        state_dict['layer_64_1_bn2.running_var'] = from_numpy(weight_numpy_dict["layer_64_1_bn2"][1] / weight_numpy_dict["layer_64_1_bn2"][2])
        state_dict['layer_64_1_bn2.weight'] = from_numpy(weight_numpy_dict['layer_64_1_scale2'][0])
        state_dict['layer_64_1_bn2.bias'] = from_numpy(weight_numpy_dict['layer_64_1_scale2'][1])
    
    
        state_dict['layer_64_1_conv2.weight'] = from_numpy(weight_numpy_dict['layer_64_1_conv2'][0])
        state_dict['layer_64_1_bn3.running_mean'] = from_numpy(weight_numpy_dict["layer_64_1_bn3"][0] / weight_numpy_dict["layer_64_1_bn3"][2])
        state_dict['layer_64_1_bn3.running_var'] = from_numpy(weight_numpy_dict["layer_64_1_bn3"][1] / weight_numpy_dict["layer_64_1_bn3"][2])
        state_dict['layer_64_1_bn3.weight'] = from_numpy(weight_numpy_dict['layer_64_1_scale3'][0])
        state_dict['layer_64_1_bn3.bias'] = from_numpy(weight_numpy_dict['layer_64_1_scale3'][1])
    
        state_dict['layer_64_1_conv3.weight'] = from_numpy(weight_numpy_dict['layer_64_1_conv3'][0])
        state_dict['layer_64_1_conv_expand.weight'] = from_numpy(weight_numpy_dict['layer_64_1_conv_expand'][0])
    
        state_dict['layer_128_1_bn1.running_mean'] = from_numpy(weight_numpy_dict["layer_128_1_bn1"][0] / weight_numpy_dict["layer_128_1_bn1"][2])
        state_dict['layer_128_1_bn1.running_var'] = from_numpy(weight_numpy_dict["layer_128_1_bn1"][1] / weight_numpy_dict["layer_128_1_bn1"][2])
        state_dict['layer_128_1_bn1.weight'] = from_numpy(weight_numpy_dict['layer_128_1_scale1'][0])
        state_dict['layer_128_1_bn1.bias'] = from_numpy(weight_numpy_dict['layer_128_1_scale1'][1])
    
        state_dict['layer_128_1_conv1.weight'] = from_numpy(weight_numpy_dict['layer_128_1_conv1'][0])
        state_dict['layer_128_1_bn2.running_mean'] = from_numpy(weight_numpy_dict["layer_128_1_bn2"][0] / weight_numpy_dict["layer_128_1_bn2"][2])
        state_dict['layer_128_1_bn2.running_var'] = from_numpy(weight_numpy_dict["layer_128_1_bn2"][1] / weight_numpy_dict["layer_128_1_bn2"][2])
        state_dict['layer_128_1_bn2.weight'] = from_numpy(weight_numpy_dict['layer_128_1_scale2'][0])
        state_dict['layer_128_1_bn2.bias'] = from_numpy(weight_numpy_dict['layer_128_1_scale2'][1])
    
        state_dict['layer_128_1_conv2.weight'] = from_numpy(weight_numpy_dict['layer_128_1_conv2'][0])
        state_dict['layer_128_1_bn3.running_mean'] = from_numpy(weight_numpy_dict["layer_128_1_bn3"][0] / weight_numpy_dict["layer_128_1_bn3"][2])
        state_dict['layer_128_1_bn3.running_var'] = from_numpy(weight_numpy_dict["layer_128_1_bn3"][1] / weight_numpy_dict["layer_128_1_bn3"][2])
        state_dict['layer_128_1_bn3.weight'] = from_numpy(weight_numpy_dict['layer_128_1_scale3'][0])
        state_dict['layer_128_1_bn3.bias'] = from_numpy(weight_numpy_dict['layer_128_1_scale3'][1])
    
        state_dict['layer_128_1_conv3.weight'] = from_numpy(weight_numpy_dict['layer_128_1_conv3'][0])
        state_dict['layer_128_1_conv_expand.weight'] = from_numpy(weight_numpy_dict['layer_128_1_conv_expand'][0])
    
        state_dict['last_bn.running_mean'] = from_numpy(weight_numpy_dict["last_bn"][0] / weight_numpy_dict["last_bn"][2])
        state_dict['last_bn.running_var'] = from_numpy(weight_numpy_dict["last_bn"][1] / weight_numpy_dict["last_bn"][2])
        state_dict['last_bn.weight'] = from_numpy(weight_numpy_dict['last_scale'][0])
        state_dict['last_bn.bias'] = from_numpy(weight_numpy_dict['last_scale'][1])
    
        ## caffe i f o g
        ## pytorch i f g o
    
        ww = from_numpy(weight_numpy_dict['lstm1x_r2'][0])  # [400,4096]
        ww_200_if = ww[:200,:] #[200,4096]
        ww_100_o = ww[200:300,:] #[100,4096]
        ww_100_g = ww[300:400,:]#[100,4096]
        ww_cat_ifgo = torch.cat((ww_200_if,ww_100_g,ww_100_o),0)
        state_dict['lstm_lr.weight_ih_l0'] = ww_cat_ifgo
    
        bb = from_numpy(weight_numpy_dict['lstm1x_r2'][1])  # [400]
        bb_200_if = bb[:200]
        bb_100_o = bb[200:300]
        bb_100_g = bb[300:400]
        bb_cat_ifgo = torch.cat((bb_200_if, bb_100_g, bb_100_o), 0)
        state_dict['lstm_lr.bias_ih_l0'] = bb_cat_ifgo
    
        ww = from_numpy(weight_numpy_dict['lstm1x_r2'][2])  # [400,100]
        ww_200_if = ww[:200, :]  # [200,100]
        ww_100_o = ww[200:300, :]  # [100,100]
        ww_100_g = ww[300:400, :]  # [100,100]
        ww_cat_ifgo = torch.cat((ww_200_if, ww_100_g, ww_100_o), 0)
        state_dict['lstm_lr.weight_hh_l0'] = ww_cat_ifgo
    
        state_dict['lstm_lr.bias_hh_l0'] = from_numpy(np.zeros((400)))
    
        ##########################################
        ww = from_numpy(weight_numpy_dict['lstm2x_r2'][0])  # [400,4096]
        ww_200_if = ww[:200, :]  # [200,4096]
        ww_100_o = ww[200:300, :]  # [100,4096]
        ww_100_g = ww[300:400, :]  # [100,4096]
        ww_cat_ifgo = torch.cat((ww_200_if, ww_100_g, ww_100_o), 0)
        state_dict['lstm_lr.weight_ih_l0_reverse'] = ww_cat_ifgo
    
        bb = from_numpy(weight_numpy_dict['lstm2x_r2'][1])  # [400]
        bb_200_if = bb[:200]
        bb_100_o = bb[200:300]
        bb_100_g = bb[300:400]
        bb_cat_ifgo = torch.cat((bb_200_if, bb_100_g, bb_100_o), 0)
        state_dict['lstm_lr.bias_ih_l0_reverse'] = bb_cat_ifgo
    
        ww = from_numpy(weight_numpy_dict['lstm2x_r2'][2])  # [400,100]
        ww_200_if = ww[:200, :]  # [200,100]
        ww_100_o = ww[200:300, :]  # [100,100]
        ww_100_g = ww[300:400, :]  # [100,100]
        ww_cat_ifgo = torch.cat((ww_200_if, ww_100_g, ww_100_o), 0)
        state_dict['lstm_lr.weight_hh_l0_reverse'] = ww_cat_ifgo
    
        state_dict['lstm_lr.bias_hh_l0_reverse'] = from_numpy(np.zeros((400)))
    
        state_dict['fc1x1_r2_v2_a.weight'] = from_numpy(weight_numpy_dict['fc1x1_r2_v2_a'][0])
        state_dict['fc1x1_r2_v2_a.bias'] = from_numpy(weight_numpy_dict['fc1x1_r2_v2_a'][1])
    
    
    
        ####input########################################
        path_img = "/data_2/project/1.jpg"
        img = cv2.imread(path_img)
        # access_pixels(img)
    
        img_stand = LstmImgStandardization(img, ratio=10.0, stand_w=320, stand_h=32)
    
    
        img_stand = img_stand.astype(np.float32)
        # img = (img / 255. - config.DATASET.MEAN) / config.DATASET.STD
        img_stand = img_stand.transpose([2, 0, 1])
        img_stand = img_stand[None,:,:,:]
        img_stand = torch.from_numpy(img_stand)
    
        img_stand = img_stand.type(torch.FloatTensor)
    
        img_stand = img_stand.to(device)
        # img_stand = img_stand.view(1, *img.size())
    
    
    
        #######net##########################
        net.load_state_dict(state_dict)
        net.to(device)  # follow the device chosen above instead of assuming CUDA
        net.eval()
    
        preds = net(img_stand)
        print("out shape=",preds.shape)
    
    
        torch.save(net.state_dict(), './lstm_model.pth')
    
    
    
        # name_top_caffe_layer = "fc1x_a"  #"merge_lstm_rlstmx"  #"#"data_bn"
        # path_save = "/data_1/Yang/project/myfile/blob_val/" + name_top_caffe_layer + "_torch.txt"
        # save_tensor(preds, path_save)
    
    
        aaa = 0
    
    

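    Note that a Caffe BatchNorm layer has three parameters. The first two are the accumulated mean and variance; the third is a scale factor, and both the mean and the variance must be divided by it. Here the factor is the fixed value 999.982.

    The Scale layer in Caffe holds the scale and shift coefficients of the batch-norm formula (shown in a figure in the original post); in the code above they are loaded as weight and bias of the PyTorch BatchNorm layer.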
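
    The mapping from Caffe's BatchNorm + Scale pair to one PyTorch BatchNorm layer can be written out for a single channel. This is a sketch with made-up statistics; caffe_bn_to_running_stats and batch_norm_eval are hypothetical names, and the eps of 1e-5 is assumed to be unchanged from the default in both frameworks (check the prototxt if the layer overrides it).

```python
import math


def caffe_bn_to_running_stats(mean_blob, var_blob, scale_factor):
    """Divide Caffe's accumulated mean/variance by the third blob.

    Caffe treats a scale factor of 0 as "no statistics yet" and multiplies
    by 0 instead of dividing, so the same guard is applied here.
    """
    inv = 0.0 if scale_factor == 0 else 1.0 / scale_factor
    running_mean = [m * inv for m in mean_blob]
    running_var = [v * inv for v in var_blob]
    return running_mean, running_var


def batch_norm_eval(x, mean, var, gamma, beta, eps=1e-5):
    """Inference-time batch norm for one value of one channel."""
    return gamma * (x - mean) / math.sqrt(var + eps) + beta


# Made-up accumulated statistics for two channels, using the factor from this post.
scale_factor = 999.982
mean_blob = [499.991, -1999.964]   # accumulated means
var_blob = [3999.928, 999.982]     # accumulated variances
running_mean, running_var = caffe_bn_to_running_stats(mean_blob, var_blob, scale_factor)

# gamma and beta come from the Scale layer that follows the BatchNorm layer.
y = batch_norm_eval(2.5, running_mean[0], running_var[0], gamma=1.5, beta=0.1)
print(running_mean, running_var, y)
```

    Reading the factor from the third blob, rather than hard-coding 999.982, keeps the conversion valid for models whose statistics were accumulated differently.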

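    A few words on the LSTM itself. In Caffe, time_step is set to 80 and the hidden size to 100; the feature map going into the LSTM has shape 80,1,512,8.
    From the layer's weights I could see that the LSTM has three parameter blobs, of shapes [400,4096], [400] and [400,100].
    The source shows that the only parameterized parts of the LSTM are two fully connected (inner product) layers. [400,4096] and [400] are the weight and bias of the inner product applied to the input. 400 is 100*4; the 4 comes from the LSTM equations: in short, x and h each go through four sets of multiplications, one per gate.
    [400,100] is the weight of the inner product applied to the hidden state h.
    The PyTorch documentation describes the LSTM and its parameters here:
    https://pytorch.org/docs/1.0.1/nn.html?highlight=lstm#torch.nn.LSTM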



    Based on those parameters, I wrote a standalone LSTM test to take a look:

    import  torch
    import torch.nn as nn
    
    
    
    # rnn = nn.LSTM(512*8, 100, 1, False)
    # input = torch.randn(80, 1, 512*8)
    #
    # output, (hn, cn) = rnn(input)
    #
    #
    # for name,parameters in rnn.named_parameters():
    #   print(name,':',parameters.size())
    #   # parm[name]=parameters.detach().numpy()
    #
    # aa = 0
    
    
    rnn = nn.LSTM(512*8, 100, 1, bidirectional=True)
    input = torch.randn(80, 1, 512*8)
    
    output, (hn, cn) = rnn(input)
    print("out shape=",output.shape)
    
    for name,parameters in rnn.named_parameters():
      print(name,':',parameters.size())
      # parm[name]=parameters.detach().numpy()
    
    aa = 0
    

    The output is:

    ('out shape=', (80, 1, 200))
    ('weight_ih_l0', ':', (400, 4096))
    ('weight_hh_l0', ':', (400, 100))
    ('bias_ih_l0', ':', (400,))
    ('bias_hh_l0', ':', (400,))
    ('weight_ih_l0_reverse', ':', (400, 4096))
    ('weight_hh_l0_reverse', ':', (400, 100))
    ('bias_ih_l0_reverse', ':', (400,))
    ('bias_hh_l0_reverse', ':', (400,))
    
    

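    The parameters PyTorch's LSTM needs are essentially the same as Caffe's. The difference is that a Caffe LSTM has three parameter blobs while a PyTorch LSTM has four per direction. That is evidently because the inner product on the hidden state in Caffe has no bias, so one of the PyTorch biases can simply be set to 0.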
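
    The bookkeeping behind this comparison can be sketched as follows. caffe_lstm_blob_shapes and pytorch_lstm_param_shapes are hypothetical helpers; the shapes are the ones listed in this post for a single-layer LSTM.

```python
def caffe_lstm_blob_shapes(input_size, hidden_size):
    """Shapes of the three blobs of one Caffe LSTM layer (one direction)."""
    return [
        (4 * hidden_size, input_size),   # input weights
        (4 * hidden_size,),              # input bias
        (4 * hidden_size, hidden_size),  # hidden weights (no hidden bias)
    ]


def pytorch_lstm_param_shapes(input_size, hidden_size, bidirectional):
    """Parameter names and shapes of a single-layer torch.nn.LSTM."""
    shapes = {}
    suffixes = ["", "_reverse"] if bidirectional else [""]
    for suffix in suffixes:
        shapes["weight_ih_l0" + suffix] = (4 * hidden_size, input_size)
        shapes["weight_hh_l0" + suffix] = (4 * hidden_size, hidden_size)
        shapes["bias_ih_l0" + suffix] = (4 * hidden_size,)
        shapes["bias_hh_l0" + suffix] = (4 * hidden_size,)
    return shapes


input_size, hidden_size = 512 * 8, 100
caffe_shapes = caffe_lstm_blob_shapes(input_size, hidden_size)
torch_shapes = pytorch_lstm_param_shapes(input_size, hidden_size, bidirectional=True)

for name in sorted(torch_shapes):
    print(name, torch_shapes[name])
print("caffe blobs per direction:", caffe_shapes)
```

    Two Caffe layers (one per direction) supply six blobs, while the bidirectional PyTorch layer expects eight tensors; the two that remain are the bias_hh entries, which are filled with zeros.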

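    Things did not go smoothly, though. The code shown above is the version that works, but before getting there I had loaded all the parameters and the precision was still wrong. Reading the LSTM source more carefully, I found the order in which Caffe computes the gates:
    lstm_unit_layer.cpp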

    template <typename Dtype>
    void LSTMUnitLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      const int num = bottom[0]->shape(1);//1
      const int x_dim = hidden_dim_ * 4;
      const Dtype* C_prev = bottom[0]->cpu_data();
      const Dtype* X = bottom[1]->cpu_data();
      const Dtype* cont = bottom[2]->cpu_data();
      Dtype* C = top[0]->mutable_cpu_data();
      Dtype* H = top[1]->mutable_cpu_data();
      for (int n = 0; n < num; ++n) { //1
        for (int d = 0; d < hidden_dim_; ++d) {//100
          const Dtype i = sigmoid(X[d]);
          const Dtype f = (*cont == 0) ? 0 :
              (*cont * sigmoid(X[1 * hidden_dim_ + d]));
          const Dtype o = sigmoid(X[2 * hidden_dim_ + d]);
          const Dtype g = tanh(X[3 * hidden_dim_ + d]);
          const Dtype c_prev = C_prev[d];
          const Dtype c = f * c_prev + i * g;
          C[d] = c;
          const Dtype tanh_c = tanh(c);
          H[d] = o * tanh_c;
        }
        C_prev += hidden_dim_;
        X += x_dim;
        C += hidden_dim_;
        H += hidden_dim_;
        ++cont;
      }
    }
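
    For reference, the arithmetic of one pass through the inner loop can be mirrored in plain Python. This is a sketch for a single sample and time step, not a port of the layer: caffe_lstm_unit is a hypothetical name, the numbers are made up, and the gate pre-activations x are assumed to be already computed by the two inner products.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def caffe_lstm_unit(x, c_prev, cont, hidden_dim):
    """One LSTM step with gate pre-activations laid out as [i | f | o | g].

    x      : list of 4 * hidden_dim pre-activations
    c_prev : list of hidden_dim previous cell values
    cont   : sequence-continuation flag (0 at the first time step, else 1)
    """
    c_out, h_out = [], []
    for d in range(hidden_dim):
        i = sigmoid(x[d])
        f = 0.0 if cont == 0 else cont * sigmoid(x[1 * hidden_dim + d])
        o = sigmoid(x[2 * hidden_dim + d])
        g = math.tanh(x[3 * hidden_dim + d])
        c = f * c_prev[d] + i * g
        c_out.append(c)
        h_out.append(o * math.tanh(c))
    return c_out, h_out


hidden_dim = 2
x = [0.0, 1.0,    # i
     0.5, -0.5,   # f
     2.0, 0.0,    # o
     1.0, -1.0]   # g
c_prev = [0.3, -0.2]

c_first, h_first = caffe_lstm_unit(x, c_prev, cont=0, hidden_dim=hidden_dim)
c_next, h_next = caffe_lstm_unit(x, c_prev, cont=1, hidden_dim=hidden_dim)
print(c_first, h_first)
print(c_next, h_next)
```

    The index arithmetic makes the block layout explicit: offset 2 * hidden_dim is read as the output gate and offset 3 * hidden_dim as the candidate g.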
    

    So Caffe computes the gates in the order i, f, o, g.
    The PyTorch documentation gives the weight order as:

    weight_ih_l[k] – the learnable input-hidden weights of the k-th layer (W_ii|W_if|W_ig|W_io), of shape (4*hidden_size x input_size)
    weight_hh_l[k] – the learnable hidden-hidden weights of the k-th layer (W_hi|W_hf|W_hg|W_ho), of shape (4*hidden_size x hidden_size)
    bias_ih_l[k] – the learnable input-hidden bias of the k-th layer (b_ii|b_if|b_ig|b_io), of shape (4*hidden_size)
    bias_hh_l[k] – the learnable hidden-hidden bias of the k-th layer (b_hi|b_hf|b_hg|b_ho), of shape (4*hidden_size)
    

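    The order is slightly different, so all I needed to do was rearrange the Caffe weights into PyTorch's order and try again. That is where this part of the code above comes from: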

     ## caffe i f o g
        ## pytorch i f g o
    
        ww = from_numpy(weight_numpy_dict['lstm1x_r2'][0])  # [400,4096]
        ww_200_if = ww[:200,:] #[200,4096]
        ww_100_o = ww[200:300,:] #[100,4096]
        ww_100_g = ww[300:400,:]#[100,4096]
        ww_cat_ifgo = torch.cat((ww_200_if,ww_100_g,ww_100_o),0)
        state_dict['lstm_lr.weight_ih_l0'] = ww_cat_ifgo
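
    The same slicing can be written once as a helper that works for any gate order. This is a sketch on plain Python lists; reorder_gate_rows is a hypothetical name, and with numpy arrays or torch tensors the row blocks would be concatenated instead of list-joined.

```python
def reorder_gate_rows(rows, hidden_size, src="ifog", dst="ifgo"):
    """Reorder the four gate blocks of an LSTM parameter.

    rows holds 4 * hidden_size entries (weight rows or bias values), stored
    as four consecutive blocks in the gate order given by src.
    """
    if len(rows) != 4 * hidden_size:
        raise ValueError("expected %d rows, got %d" % (4 * hidden_size, len(rows)))
    if sorted(src) != sorted(dst) or len(set(src)) != 4:
        raise ValueError("src and dst must be permutations of the same 4 gates")
    blocks = {}
    for k, gate in enumerate(src):
        blocks[gate] = rows[k * hidden_size:(k + 1) * hidden_size]
    out = []
    for gate in dst:
        out.extend(blocks[gate])
    return out


# Tiny example: hidden_size = 2, each "row" is tagged with its gate name.
caffe_rows = ["i0", "i1", "f0", "f1", "o0", "o1", "g0", "g1"]
torch_rows = reorder_gate_rows(caffe_rows, hidden_size=2)
print(torch_rows)
```

    All three Caffe blobs of each direction (input weights, input bias, hidden weights) go through the same reordering.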
    

    With that change it worked, and the precision matches!! The code for checking precision is here:
    Verifying precision across frameworks https://www.cnblogs.com/yanghailin/p/15593614.html
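
    A comparison of two dumps can be as small as the sketch below. It assumes both files hold one value per line with no shape line, as written by save_data_to_txt with b_save_shape = false on the Caffe side and by save_tensor on the PyTorch side. load_values and compare_dumps are hypothetical names, the demo values are made up, and this is not the script from the linked post.

```python
import os
import tempfile


def load_values(path):
    """Read one float per line, skipping blank lines."""
    with open(path) as f:
        return [float(line) for line in f if line.strip()]


def compare_dumps(path_a, path_b):
    """Return (max absolute difference, index where it occurs)."""
    a, b = load_values(path_a), load_values(path_b)
    if len(a) != len(b):
        raise ValueError("length mismatch: %d vs %d" % (len(a), len(b)))
    max_diff, max_idx = 0.0, -1
    for idx, (va, vb) in enumerate(zip(a, b)):
        diff = abs(va - vb)
        if diff > max_diff:
            max_diff, max_idx = diff, idx
    return max_diff, max_idx


# Self-contained demo with two small made-up dumps.
tmp_dir = tempfile.mkdtemp()
path_caffe = os.path.join(tmp_dir, "layer_caffe.txt")
path_torch = os.path.join(tmp_dir, "layer_torch.txt")
with open(path_caffe, "w") as f:
    f.write("0.5\n-1.25\n3.0\n")
with open(path_torch, "w") as f:
    f.write("5.000000000000e-01\n-1.250000000000e+00\n3.125000000000e+00\n")

max_diff, max_idx = compare_dumps(path_caffe, path_torch)
print(max_diff, max_idx)
```

    Values are compared in flat order, so both dumps must come from tensors with the same memory layout.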
    Here is the code I used to produce the final result:

    # -*- coding: utf-8
    import torch
    from torch import nn
    import torch.nn.functional as F
    
    import cv2
    import numpy as np
    import os
    
    from chn_tab import chn_tab
    
    
    
    class lstm_general(nn.Module):  # SfSNet = PS-Net in SfSNet_deploy.prototxt
        def __init__(self):
            super(lstm_general, self).__init__()
            # self.conv1_1 = nn.Conv2d(3, 64, 3, 1, 1)
            self.data_bn = nn.BatchNorm2d(3)
            self.conv1 = nn.Conv2d(3, 64, 7, 2, 3)
            self.conv1_bn = nn.BatchNorm2d(64)
    
            self.conv1_pool = nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True)
    
            self.layer_64_1_conv1 = nn.Conv2d(64, 64, 1, 1, 0, bias = False)
            self.layer_64_1_bn2 = nn.BatchNorm2d(64)
    
            self.layer_64_1_conv2 = nn.Conv2d(64, 64, 3, 1, 1, bias=False)
            self.layer_64_1_bn3 = nn.BatchNorm2d(64)
    
            self.layer_64_1_conv3 = nn.Conv2d(64, 256, 1, 1, 0, bias=False)
            self.layer_64_1_conv_expand = nn.Conv2d(64, 256, 1, 1, 0, bias=False)
    
            self.layer_128_1_bn1 = nn.BatchNorm2d(256)
    
            self.layer_128_1_conv1 = nn.Conv2d(256, 128, 1, 1, 0, bias=False)
            self.layer_128_1_bn2 = nn.BatchNorm2d(128)
    
            self.layer_128_1_conv2 = nn.Conv2d(128, 128, 3, 1, 1, bias=False)
            self.layer_128_1_bn3 = nn.BatchNorm2d(128)
    
            self.layer_128_1_conv3 = nn.Conv2d(128, 512, 1, 1, 0, bias=False)
            self.layer_128_1_conv_expand = nn.Conv2d(256, 512, 1, 1, 0, bias=False)
    
            self.last_bn = nn.BatchNorm2d(512)
    
    
    
    
    
    
            # self.lstm_1 = nn.LSTM(512 * 8, 100, 1, bidirectional=False)
            self.lstm_lr = nn.LSTM(512 * 8, 100, 1, bidirectional=True)
    
    
    
            self.fc1x1_r2_v2_a = nn.Linear(200,7118)
    
    
        def forward(self, inputs):
            # x = F.relu(self.bn1_1(self.conv1_1(inputs)))
            x = self.data_bn(inputs)
            x = F.relu(self.conv1_bn(self.conv1(x)))
            x = self.conv1_pool(x) #[1,64,8,80]
    
            x = F.relu(self.layer_64_1_bn2(self.layer_64_1_conv1(x)))  # 1 64 8 80
            layer_64_1_conv1 = x
    
            x = F.relu(self.layer_64_1_bn3(self.layer_64_1_conv2(x)))
    
            x = self.layer_64_1_conv3(x)
    
            layer_64_1_conv_expand = self.layer_64_1_conv_expand(layer_64_1_conv1)
            layer_64_3_sum = x + layer_64_1_conv_expand  #1 256 8 80
    
            x = F.relu(self.layer_128_1_bn1(layer_64_3_sum))
            layer_128_1_bn1 = x
    
            x = F.relu(self.layer_128_1_bn2(self.layer_128_1_conv1(x)))
            x = F.relu(self.layer_128_1_bn3(self.layer_128_1_conv2(x)))
            x = self.layer_128_1_conv3(x) #1, 512, 8, 80
            layer_128_1_conv_expand = self.layer_128_1_conv_expand(layer_128_1_bn1)  #1, 512, 8, 80
            layer_128_4_sum = x + layer_128_1_conv_expand
    
            x = F.relu(self.last_bn(layer_128_4_sum))###acc ok
    
            x = F.dropout(x, p=0.7, training=False) #1 512 8 80
            x = x.permute(3,0,1,2) # 80 1 512 8
            x = x.reshape(80,1,512*8)###acc ok
    
    
            #
            # merge_lstm_rlstmx, (hn, cn) = self.lstm_r(x)
    
            lstm_out,(_,_) = self.lstm_lr(x) #(80,1,200)
    
            # To compare only the LSTM output against Caffe, return lstm_out here instead.
            out = self.fc1x1_r2_v2_a(lstm_out) #(80,1,7118)
    
            return out
    
    
    def LstmImgStandardization(img, ratio=10.0, stand_w=320, stand_h=32):
        img_h, img_w, _ = img.shape
        if img_h < 2 or img_w < 2:
            return
        # if 32 == img_h and 320 == img_w:
        #     return img
    
        ratio_now = img_w * 1.0 / img_h
        if ratio_now <= ratio:
            mask = np.ones((img_h, int(img_h * ratio), 3), dtype=np.uint8) * 255
            mask[0:img_h,0:img_w,:] = img
        else:
            mask = np.ones((int(img_w*1.0/ratio), img_w, 3), dtype=np.uint8) * 255
            mask[0:img_h, 0:img_w, :] = img
    
        mask_stand = cv2.resize(mask,(stand_w, stand_h),interpolation=cv2.INTER_LINEAR)
    
        # access_pixels(mask_stand)
        return mask_stand
    
    
    
    
    if __name__ == '__main__':
        path_model = "/data_1/everyday/1118/pytorch_lstm_test/lstm_model.pth"
        path_img = "/data_2/project_202009/chejian/test_data/model_test/rec_general/1.jpg"
        blank_label = 7117
        prev_label = blank_label
    
    
        device = torch.device('cuda:0') if torch.cuda.is_available() else torch.device('cpu')
    
        img = cv2.imread(path_img)
        img_stand = LstmImgStandardization(img, ratio=10.0, stand_w=320, stand_h=32)
        img_stand = img_stand.astype(np.float32)
        img_stand = img_stand.transpose([2, 0, 1])
        img_stand = img_stand[None, :, :, :]
        img_stand = torch.from_numpy(img_stand)
        img_stand = img_stand.type(torch.FloatTensor)
        img_stand = img_stand.to(device)
    
        net = lstm_general()
        # map_location lets the checkpoint load on CPU-only machines as well
        checkpoint = torch.load(path_model, map_location=device)
        net.load_state_dict(checkpoint)
        net.to(device)
        net.eval()
    
        # traced_script_module = torch.jit.trace(net, img_stand)
        # traced_script_module.save("./lstm.pt")
    
        preds = net(img_stand)
        # print("out shape=", preds.shape)
    
        preds_1 = preds.squeeze()
        # print("preds_1 out shape=", preds_1.shape)
        val, pos = torch.max(preds_1,1)
        pos = pos.cpu().numpy()
    
    
        rec = ""
        for predict_label in pos:
            if predict_label != blank_label and predict_label != prev_label:
                # print("predict_label=",predict_label)
                print(chn_tab[predict_label])
                rec += chn_tab[predict_label]
            prev_label = predict_label
    
    
        # print("rec=",rec)
        print(rec)
    

    It worked, but the joy lasted only a day.
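
    The loop at the end of the script above is what is usually called greedy CTC decoding: take the argmax label of every time step, collapse consecutive repeats, and drop the blank label. A minimal standalone sketch of that logic follows; `ctc_greedy_decode` and the three-entry toy table are hypothetical names used only for illustration (the real script uses `chn_tab` with blank label 7117).

```python
def ctc_greedy_decode(labels, table, blank_label):
    """Collapse consecutive repeats and drop blanks, like the loop in the script."""
    prev_label = blank_label
    chars = []
    for label in labels:
        # emit a character only when it is not blank and differs from the previous step
        if label != blank_label and label != prev_label:
            chars.append(table[label])
        prev_label = label
    return "".join(chars)


if __name__ == "__main__":
    toy_table = ["a", "b", "c"]  # toy stand-in for chn_tab (assumption)
    blank = 3                    # blank is the last index, like 7117 in the post
    # per-step argmax labels: a a _ a b b _ c
    print(ctc_greedy_decode([0, 0, 3, 0, 1, 1, 3, 2], toy_table, blank))
```

    Note that a blank between two identical labels keeps both characters, which is why the blank class is needed to recognize doubled characters.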

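    My real goal was to run this from C++, so the next step was converting to libtorch. I expected that to be trivial, but it was not.
    In my libtorch code the outputs stopped matching right after the LSTM layer; everything before that layer still matched. I could not find a fix.
    It may be version-related, because I had previously converted a CRNN successfully with a newer libtorch:
    https://github.com/wuzuowuyou/crnn_libtorch
    That one used pytorch 1.7, whereas this time I am on 1.0. After a lot of experimenting the output was still wrong, and I had no idea where to start looking. I went through the issues on the pytorch GitHub and found nobody reporting the same problem. The only option left would be digging through the pytorch source, which is too hard.
    So I opened an issue on the pytorch GitHub:
    https://github.com/pytorch/pytorch/issues/68864
    I expect it to sink without a trace.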
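
    One way to narrow down a mismatch like this is to dump the same tensor from both runtimes as one value per line (the format `save_tensor_txt` in the code below writes) and diff the two files. The following is only a sketch under assumptions: `load_dump`, `compare_dumps` and the tolerance of 1e-4 are made up for illustration and are not part of the original project.

```python
import math


def load_dump(path):
    """Read a dump with one float per line, skipping empty lines."""
    with open(path, "r") as f:
        return [float(line) for line in f if line.strip()]


def compare_dumps(values_a, values_b, tol=1e-4):
    """Return (max_abs_diff, index of the first value differing by more than tol, or None)."""
    if len(values_a) != len(values_b):
        raise ValueError("length mismatch: %d vs %d" % (len(values_a), len(values_b)))
    max_diff = 0.0
    first_bad = None
    for i, (a, b) in enumerate(zip(values_a, values_b)):
        diff = abs(a - b)
        if math.isnan(diff):
            diff = float("inf")  # treat NaN as a mismatch
        if diff > tol and first_bad is None:
            first_bad = i
        max_diff = max(max_diff, diff)
    return max_diff, first_bad
```

    Running this on the dumps taken before and after the LSTM layer shows at which layer the two runtimes start to disagree. Keep in mind that `outfile<<val` writes about six significant digits by default, so smaller differences are not visible in such dumps.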

    Below is my messy, unfinished code:

    #include <torch/script.h> // One-stop header.
    #include "torch/torch.h"
    #include "torch/jit.h"
    #include <memory>
    #include "opencv2/opencv.hpp"
    #include <queue>
    
    #include <dirent.h>
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <vector>
    #include <cstdlib>
    #include <cstring>
    
    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    using namespace cv;
    using namespace std;
    
    // cv::Mat m_stand;
    
    #define TABLE_SIZE 7117
    static string chn_tab[TABLE_SIZE+1] = {"啊","阿","埃",
                                            // ... (the remaining table entries are omitted here) ...
                                           "0","1","2","3","4","5","6","7","8","9",
                                           ":",";","<","=",">","?","@",
                                           "A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z",
                                           "[","\\","]","^","_","`",
                                           "a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z",
                                           "{","|","}","~",
                                           " "};
    
    bool LstmImgStandardization_src_1(const cv::Mat &src, const float &ratio, int standard_w, int standard_h, cv::Mat &dst)
    {
        if(src.empty())return false;
        float width=src.cols;
        float height=src.rows;
        float  a=width/ height;
    
        if(a <=ratio)
        {
            Mat mask(height, ratio*height, CV_8UC3, cv::Scalar(255, 255, 255));
            Mat imageROI = mask(Rect(0, 0, width, height));
            src.copyTo(imageROI);
            dst=mask.clone();
        }
        else
        {
            Mat mask(width/ratio, width, CV_8UC3, cv::Scalar(255, 255, 255));
            Mat imageROI = mask(Rect(0, 0, width, height));
            src.copyTo(imageROI);
            dst=mask.clone();
        }
    
        //cv::resize(dst, dst, cv::Size(standard_w,standard_h));
        cv::resize(dst, dst, cv::Size(standard_w,standard_h),0,0,cv::INTER_AREA);
        return true;
    }
    
    bool lstm_img_standardization(cv::Mat src, cv::Mat &dst,float ratio)
    {
        if(src.empty())return false;
        double width=src.cols;
        double height=src.rows;
        double a=width/height;
    
        if(a <=ratio)//6
        {
            Mat mask(height, ratio*height, CV_8UC3, Scalar(255, 255, 255));
            Mat imageROI = mask(Rect(0, 0, width, height));
            src.copyTo(imageROI);
            dst=mask.clone();
        }
        else
        {
            Mat mask(width/ratio, width, CV_8UC3, Scalar(255, 255, 255));
            Mat imageROI = mask(Rect(0, 0, width, height));
            src.copyTo(imageROI);
            dst=mask.clone();
        }
    
    //    cv::resize(dst, dst, cv::Size(360,60));
        cv::resize(dst, dst, cv::Size(320,32));
    
        return true;
    }
    
    //torch::Tensor pre_img(cv::Mat &img)
    //{
    //    cv::Mat m_stand;
    //    float ratio = 10.0;
    //    if(1 == img.channels()) { cv::cvtColor(img,img,CV_GRAY2BGR); }
    //    lstm_img_standardization(img, m_stand, ratio);
    //
    //    std::vector<int64_t> sizes = {m_stand.rows, m_stand.cols, m_stand.channels()};
    //    torch::TensorOptions options = torch::TensorOptions().dtype(torch::kByte);
    //    torch::Tensor tensor_image = torch::from_blob(m_stand.data, torch::IntList(sizes), options);
    //    // Permute tensor, shape is (C, H, W)
    //    tensor_image = tensor_image.permute({2, 0, 1});
    //
    //
    //    // Convert tensor dtype to float32, and range from [0, 255] to [0, 1]
    //    tensor_image = tensor_image.toType(torch::ScalarType::Float);
    //
    //
    ////    tensor_image = tensor_image.div_(255.0);
    ////    // Subtract mean value
    ////    for (int i = 0; i < std::min<int64_t>(v_mean.size(), tensor_image.size(0)); i++) {
    ////        tensor_image[i] = tensor_image[i].sub_(v_mean[i]);
    ////    }
    ////    // Divide by std value
    ////    for (int i = 0; i < std::min<int64_t>(v_std.size(), tensor_image.size(0)); i++) {
    ////        tensor_image[i] = tensor_image[i].div_(v_std[i]);
    ////    }
    //    //[c,h,w]  -->  [1,c,h,w]
    //    tensor_image.unsqueeze_(0);
    //    std::cout<<tensor_image;
    //    return tensor_image;
    //}
    
    
    
    bool pre_img(cv::Mat &img, torch::Tensor &input_tensor)
    {
        static cv::Mat m_stand;
        float ratio = 10.0;
    //    if(1 == img.channels()) { cv::cvtColor(img,img,CV_GRAY2BGR); }
        lstm_img_standardization(img, m_stand, ratio);
        m_stand.convertTo(m_stand, CV_32FC3);
    
    
    //    imshow("m_stand",m_stand);
    //    waitKey(0);
    
    //    Mat m_stand_new;
    //        m_stand.convertTo(m_stand_new, CV_32FC3);
    
    //        int rowNumber = m_stand_new.rows;  // number of rows
    //        int colNumber = m_stand_new.cols*m_stand_new.channels();  // columns x channels = elements per row
    //        std::ofstream out_file("/data_1/everyday/1123/img_acc/after_CV_32FC3-float-111.txt");
    //        // nested loop over all pixel values
    //        for (int i = 0; i < rowNumber; i++)  // loop over rows
    //        {
    //            uchar *data = m_stand_new.ptr<uchar>(i);  // pointer to the start of row i
    //            for (int j = 0; j < colNumber; j++)   // loop over columns
    //            {
    //                // --------- process each pixel -------------
    //                int pix = int(data[j]);
    //                out_file << pix << std::endl;
    //            }
    //        }
    //
    //        out_file.close();
    //        std::cout<<"==m_stand.convertTo(m_stand, CV_32FC3);=="<<std::endl;
    //        while(1);
    
    
    
    
        int stand_row = m_stand.rows;
        int stand_cols = m_stand.cols;
    
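        // from_blob does not copy the data; clone() makes the tensor own its memory
        // instead of aliasing the buffer of m_stand.
        input_tensor = torch::from_blob(
                m_stand.data, {stand_row, stand_cols, 3}).clone();
        input_tensor = input_tensor.permute({2,0,1});
        input_tensor = input_tensor.unsqueeze(0);//.to(torch::kFloat);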
    
    //    std::cout<<input_tensor;
        return true;
    }
    
    
    
    void GetFileInDir(string dirName, vector<string> &v_path)
    {
        DIR* Dir = NULL;
        struct dirent* file = NULL;
        if (dirName[dirName.size()-1] != '/')
        {
            dirName += "/";
        }
        if ((Dir = opendir(dirName.c_str())) == NULL)
        {
            cerr << "Can't open Directory" << endl;
            exit(1);
        }
        while ((file = readdir(Dir)) != NULL)
        {
            //if the file is a normal file
            if (file->d_type == DT_REG)
            {
                v_path.push_back(dirName + file->d_name);
            }
                //if the file is a directory
            else if (file->d_type == DT_DIR && strcmp(file->d_name, ".") != 0 && strcmp(file->d_name, "..") != 0)
            {
                GetFileInDir(dirName + file->d_name,v_path);
            }
        }
        closedir(Dir);  // release the directory handle
    }
    
    string str_replace(const string &str,const string &str_find,const string &str_replacee)
    {
        string str_tmp=str;
        size_t pos = str_tmp.find(str_find);
        while (pos != string::npos)
        {
            str_tmp.replace(pos, str_find.length(), str_replacee);
    
            size_t pos_t=pos+str_replacee.length();
            string str_sub=str_tmp.substr(pos_t,str_tmp.length()-pos_t);
    
            size_t pos_tt=str_sub.find(str_find);
            if(string::npos != pos_tt)
            {
                pos =pos_t + str_sub.find(str_find);
            }else
            {
                pos=string::npos;
            }
        }
        return str_tmp;
    }
    
    string get_ans(const string path)
    {
        int pos_1 = path.find_last_of("_");
        int pos_2 = path.find_last_of(".");
        string ans = path.substr(pos_1+1,pos_2-pos_1-1);
        ans = str_replace(ans,"@","/");
        return ans;
    }
    
    // Write a float tensor to a text file, one value per line.
    bool save_tensor_txt(torch::Tensor tensor_in_,string path_txt)
    {
        ofstream outfile(path_txt);
        if(!outfile.is_open())return false;
        // Copy to CPU and flatten; contiguous() keeps view() valid for permuted tensors.
        torch::Tensor tensor_in = tensor_in_.to(torch::kCPU).contiguous().view({-1,1});
    
        auto result_data = tensor_in.accessor<float, 2>();
    
        for(int i=0;i<result_data.size(0);i++)
        {
            float val = result_data[i][0];
    //        std::cout<<"val="<<val<<std::endl;
            outfile<<val<<std::endl;
    
        }
    
        return true;
    }
    
    
    
    int main()
    {
        std::string path_pt = "/data_1/everyday/1118/pytorch_lstm_test/lstmunidirectional20211124.pt";//"/data_1/everyday/1118/pytorch_lstm_test/lstm20211124.pt";//"/data_1/everyday/1118/pytorch_lstm_test/lstm10000.pt";//"/data_1/everyday/1118/pytorch_lstm_test/lstm.pt";
        std::string path_img_dir = "/data_1/2020biaozhushuju/2021_rec/general/test";//"/data_1/everyday/1118/pytorch_lstm_test/test_data";
        int blank_label = 7117;
    
    
        std::ifstream list("/data_1/everyday/1123/list.txt");
    
        int standard_w = 320;
        int standard_h = 32;
    
    //    vector<string> v_path;
    //    GetFileInDir(path_img_dir, v_path);
    //    for(int i=0;i<v_path.size();i++)
    //    {
    //        std::cout<<i<<"  "<<v_path[i]<<std::endl;
    //    }
    
    
        torch::Device m_device(torch::kCUDA);
    //    torch::Device m_device(torch::kCPU);
        std::shared_ptr<torch::jit::script::Module> m_model = torch::jit::load(path_pt);
    
        torch::NoGradGuard no_grad;
    
        m_model->to(m_device);
        std::cout<<"success load model"<<std::endl;
    
        int cnt_all = 0;
        int cnt_right = 0;
        double start = getTickCount();
        string file;
        while(list >> file)
        {
            // Debug: ignore the entry read from the list and always test this one image.
            file = "/data_1/everyday/1123/img/bxd_39_发动机号码.jpg";
            cout<<cnt_all++<<" :: "<<file<<endl;
            string jpg=".jpg";
            string::size_type idx = file.find( jpg );
            if ( idx == string::npos )
                continue;
    
            int pos_1 = file.find_last_of("_");
            int pos_2 = file.find_last_of(".");
            string answer = file.substr(pos_1+1,pos_2-pos_1-1);
    
            cv::Mat img = cv::imread(file);
    //        int rowNumber = img.rows;  // number of rows
    //        int colNumber = img.cols*img.channels();  // columns x channels = elements per row
    //        std::ofstream out_file("/data_1/everyday/1123/img_acc/libtorch_img.txt");
    //        // nested loop over all pixel values
    //        for (int i = 0; i < rowNumber; i++)  // loop over rows
    //        {
    //            uchar *data = img.ptr<uchar>(i);  // pointer to the start of row i
    //            for (int j = 0; j < colNumber; j++)   // loop over columns
    //            {
    //                // --------- process each pixel -------------
    //                int pix = int(data[j]);
    //                out_file << pix << std::endl;
    //            }
    //        }
    //
    //        out_file.close();
    //        while(1);
    
    
    
    
            torch::Tensor tensor_input;
            pre_img(img, tensor_input);
            tensor_input = tensor_input.to(m_device);
            tensor_input.print();
    
            std::cout<<tensor_input[0][2][12][25]<<std::endl;
            std::cout<<tensor_input[0][1][15][100]<<std::endl;
            std::cout<<tensor_input[0][0][16][132]<<std::endl;
            std::cout<<tensor_input[0][1][17][156]<<std::endl;
            std::cout<<tensor_input[0][2][5][256]<<std::endl;
            std::cout<<tensor_input[0][0][14][205]<<std::endl;
    
            save_tensor_txt(tensor_input, "/data_1/everyday/1124/acc/libtorch_input-100.txt");
    
            torch::Tensor output = m_model->forward({tensor_input}).toTensor();
            output.print();
    //        output = output.squeeze();//80,7118
    //        output.print();
    
            save_tensor_txt(output, "/data_1/everyday/1124/acc/libtorch-out-100.txt");
    ////        std::cout<<output<<std::endl;
            return 0; // debug: stop here after dumping the tensors for comparison
    //
            torch::Tensor index = torch::argmax(output,1).cpu();//.to(torch::kInt);
            index.print();
    //        std::cout<<index<<std::endl;
    //        while(1);
    
    
            int prev_label = blank_label;
            string result;
            // argmax returns a kLong tensor, whose element type is int64_t
            auto result_data = index.accessor<int64_t, 1>();
            for(int i=0;i<result_data.size(0);i++)
            {
    //            std::cout<<result_data[i]<<std::endl;
                int predict_label = result_data[i];
                if (predict_label != blank_label && predict_label != prev_label )
                {
                    {
                        result = result + chn_tab[predict_label];
                    }
                }
                prev_label = predict_label;
            }
    
            cout << "answer: " << answer << endl;
            cout << "result : " << result << endl;
    
            imshow("src",img);
            waitKey(0);
    
    
    //        while(1);
    
    
        }
    
    
    //    for(int i=0;i<v_path.size();i++)
    //    {
    //        cnt_all += 1;
    //        std::string path_img = v_path[i];
    //        string ans = get_ans(path_img);
    //        std::cout<<i<<"  path="<<path_img<<"    ans="<<ans<<std::endl;
    //        cv::Mat img = cv::imread(path_img);
    
    
    
    //        torch::Tensor input = pre_img(img, v_mean, v_std, standard_w, standard_h);
    //        input = input.to(m_device);
    //        torch::Tensor output = m_module.forward({input}).toTensor();
    //
    //        std::string rec = get_label(output);
    //#if 1   //for show
    //        std::cout<<"rec="<<rec<<std::endl;
    //        std::cout<<"ans="<<ans<<std::endl;
    //        cv::imshow("img",img);
    //        cv::waitKey(0);
    //#endif
    //
    //#if 0   //In order to test the accuracy
    //        std::cout<<"rec="<<rec<<std::endl;
    //        std::cout<<"ans="<<ans<<std::endl;
    //        if(ans == rec)
    //        {
    //            cnt_right += 1;
    //        }
    //        std::cout<<"cnt_right="<<cnt_right<<std::endl;
    //        std::cout<<"cnt_all="<<cnt_all<<std::endl;
    //        std::cout<<"ratio="<<cnt_right * 1.0 / cnt_all<<std::endl;
    //#endif
    //    }
    //    double time_cunsume = ((double)getTickCount() - start) / getTickFrequency();
    //    std::cout<<"ave time="<< time_cunsume * 1.0 / cnt_all * 1000 <<"ms"<<std::endl;
    
        return 0;
    }
    
    

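    ------------------2021-11-25 10:18:54
    This morning I found a reply on GitHub suggesting that I upgrade to the latest version and try again.
    I happened to have a local environment with pytorch 1.1.0, cuda 10.0 and libtorch 1.1.0, so I redid everything there: generate the pth and check that the model output is correct, export the pt, set up the CMakeLists for libtorch, and run. No problem at all!
    So it really is a libtorch 1.0 problem, and I have no fix for that version.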
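
    To confirm the result on more than a single image, the recognized strings can be scored against the ground truth encoded in the file names, following the `get_ans` convention of the C++ code (the text between the last `_` and the last `.`, with `@` standing for `/`). A minimal sketch; `answer_from_path` and `accuracy` are hypothetical helper names, and the sample paths and results are made up.

```python
def answer_from_path(path):
    """Ground truth is the text between the last '_' and the last '.'; '@' encodes '/'."""
    start = path.rfind("_") + 1
    end = path.rfind(".")
    if end < start:
        end = len(path)  # no extension after the last '_'
    return path[start:end].replace("@", "/")


def accuracy(paths, results):
    """Fraction of images whose recognized string equals the file-name answer."""
    if not paths:
        return 0.0
    right = sum(1 for p, r in zip(paths, results) if answer_from_path(p) == r)
    return right / float(len(paths))


if __name__ == "__main__":
    paths = ["img/a_1_ABC123.jpg", "img/b_2_2021@11@25.jpg"]  # made-up examples
    results = ["ABC123", "2021/11/26"]                          # pretend model outputs
    print(accuracy(paths, results))
```
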
    Here is my CMakeLists.txt as well:

    # CMAKE_CXX_STANDARD requires CMake 3.1 or newer
    cmake_minimum_required(VERSION 3.1)
    
    project(libtorch_lstm_1.1.0)
    set(CMAKE_BUILD_TYPE Debug)
    
    #add_definitions(-std=c++11)
    set(CMAKE_CXX_STANDARD 11)
    set(CMAKE_CXX_STANDARD_REQUIRED ON)
    
    option(CUDA_USE_STATIC_CUDA_RUNTIME "Link the CUDA runtime statically" OFF)
    
    # cuda10
    include_directories(${CMAKE_SOURCE_DIR}/3rdparty/cuda/include)
    link_directories(${CMAKE_SOURCE_DIR}/3rdparty/cuda/lib64)
    
    ###libtorch1.1.0
    set(TORCH_ROOT ${CMAKE_SOURCE_DIR}/3rdparty/libtorch)
    set(CMAKE_PREFIX_PATH ${CMAKE_SOURCE_DIR}/3rdparty/libtorch)
    include_directories(${TORCH_ROOT}/include)
    include_directories(${TORCH_ROOT}/include/torch/csrc/api/include)
    link_directories(${TORCH_ROOT}/lib)
    
    #OpenCv3.4.10
    set(OPENCV_ROOT ${CMAKE_SOURCE_DIR}/3rdparty/opencv-3.4.10)
    include_directories(${OPENCV_ROOT}/include)
    link_directories(${OPENCV_ROOT}/lib)
    
    
    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -Wall -Ofast -Wfatal-errors -D_MWAITXINTRIN_H_INCLUDED")
    
    add_executable(libtorch_lstm ${PROJECT_SOURCE_DIR}/lstm.cpp)
    target_link_libraries(libtorch_lstm opencv_calib3d opencv_core opencv_imgproc opencv_highgui opencv_imgcodecs)
    target_link_libraries(libtorch_lstm  torch c10 caffe2)
    target_link_libraries(libtorch_lstm  nvrtc cuda)
    #target_link_libraries(crnn c10 c10_cuda torch torch_cuda torch_cpu "-Wl,--no-as-needed -ltorch_cuda")
    
    add_definitions(-O2 -pthread)

    And the lstm.cpp that runs correctly with libtorch 1.1.0:

    
    
    #include <torch/script.h> // One-stop header.
    #include "torch/torch.h"
    #include "torch/jit.h"
    #include <memory>
    #include "opencv2/opencv.hpp"
    #include <queue>
    
    #include <dirent.h>
    #include <iostream>
    #include <fstream>
    #include <string>
    #include <vector>
    #include <cstdlib>
    #include <cstring>
    
    #include <opencv2/core/core.hpp>
    #include <opencv2/highgui/highgui.hpp>
    #include <opencv2/imgproc/imgproc.hpp>
    using namespace cv;
    using namespace std;
    
    // cv::Mat m_stand;
    
    #define TABLE_SIZE 7117
    static string chn_tab[TABLE_SIZE+1] = {"啊","阿","埃","挨","哎","唉",
                                           // ... (the remaining table entries are omitted here) ...
                                           "∴","♂","♀","°","′","″","℃","$","¤","¢","£","‰","§","№","☆","★",
                                           "○","●","◎","◇","◆","□","■","△","▲","※","→","←","↑","↓","〓",
                                           "⒈","⒉","⒊","⒋","⒌","⒍","⒎","⒏","⒐","⒑","⒒","⒓","⒔","⒕","⒖",
                                           "⒗","⒘","⒙","⒚","⒛","⑴","⑵","⑶","⑷","⑸","⑹","⑺","⑻","⑼","⑽","⑾",
                                           "⑿","⒀","⒁","⒂","⒃","⒄","⒅","⒆","⒇","①","②","③","④","⑤","⑥","⑦",
                                           "⑧","⑨","⑩","㈠","㈡","㈢","㈣","㈤","㈥","㈦","㈧","㈨","㈩",
                                           "Ⅰ","Ⅱ","Ⅲ","Ⅳ","Ⅴ","Ⅵ","Ⅶ","Ⅷ","Ⅸ","Ⅹ","Ⅺ","Ⅻ",
                                           "!",""","#","¥","%","&","'","(",")","*","+",",","-",".","/", //========fullwidth========//
                                           "0","1","2","3","4","5","6","7","8","9",":",";","<","=",">","?",
                                           "@","A","B","C","D","E","F","G","H","I","J","K","L","M","N","O",
                                           "P","Q","R","S","T","U","V","W","X","Y","Z","[","\","]","^","_",
                                           "`","a","b","c","d","e","f","g","h","i","j","k","l","m","n","o",
                                           "p","q","r","s","t","u","v","w","x","y","z","{","|","}"," ̄",
                                           "!","\"","#","$","%","&","'","(",")","*","+",",","-",".","/", //========ascii========//
                                           "0","1","2","3","4","5","6","7","8","9",
                                           ":",";","<","=",">","?","@",
                                           "A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z",
                                           "[","\\","]","^","_","`",
                                           "a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z",
                                           "{","|","}","~",
                                           " "};
    
    bool LstmImgStandardization_src_1(const cv::Mat &src, const float &ratio, int standard_w, int standard_h, cv::Mat &dst)
    {
        if(src.empty())return false;
        float width=src.cols;
        float height=src.rows;
        float  a=width/ height;
    
        if(a <=ratio)
        {
            Mat mask(height, ratio*height, CV_8UC3, cv::Scalar(255, 255, 255));
            Mat imageROI = mask(Rect(0, 0, width, height));
            src.copyTo(imageROI);
            dst=mask.clone();
        }
        else
        {
            Mat mask(width/ratio, width, CV_8UC3, cv::Scalar(255, 255, 255));
            Mat imageROI = mask(Rect(0, 0, width, height));
            src.copyTo(imageROI);
            dst=mask.clone();
        }
    
        //cv::resize(dst, dst, cv::Size(standard_w,standard_h));
        cv::resize(dst, dst, cv::Size(standard_w,standard_h),0,0,cv::INTER_AREA);
        return true;
    }
    
    bool lstm_img_standardization(cv::Mat src, cv::Mat &dst,float ratio)
    {
        if(src.empty())return false;
        double width=src.cols;
        double height=src.rows;
        double a=width/height;
    
        if(a <=ratio)//6
        {
            Mat mask(height, ratio*height, CV_8UC3, Scalar(255, 255, 255));
            Mat imageROI = mask(Rect(0, 0, width, height));
            src.copyTo(imageROI);
            dst=mask.clone();
        }
        else
        {
            Mat mask(width/ratio, width, CV_8UC3, Scalar(255, 255, 255));
            Mat imageROI = mask(Rect(0, 0, width, height));
            src.copyTo(imageROI);
            dst=mask.clone();
        }
    
    //    cv::resize(dst, dst, cv::Size(360,60));
        cv::resize(dst, dst, cv::Size(320,32));
    
        return true;
    }
    
    //torch::Tensor pre_img(cv::Mat &img)
    //{
    //    cv::Mat m_stand;
    //    float ratio = 10.0;
    //    if(1 == img.channels()) { cv::cvtColor(img,img,CV_GRAY2BGR); }
    //    lstm_img_standardization(img, m_stand, ratio);
    //
    //    std::vector<int64_t> sizes = {m_stand.rows, m_stand.cols, m_stand.channels()};
    //    torch::TensorOptions options = torch::TensorOptions().dtype(torch::kByte);
    //    torch::Tensor tensor_image = torch::from_blob(m_stand.data, torch::IntList(sizes), options);
    //    // Permute tensor, shape is (C, H, W)
    //    tensor_image = tensor_image.permute({2, 0, 1});
    //
    //
    //    // Convert tensor dtype to float32, and range from [0, 255] to [0, 1]
    //    tensor_image = tensor_image.toType(torch::ScalarType::Float);
    //
    //
    ////    tensor_image = tensor_image.div_(255.0);
    ////    // Subtract mean value
    ////    for (int i = 0; i < std::min<int64_t>(v_mean.size(), tensor_image.size(0)); i++) {
    ////        tensor_image[i] = tensor_image[i].sub_(v_mean[i]);
    ////    }
    ////    // Divide by std value
    ////    for (int i = 0; i < std::min<int64_t>(v_std.size(), tensor_image.size(0)); i++) {
    ////        tensor_image[i] = tensor_image[i].div_(v_std[i]);
    ////    }
    //    //[c,h,w]  -->  [1,c,h,w]
    //    tensor_image.unsqueeze_(0);
    //    std::cout<<tensor_image;
    //    return tensor_image;
    //}
    
    
    
    bool pre_img(cv::Mat &img, torch::Tensor &input_tensor)
    {
        static cv::Mat m_stand;
        float ratio = 10.0;
    //    if(1 == img.channels()) { cv::cvtColor(img,img,CV_GRAY2BGR); }
        lstm_img_standardization(img, m_stand, ratio);
        m_stand.convertTo(m_stand, CV_32FC3);
    
    
    //    imshow("m_stand",m_stand);
    //    waitKey(0);
    
    //    Mat m_stand_new;
    //        m_stand.convertTo(m_stand_new, CV_32FC3);
    
    //        int rowNumber = m_stand_new.rows;  // number of rows
    //        int colNumber = m_stand_new.cols*m_stand_new.channels();  // columns x channels = elements per row
    //        std::ofstream out_file("/data_1/everyday/1123/img_acc/after_CV_32FC3-float-111.txt");
    //        // nested loop over all pixel values
    //        for (int i = 0; i < rowNumber; i++)  // loop over rows
    //        {
    //            uchar *data = m_stand_new.ptr<uchar>(i);  // pointer to the start of row i
    //            for (int j = 0; j < colNumber; j++)   // loop over columns
    //            {
    //                // --------- process each pixel -------------
    //                int pix = int(data[j]);
    //                out_file << pix << std::endl;
    //            }
    //        }
    //
    //        out_file.close();
    //        std::cout<<"==m_stand.convertTo(m_stand, CV_32FC3);=="<<std::endl;
    //        while(1);
    
    
    
    
        int stand_row = m_stand.rows;
        int stand_cols = m_stand.cols;
    
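        // from_blob does not copy the data; clone() makes the tensor own its memory
        // instead of aliasing the buffer of m_stand.
        input_tensor = torch::from_blob(
                m_stand.data, {stand_row, stand_cols, 3}).clone();
        input_tensor = input_tensor.permute({2,0,1});
        input_tensor = input_tensor.unsqueeze(0);//.to(torch::kFloat);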
    
    //    std::cout<<input_tensor;
        return true;
    }
    
    
    
    void GetFileInDir(string dirName, vector<string> &v_path)
    {
        DIR* Dir = NULL;
        struct dirent* file = NULL;
        if (dirName[dirName.size()-1] != '/')
        {
            dirName += "/";
        }
        if ((Dir = opendir(dirName.c_str())) == NULL)
        {
            cerr << "Can't open Directory" << endl;
            exit(1);
        }
        while ((file = readdir(Dir)) != NULL)
        {
            //if the file is a normal file
            if (file->d_type == DT_REG)
            {
                v_path.push_back(dirName + file->d_name);
            }
                //if the file is a directory
            else if (file->d_type == DT_DIR && strcmp(file->d_name, ".") != 0 && strcmp(file->d_name, "..") != 0)
            {
                GetFileInDir(dirName + file->d_name,v_path);
            }
        }
        closedir(Dir);  // release the directory handle
    }
    
    string str_replace(const string &str,const string &str_find,const string &str_replacee)
    {
        string str_tmp=str;
        size_t pos = str_tmp.find(str_find);
        while (pos != string::npos)
        {
            str_tmp.replace(pos, str_find.length(), str_replacee);
    
            size_t pos_t=pos+str_replacee.length();
            string str_sub=str_tmp.substr(pos_t,str_tmp.length()-pos_t);
    
            size_t pos_tt=str_sub.find(str_find);
            if(string::npos != pos_tt)
            {
                pos =pos_t + str_sub.find(str_find);
            }else
            {
                pos=string::npos;
            }
        }
        return str_tmp;
    }
    
    string get_ans(const string path)
    {
        int pos_1 = path.find_last_of("_");
        int pos_2 = path.find_last_of(".");
        string ans = path.substr(pos_1+1,pos_2-pos_1-1);
        ans = str_replace(ans,"@","/");
        return ans;
    }
    
    // Write a float tensor to a text file, one value per line.
    bool save_tensor_txt(torch::Tensor tensor_in_,string path_txt)
    {
        ofstream outfile(path_txt);
        if(!outfile.is_open())return false;
        // Copy to CPU and flatten; contiguous() keeps view() valid for permuted tensors.
        torch::Tensor tensor_in = tensor_in_.to(torch::kCPU).contiguous().view({-1,1});
    
        auto result_data = tensor_in.accessor<float, 2>();
    
        for(int i=0;i<result_data.size(0);i++)
        {
            float val = result_data[i][0];
    //        std::cout<<"val="<<val<<std::endl;
            outfile<<val<<std::endl;
    
        }
    
        return true;
    }
    
    
    
    int main()
    {
        std::string path_pt = "/data_1/everyday/1125/lstm/lstm20211125.pt";//"/data_1/everyday/1118/pytorch_lstm_test/lstmunidirectional20211124.pt";//"/data_1/everyday/1118/pytorch_lstm_test/lstm20211124.pt";//"/data_1/everyday/1118/pytorch_lstm_test/lstm10000.pt";//"/data_1/everyday/1118/pytorch_lstm_test/lstm.pt";
        std::string path_img_dir = "/data_1/2020biaozhushuju/2021_rec/general/test";//"/data_1/everyday/1118/pytorch_lstm_test/test_data";
        int blank_label = 7117;
    
    
        std::ifstream list("/data_1/everyday/1123/list.txt");
    
        int standard_w = 320;
        int standard_h = 32;
    
    //    vector<string> v_path;
    //    GetFileInDir(path_img_dir, v_path);
    //    for(int i=0;i<v_path.size();i++)
    //    {
    //        std::cout<<i<<"  "<<v_path[i]<<std::endl;
    //    }
    
    
        torch::Device m_device(torch::kCUDA);
    //    torch::Device m_device(torch::kCPU);
        std::shared_ptr<torch::jit::script::Module> m_model = torch::jit::load(path_pt);
    
        torch::NoGradGuard no_grad;
    
        m_model->to(m_device);
        std::cout<<"success load model"<<std::endl;
    
        int cnt_all = 0;
        int cnt_right = 0;
        double start = getTickCount();
        string file;
        while(list >> file)
        {
            // Debug: ignore the entry read from the list and always test this one image.
            file = "/data_2/project_202009/chejian/test_data/model_test/rec_general/1.jpg";
            cout<<cnt_all++<<" :: "<<file<<endl;
            string jpg=".jpg";
            string::size_type idx = file.find( jpg );
            if ( idx == string::npos )
                continue;
    
            int pos_1 = file.find_last_of("_");
            int pos_2 = file.find_last_of(".");
            string answer = file.substr(pos_1+1,pos_2-pos_1-1);
    
            cv::Mat img = cv::imread(file);
    //        int rowNumber = img.rows;  // number of rows
    //        int colNumber = img.cols*img.channels();  // columns x channels = elements per row
    //        std::ofstream out_file("/data_1/everyday/1123/img_acc/libtorch_img.txt");
    //        // nested loop over all pixel values
    //        for (int i = 0; i < rowNumber; i++)  // loop over rows
    //        {
    //            uchar *data = img.ptr<uchar>(i);  // pointer to the start of row i
    //            for (int j = 0; j < colNumber; j++)   // loop over columns
    //            {
    //                // --------- process each pixel -------------
    //                int pix = int(data[j]);
    //                out_file << pix << std::endl;
    //            }
    //        }
    //
    //        out_file.close();
    //        while(1);
    
    
    
    
            torch::Tensor tensor_input;
            pre_img(img, tensor_input);
            tensor_input = tensor_input.to(m_device);
            tensor_input.print();
    
            std::cout<<tensor_input[0][2][12][25]<<std::endl;
            std::cout<<tensor_input[0][1][15][100]<<std::endl;
            std::cout<<tensor_input[0][0][16][132]<<std::endl;
            std::cout<<tensor_input[0][1][17][156]<<std::endl;
            std::cout<<tensor_input[0][2][5][256]<<std::endl;
            std::cout<<tensor_input[0][0][14][205]<<std::endl;
    
            save_tensor_txt(tensor_input, "/data_1/everyday/1124/acc/libtorch_input-100.txt");
    
            torch::Tensor output = m_model->forward({tensor_input}).toTensor();
            output.print();
            output = output.squeeze();//80,7118
            output.print();
    
    //        save_tensor_txt(output, "/data_1/everyday/1124/acc/libtorch-out-100.txt");
    //////        std::cout<<output<<std::endl;
    //        while(1);
    //
            torch::Tensor index = torch::argmax(output,1).cpu();//.to(torch::kInt);
            index.print();
    //        std::cout<<index<<std::endl;
    //        while(1);
    
    
            // Greedy decoding: skip blanks and collapse consecutive repeated labels.
            int prev_label = blank_label;
            string result;
            auto result_data = index.accessor<int64_t, 1>();  // argmax returns int64 indices
            for(int i=0;i<result_data.size(0);i++)
            {
    //            std::cout<<result_data[i]<<std::endl;
                int predict_label = static_cast<int>(result_data[i]);
                if (predict_label != blank_label && predict_label != prev_label)
                {
                    result += chn_tab[predict_label];
                }
                prev_label = predict_label;
            }
    
            cout << "answer: " << answer << endl;
            cout << "result : " << result << endl;
    
            imshow("src",img);
            waitKey(0);
    
    
    //        while(1);
    
    
        }
    
    
    //    for(int i=0;i<v_path.size();i++)
    //    {
    //        cnt_all += 1;
    //        std::string path_img = v_path[i];
    //        string ans = get_ans(path_img);
    //        std::cout<<i<<"  path="<<path_img<<"    ans="<<ans<<std::endl;
    //        cv::Mat img = cv::imread(path_img);
    
    
    
    //        torch::Tensor input = pre_img(img, v_mean, v_std, standard_w, standard_h);
    //        input = input.to(m_device);
    //        torch::Tensor output = m_module.forward({input}).toTensor();
    //
    //        std::string rec = get_label(output);
    //#if 1   //for show
    //        std::cout<<"rec="<<rec<<std::endl;
    //        std::cout<<"ans="<<ans<<std::endl;
    //        cv::imshow("img",img);
    //        cv::waitKey(0);
    //#endif
    //
    //#if 0   //In order to test the accuracy
    //        std::cout<<"rec="<<rec<<std::endl;
    //        std::cout<<"ans="<<ans<<std::endl;
    //        if(ans == rec)
    //        {
    //            cnt_right += 1;
    //        }
    //        std::cout<<"cnt_right="<<cnt_right<<std::endl;
    //        std::cout<<"cnt_all="<<cnt_all<<std::endl;
    //        std::cout<<"ratio="<<cnt_right * 1.0 / cnt_all<<std::endl;
    //#endif
    //    }
    //    double time_consume = ((double)getTickCount() - start) / getTickFrequency();
    //    std::cout<<"ave time="<< time_consume * 1.0 / cnt_all * 1000 <<"ms"<<std::endl;
    
        return 0;
    }
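
    The decoding loop in the code above can be isolated into a small function that is easy to test without libtorch or OpenCV. The following is only a sketch under assumptions: the name `greedy_decode`, the `std::vector` inputs, and the tiny character table in the usage note are hypothetical stand-ins for `index`, `chn_tab`, and `blank_label`, not the original project's interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Greedy decoding over per-timestep argmax labels:
// skip the blank label and collapse consecutive repeats.
// `table` maps a label index to its character (hypothetical stand-in for chn_tab).
std::string greedy_decode(const std::vector<int64_t>& labels,
                          const std::vector<std::string>& table,
                          int64_t blank_label) {
    std::string result;
    int64_t prev_label = blank_label;
    for (int64_t label : labels) {
        // Every label must be a valid index into the character table.
        assert(label >= 0 && static_cast<std::size_t>(label) < table.size());
        if (label != blank_label && label != prev_label) {
            result += table[static_cast<std::size_t>(label)];
        }
        // Always update, so that a blank between two equal labels
        // lets the same character be emitted twice.
        prev_label = label;
    }
    return result;
}
```

    For example, with a table `{"-", "a", "b"}`, blank label 0, and the label sequence `{0, 1, 1, 0, 1, 2, 2, 0}`, the repeated 1s and 2s collapse while the blank in the middle keeps the two "a" characters apart, so the function returns "aab".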
    
    

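    One more pitfall worth mentioning. When porting a model between frameworks, you have to compare the numerical output at every stage. At first my results did not match, so I went through the pipeline step by step to find where the values first diverged, and in the end it turned out that the pixel values already differed right after OpenCV's imread on the two sides!
    The cause was a mismatch in OpenCV versions: one side used opencv3.3 and the other opencv3.4.10. So for this kind of work the library versions need to match exactly, otherwise you run into unexpected problems.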
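
    The stage-by-stage comparison can be automated by diffing the txt dumps from both frameworks. The sketch below assumes the dump layout described earlier in this post (an optional first line holding the shape, then one value per line); whether `save_tensor_txt` writes exactly that layout is an assumption, and the names `read_dump`, `compare_dumps`, and `DiffReport` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads one floating-point value per line. If skip_shape_line is true, the
// first line (assumed to hold a shape such as "64,3,7,7") is discarded.
std::vector<double> read_dump(std::istream& in, bool skip_shape_line) {
    std::vector<double> values;
    std::string line;
    if (skip_shape_line) std::getline(in, line);
    while (std::getline(in, line)) {
        // Skip empty or whitespace-only lines.
        if (line.find_first_not_of(" \t\r") == std::string::npos) continue;
        values.push_back(std::stod(line));
    }
    return values;
}

struct DiffReport {
    bool same_size;        // false if the two dumps have different lengths
    double max_abs_diff;   // largest |a[i] - b[i]| over the common prefix
    long first_bad_index;  // first index with a difference above tol, or -1
};

// Compares two dumps element by element against an absolute tolerance.
DiffReport compare_dumps(const std::vector<double>& a,
                         const std::vector<double>& b, double tol) {
    assert(tol >= 0.0);
    DiffReport report{a.size() == b.size(), 0.0, -1};
    const std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        const double diff = std::fabs(a[i] - b[i]);
        report.max_abs_diff = std::max(report.max_abs_diff, diff);
        if (diff > tol && report.first_bad_index < 0) {
            report.first_bad_index = static_cast<long>(i);
        }
    }
    return report;
}
```

    In practice the two streams would be `std::ifstream` objects opened on the dump of the same stage from each side (for example the input tensor saved right after preprocessing). Running the check from the first stage onward shows where the values start to diverge; a mismatch already at the input dump points at image loading or preprocessing rather than at the network.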

    A good memory is no match for a well-worn keyboard: bit by bit, accumulate and improve!
  • Original article: https://www.cnblogs.com/yanghailin/p/15599428.html