• LibTorch in Practice 6: U2Net Deployment (Part 3)


    •  Introduction
    • 1. Data Annotation
    • 2. Model Evaluation
    • 3. Source Code Walkthrough
    • 4. LibTorch Deployment
    • 5. Performance Analysis
    • 6. Problem Log

    Introduction

    U2-Net comes in two variants:

    • U2NET --- 173.6 MB (about 40 million parameters)
    • U2NETP --- 4.7 MB (about 1 million parameters)
    (For scale: YOLOv5s has about 7 million parameters, VGG-16 about 138 million, ResNet-152 about 60 million.)

    Project page: https://github.com/xuebinqin/U-2-Net

    1. Human segmentation model: u2net_human_seg.pth. Download it into ./saved_models/u2net_human_seg/ (create the folder if it does not exist).
    2. Copy your images into ./test_data/test_human_images/.
    3. Run python u2net_human_seg_test.py; the results are saved automatically to ./test_data/u2net_test_human_images_results/.
    (Note: this model was trained with some improvements over the base U2Net, such as extra data augmentation. The sample annotations were not especially precise, yet it still beats the official U2Net trained on DUTS-TR. As a general-purpose human detection/segmentation model it performs remarkably well. It was pretrained on the Supervisely Person Dataset: 5,711 images with 6,884 high-quality annotated person instances.)
      Many people have put U2-Net to creative use: portrait drawing [1], sketching, background removal, and so on; see the GitHub README for the rest.
    Here we discuss only semantic segmentation, not instance segmentation.


    1. Data Annotation

    Annotate with labelImg; the annotations are saved as JSON files, and the JSON file -> mask image conversion is up to you (a sketch follows below).
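A minimal conversion sketch. It assumes labelme-style JSON, where each shape stores its polygon under "points" and the image size lives in "imageHeight"/"imageWidth"; adjust the keys to whatever your annotation tool actually emits.

# JSON -> mask sketch; the field names below are assumptions (labelme-style).
import glob
import json
import os

import cv2
import numpy as np

def json_to_mask(json_path, out_dir):
    with open(json_path, 'r', encoding='utf-8') as f:
        ann = json.load(f)
    mask = np.zeros((ann['imageHeight'], ann['imageWidth']), dtype=np.uint8)
    for shape in ann['shapes']:
        pts = np.array(shape['points'], dtype=np.int32)
        cv2.fillPoly(mask, [pts], 255)  # foreground = 255, background = 0
    name = os.path.splitext(os.path.basename(json_path))[0]
    cv2.imwrite(os.path.join(out_dir, name + '.png'), mask)

if __name__ == '__main__':
    os.makedirs('gt', exist_ok=True)
    for p in glob.glob('annotations/*.json'):
        json_to_mask(p, 'gt')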

      U2-Net is evaluated on several datasets:

    Training set: the network is trained on DUTS-TR, the training half of the DUTS dataset. DUTS-TR contains 10,553 images in total; it is currently the largest commonly used training set for salient object detection. Before training it is augmented by horizontal flipping, giving 21,106 images.

    Evaluation sets: six common benchmarks are used to test the model: DUT-OMRON, DUTS-TE, HKU-IS, ECSSD, PASCAL-S, and SOD.

      DUT-OMRON: 5,168 images, most containing one or more foreground objects.

      DUTS: composed of two parts, DUTS-TR (training) and DUTS-TE (test). DUTS-TE has 5,019 images and is used for testing.

      HKU-IS: 4,447 images, many containing multiple foreground objects.

      ECSSD: 1,000 structurally complex images, many with large foreground objects.

      PASCAL-S: 850 images with complex foreground objects and cluttered backgrounds.

      SOD: only 300 images, but a serious challenge: it was originally designed for image segmentation, and many of its images have low contrast or complex foreground objects overlapping the image boundary.

    2. Model Evaluation (see the original paper for details)

    2.1 Loss Function

      First consider the loss used for semantic segmentation: it is simply per-pixel cross-entropy, here a binary classification per pixel (foreground vs. background).
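The equation image from the original post did not survive; the per-pixel weighted binary cross-entropy it refers to has the standard form (a reconstruction, with p the predicted foreground probability and y ∈ {0, 1} the label):

\ell(p, y) = -\left[\, w_{\mathrm{pos}}\, y \log p + (1 - y) \log(1 - p) \,\right]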

        In the equation above, the pos_weight term balances the positive/negative class imbalance; it came up in the YOLOv1 objective, so no more on it here. Below we only discuss semantic segmentation, not instance segmentation.

      The loss is computed pixel by pixel as a binary classification. For a segmented region, pixels outside the boundary (background) and inside it (foreground) are generally easy to classify; only the pixels right on the boundary are hard. How do we address that? See the Focal-loss-style term below.

      In the formula below, γ (gamma) is usually set to 2. For example, if a positive pixel is predicted with probability 0.95, the modulating factor (1 − p)^γ scales its contribution down to 0.0025; the intent is that easy-to-classify pixels should not dominate training. If instead a pixel is predicted at 0.5, the factor is 0.25, far larger than 0.0025, so the network pays much more attention to these hard, ambiguous pixels.

      In the last formula below, α = (number of negative samples) / (number of positive samples).
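The formula image is likewise missing; the standard focal loss it describes is (a reconstruction, consistent with the 0.0025 and 0.25 examples above for γ = 2):

\mathrm{FL}(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t), \qquad p_t = \begin{cases} p & y = 1 \\ 1 - p & y = 0 \end{cases}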

     2.2 Evaluation Metrics

    IoU: in the figure on the right (a per-class confusion matrix), the Y axis is the annotated class, the X axis the predicted class, and the numbers in the grid are the pixel counts for each class pair. For example, the green box marks the total pixel count of the annotated region ROI1 (true_dog) and the yellow box that of the predicted region ROI2 (predict_dog); iou_dog is then computed as follows:

       In the next figure, the left image is the annotated person region and the right image is the model's prediction.

       The left image below shows the intersection and union of the two.

       Generally, in instance segmentation with multiple classes a, b, c, ..., IoU is computed per class and then averaged, giving the mIoU.
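As a concrete reference, a per-class IoU / mIoU computation might look like this (a sketch; pred and gt are integer label maps of equal shape):

import numpy as np

def mean_iou(pred, gt, num_classes):
    """Per-class IoU averaged over the classes present in pred or gt."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))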

      U2-Net (note: this is not instance segmentation!) uses the following metrics:

    PR curve: by thresholding the predicted mask at a series of values and comparing against the ground-truth mask, compute precision (TP/(TP + FP)) and recall (TP/(TP + FN)).

    MAE: Mean Absolute Error between the predicted map and the ground truth.
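For a predicted map P and ground truth G of size H × W, this is:

\mathrm{MAE} = \frac{1}{H \times W} \sum_{r=1}^{H} \sum_{c=1}^{W} \left| P(r, c) - G(r, c) \right|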

    There are a few more metrics in the paper that I won't bother covering here.

    3. Source Code Walkthrough

    3.0 Environment: PyTorch 1.7.1 + CUDA 11.0 (the same environment as for YOLOv5 v4.0; reuse it directly).
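For reference, the pip command for this combination was roughly the following (an assumption from the historical PyTorch install matrix; check pytorch.org for your exact platform):

pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 -f https://download.pytorch.org/whl/torch_stable.html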

    3.1 Preparation

    Download the source: git clone https://github.com/NathanUA/U-2-Net.git

    Download the pretrained models: u2net.pth (176.3 MB) or u2netp.pth (4.7 MB), and put them under './saved_models/u2net/' and './saved_models/u2netp/' respectively (create the folders if needed).

    Train or test: python u2net_train.py or python u2net_test.py

    3.2 Training Code Walkthrough

    u2net_train.py (if you hit errors, see Section 6, Problem Log; the version below already includes the fixes):

import os

os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'  # OMP: Error (see Section 6)
import torch
from torch.autograd import Variable
import torch.nn as nn

from torch.utils.data import DataLoader
from torchvision import transforms
import torch.optim as optim

import glob
import os

from data_loader import RescaleT
from data_loader import RandomCrop
from data_loader import ToTensorLab
from data_loader import SalObjDataset

from model import U2NET
from model import U2NETP

# ------- 1. define loss function --------

bce_loss = nn.BCELoss(size_average=True)

# loss1-6: losses of the six side-output maps (each upsampled to input size)
# loss0: loss of the final fused output map
def muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v):
    loss0 = bce_loss(d0, labels_v)
    loss1 = bce_loss(d1, labels_v)
    loss2 = bce_loss(d2, labels_v)
    loss3 = bce_loss(d3, labels_v)
    loss4 = bce_loss(d4, labels_v)
    loss5 = bce_loss(d5, labels_v)
    loss6 = bce_loss(d6, labels_v)

    loss = loss0 + loss1 + loss2 + loss3 + loss4 + loss5 + loss6
    print("l0: %3f, l1: %3f, l2: %3f, l3: %3f, l4: %3f, l5: %3f, l6: %3f\n" % (
    loss0.data.item(), loss1.data.item(), loss2.data.item(), loss3.data.item(), loss4.data.item(), loss5.data.item(),
    loss6.data.item()))

    return loss0, loss


# ------- 2. set the directory of training dataset --------

model_name = 'u2net'  # 'u2netp'

data_dir = os.path.join(os.getcwd(), 'train_data' + os.sep)
# tra_image_dir = os.path.join('DUTS', 'DUTS-TR', 'DUTS-TR', 'im_aug' + os.sep)
# tra_label_dir = os.path.join('DUTS', 'DUTS-TR', 'DUTS-TR', 'gt_aug' + os.sep)

tra_image_dir = os.path.join('APDrawingGAN_test', 'im' + os.sep)
tra_label_dir = os.path.join('APDrawingGAN_test', 'gt' + os.sep)

image_ext = '.jpg'
label_ext = '.png'

model_dir = os.path.join(os.getcwd(), 'saved_models', model_name + os.sep)

# epoch_num = 100000
# batch_size_train = 12  # error: RuntimeError: CUDA out of memory.
epoch_num = 4000
batch_size_train = 4  # 8 GB of VRAM is not quite enough for 12
batch_size_val = 1
train_num = 0
val_num = 0

# note: this globs by label_ext ('.png'); use image_ext instead if your images are '.jpg'
tra_img_name_list = glob.glob(data_dir + tra_image_dir + '*' + label_ext)

tra_lbl_name_list = []
for img_path in tra_img_name_list:
    img_name = img_path.split(os.sep)[-1]

    aaa = img_name.split(".")
    bbb = aaa[0:-1]
    imidx = bbb[0]
    for i in range(1, len(bbb)):
        imidx = imidx + "." + bbb[i]

    tra_lbl_name_list.append(data_dir + tra_label_dir + imidx + label_ext)

print("---")
print("train images: ", len(tra_img_name_list))
print("train labels: ", len(tra_lbl_name_list))
print("---")

train_num = len(tra_img_name_list)

# data preprocessing
salobj_dataset = SalObjDataset(
    img_name_list=tra_img_name_list,
    lbl_name_list=tra_lbl_name_list,
    transform=transforms.Compose([
        RescaleT(320),    # rescale the image to 320*320
        RandomCrop(288),  # random-crop a 288*288 patch from the 320*320 image
        ToTensorLab(flag=0)]))
# dataloader
salobj_dataloader = DataLoader(salobj_dataset, batch_size=batch_size_train, shuffle=True, num_workers=1)

# ------- 3. define model --------
# define the net
if (model_name == 'u2net'):
    net = U2NET(3, 1)
elif (model_name == 'u2netp'):
    net = U2NETP(3, 1)

if torch.cuda.is_available():
    net.cuda()

# ------- 4. define optimizer --------
print("---define optimizer...")
# keep the learning rate small; Momentum beta1 = 0.9, RMSprop beta2 = 0.999,
# denominator constant 1e-8, weight decay = 0
optimizer = optim.Adam(net.parameters(), lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0)

# ------- 5. training process --------
print("---start training...")
ite_num = 0
running_loss = 0.0
running_tar_loss = 0.0
ite_num4val = 0
save_frq = 2000  # save the model every 2000 iterations

if __name__ == '__main__':  # error: The "freeze_support()" line can be omitted (see Section 6)
    for epoch in range(0, epoch_num):
        net.train()

        for i, data in enumerate(salobj_dataloader):
            ite_num = ite_num + 1
            ite_num4val = ite_num4val + 1

            inputs, labels = data['image'], data['label']

            inputs = inputs.type(torch.FloatTensor)
            labels = labels.type(torch.FloatTensor)

            # wrap them in Variable
            if torch.cuda.is_available():
                inputs_v, labels_v = Variable(inputs.cuda(), requires_grad=False), Variable(labels.cuda(), requires_grad=False)
            else:
                inputs_v, labels_v = Variable(inputs, requires_grad=False), Variable(labels, requires_grad=False)

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward + backward + optimize
            d0, d1, d2, d3, d4, d5, d6 = net(inputs_v)
            # all seven masks are compared directly against the label with BCE
            # loss2 (here): the loss0 of the fused output map
            # loss: the sum of all seven losses
            loss2, loss = muti_bce_loss_fusion(d0, d1, d2, d3, d4, d5, d6, labels_v)

            loss.backward()
            optimizer.step()

            # print statistics
            running_loss += loss.data.item()
            running_tar_loss += loss2.data.item()

            # del temporary outputs and loss
            del d0, d1, d2, d3, d4, d5, d6, loss2, loss

            print("[epoch: %3d/%3d, batch: %5d/%5d, ite: %d] train loss: %3f, tar: %3f " % (
                epoch + 1, epoch_num, (i + 1) * batch_size_train, train_num, ite_num, running_loss / ite_num4val,
                running_tar_loss / ite_num4val))

            if ite_num % save_frq == 0:
                torch.save(net.state_dict(), model_dir + model_name + "_bce_itr_%d_train_%3f_tar_%3f.pth" % (
                ite_num, running_loss / ite_num4val, running_tar_loss / ite_num4val))
                running_loss = 0.0
                running_tar_loss = 0.0
                net.train()  # resume training
                ite_num4val = 0

    3.3 Test Code Walkthrough

    u2net_test.py 

import os
from skimage import io, transform
import torch
import torchvision
from torch.autograd import Variable
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms  # , utils
# import torch.optim as optim

import numpy as np
from PIL import Image
import glob

from data_loader import RescaleT
from data_loader import ToTensor
from data_loader import ToTensorLab
from data_loader import SalObjDataset

from model import U2NET  # full size version 173.6 MB
from model import U2NETP  # small version u2net 4.7 MB


# normalize the predicted SOD probability map
def normPRED(d):
    ma = torch.max(d)
    mi = torch.min(d)

    dn = (d - mi) / (ma - mi)

    return dn


def save_output(image_name, pred, d_dir):
    predict = pred
    predict = predict.squeeze()
    predict_np = predict.cpu().data.numpy()

    im = Image.fromarray(predict_np * 255).convert('RGB')
    img_name = image_name.split(os.sep)[-1]
    image = io.imread(image_name)
    imo = im.resize((image.shape[1], image.shape[0]), resample=Image.BILINEAR)

    pb_np = np.array(imo)

    aaa = img_name.split(".")
    bbb = aaa[0:-1]
    imidx = bbb[0]
    for i in range(1, len(bbb)):
        imidx = imidx + "." + bbb[i]

    imo.save(d_dir + imidx + '.png')


def main():
    # --------- 1. get image path and name ---------
    model_name = 'u2net'  # u2netp

    image_dir = os.path.join(os.getcwd(), 'test_data', 'test_images')
    prediction_dir = os.path.join(os.getcwd(), 'test_data', model_name + '_results' + os.sep)
    model_dir = os.path.join(os.getcwd(), 'saved_models', model_name, model_name + '.pth')

    img_name_list = glob.glob(image_dir + os.sep + '*')
    print(img_name_list)

    # --------- 2. dataloader ---------
    # 1. dataloader
    test_salobj_dataset = SalObjDataset(img_name_list=img_name_list,
                                        lbl_name_list=[],
                                        transform=transforms.Compose([RescaleT(320),  # rescale to 320*320
                                                                      ToTensorLab(flag=0)])
                                        )
    test_salobj_dataloader = DataLoader(test_salobj_dataset,
                                        batch_size=1,
                                        shuffle=False,
                                        num_workers=1)
    # --------- 3. model define ---------
    if (model_name == 'u2net'):
        print("...load U2NET---173.6 MB")
        net = U2NET(3, 1)
    elif (model_name == 'u2netp'):
        print("...load U2NETP---4.7 MB")
        net = U2NETP(3, 1)

    if torch.cuda.is_available():
        net.load_state_dict(torch.load(model_dir))
        net.cuda()
    else:
        net.load_state_dict(torch.load(model_dir, map_location='cpu'))
    net.eval()

    # count the parameters (by shiruiyu)
    num_params = 0
    for param in net.parameters():
        num_params += param.numel()
    print("numbers of parameters: ", num_params / 1e6, "million")

    # --------- 4. inference for each image ---------
    for i_test, data_test in enumerate(test_salobj_dataloader):

        print("inferencing:", img_name_list[i_test].split(os.sep)[-1])

        inputs_test = data_test['image']
        inputs_test = inputs_test.type(torch.FloatTensor)

        if torch.cuda.is_available():
            inputs_test = Variable(inputs_test.cuda())
        else:
            inputs_test = Variable(inputs_test)

        d1, d2, d3, d4, d5, d6, d7 = net(inputs_test)

        # normalization
        pred = d1[:, 0, :, :]  # at inference time only the fused output map d1 is used
        pred = normPRED(pred)

        # save results to test_results folder
        if not os.path.exists(prediction_dir):
            os.makedirs(prediction_dir, exist_ok=True)
        save_output(img_name_list[i_test], pred, prediction_dir)

        del d1, d2, d3, d4, d5, d6, d7


if __name__ == "__main__":
    main()

    3.4 Network Model Walkthrough

    Keep Figure 4 of the paper at hand while reading u2net.py.

    First, the classes and functions it defines: REBNCONV, _upsample_like, RSU7, RSU6, RSU5, RSU4, RSU4F, U2NET, and U2NETP.

    Read the code side by side with the figure; the comments below are detailed.

import torch
import torch.nn as nn
import torch.nn.functional as F

# note: the latest U2Net code interpolates the input directly to 320*320; the random
#       crop that used to follow is gone
# below, in_ch, mid_ch, out_ch are the input, intermediate and output channel counts
# CBR block: Conv + BN + ReLU (possibly with dilated convolution)
class REBNCONV(nn.Module):
    def __init__(self, in_ch=3, out_ch=3, dirate=1):
        super(REBNCONV, self).__init__()
        # dilation: dilated-convolution rate
        self.conv_s1 = nn.Conv2d(in_ch, out_ch, 3, padding=1 * dirate, dilation=1 * dirate)
        self.bn_s1 = nn.BatchNorm2d(out_ch)
        self.relu_s1 = nn.ReLU(inplace=True)

    def forward(self, x):
        hx = x
        xout = self.relu_s1(self.bn_s1(self.conv_s1(hx)))

        return xout


# upsampling: input and output have the same channels; only W and H are rescaled
# upsample tensor 'src' to have the same spatial size with tensor 'tar'
def _upsample_like(src, tar):
    src = F.upsample(src, size=tar.shape[2:], mode='bilinear')

    return src


# Figure 4, stage 1
### RSU-7 ###
class RSU7(nn.Module):  # UNet07DRES(nn.Module):

    def __init__(self, in_ch=3, mid_ch=12, out_ch=3):
        super(RSU7, self).__init__()

        self.rebnconvin = REBNCONV(in_ch, out_ch, dirate=1)

        self.rebnconv1 = REBNCONV(out_ch, mid_ch, dirate=1)
        self.pool1 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv2 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool2 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv3 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool3 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv4 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool4 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv5 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool5 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv6 = REBNCONV(mid_ch, mid_ch, dirate=1)

        self.rebnconv7 = REBNCONV(mid_ch, mid_ch, dirate=2)

        self.rebnconv6d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv5d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv4d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv3d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv2d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv1d = REBNCONV(mid_ch * 2, out_ch, dirate=1)

    def forward(self, x):
        hx = x
        hxin = self.rebnconvin(hx)

        hx1 = self.rebnconv1(hxin)
        hx = self.pool1(hx1)

        hx2 = self.rebnconv2(hx)
        hx = self.pool2(hx2)

        hx3 = self.rebnconv3(hx)
        hx = self.pool3(hx3)

        hx4 = self.rebnconv4(hx)
        hx = self.pool4(hx4)

        hx5 = self.rebnconv5(hx)
        hx = self.pool5(hx5)

        hx6 = self.rebnconv6(hx)
        # hx7: the rightmost, smallest blue block in Figure 4, stage 1
        hx7 = self.rebnconv7(hx6)
        # the cat operations below correspond to the "+" symbols in Figure 4, stage 1
        hx6d = self.rebnconv6d(torch.cat((hx7, hx6), 1))
        hx6dup = _upsample_like(hx6d, hx5)

        hx5d = self.rebnconv5d(torch.cat((hx6dup, hx5), 1))
        hx5dup = _upsample_like(hx5d, hx4)

        hx4d = self.rebnconv4d(torch.cat((hx5dup, hx4), 1))
        hx4dup = _upsample_like(hx4d, hx3)

        hx3d = self.rebnconv3d(torch.cat((hx4dup, hx3), 1))
        hx3dup = _upsample_like(hx3d, hx2)

        hx2d = self.rebnconv2d(torch.cat((hx3dup, hx2), 1))
        hx2dup = _upsample_like(hx2d, hx1)
        # hx1d: the rightmost purple block in Figure 4, stage 1
        hx1d = self.rebnconv1d(torch.cat((hx2dup, hx1), 1))

        return hx1d + hxin


# Figure 4, stage 2
### RSU-6 ###
class RSU6(nn.Module):  # UNet06DRES(nn.Module):

    def __init__(self, in_ch=3, mid_ch=12, out_ch=3):
        super(RSU6, self).__init__()

        self.rebnconvin = REBNCONV(in_ch, out_ch, dirate=1)

        self.rebnconv1 = REBNCONV(out_ch, mid_ch, dirate=1)
        self.pool1 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv2 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool2 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv3 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool3 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv4 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool4 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv5 = REBNCONV(mid_ch, mid_ch, dirate=1)

        self.rebnconv6 = REBNCONV(mid_ch, mid_ch, dirate=2)

        self.rebnconv5d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv4d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv3d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv2d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv1d = REBNCONV(mid_ch * 2, out_ch, dirate=1)

    def forward(self, x):
        hx = x

        hxin = self.rebnconvin(hx)

        hx1 = self.rebnconv1(hxin)
        hx = self.pool1(hx1)

        hx2 = self.rebnconv2(hx)
        hx = self.pool2(hx2)

        hx3 = self.rebnconv3(hx)
        hx = self.pool3(hx3)

        hx4 = self.rebnconv4(hx)
        hx = self.pool4(hx4)

        hx5 = self.rebnconv5(hx)

        hx6 = self.rebnconv6(hx5)

        hx5d = self.rebnconv5d(torch.cat((hx6, hx5), 1))
        hx5dup = _upsample_like(hx5d, hx4)

        hx4d = self.rebnconv4d(torch.cat((hx5dup, hx4), 1))
        hx4dup = _upsample_like(hx4d, hx3)

        hx3d = self.rebnconv3d(torch.cat((hx4dup, hx3), 1))
        hx3dup = _upsample_like(hx3d, hx2)

        hx2d = self.rebnconv2d(torch.cat((hx3dup, hx2), 1))
        hx2dup = _upsample_like(hx2d, hx1)

        hx1d = self.rebnconv1d(torch.cat((hx2dup, hx1), 1))

        return hx1d + hxin


# Figure 4, stage 3
### RSU-5 ###
class RSU5(nn.Module):  # UNet05DRES(nn.Module):

    def __init__(self, in_ch=3, mid_ch=12, out_ch=3):
        super(RSU5, self).__init__()

        self.rebnconvin = REBNCONV(in_ch, out_ch, dirate=1)

        self.rebnconv1 = REBNCONV(out_ch, mid_ch, dirate=1)
        self.pool1 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv2 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool2 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv3 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool3 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv4 = REBNCONV(mid_ch, mid_ch, dirate=1)

        self.rebnconv5 = REBNCONV(mid_ch, mid_ch, dirate=2)

        self.rebnconv4d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv3d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv2d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv1d = REBNCONV(mid_ch * 2, out_ch, dirate=1)

    def forward(self, x):
        hx = x

        hxin = self.rebnconvin(hx)

        hx1 = self.rebnconv1(hxin)
        hx = self.pool1(hx1)

        hx2 = self.rebnconv2(hx)
        hx = self.pool2(hx2)

        hx3 = self.rebnconv3(hx)
        hx = self.pool3(hx3)

        hx4 = self.rebnconv4(hx)

        hx5 = self.rebnconv5(hx4)

        hx4d = self.rebnconv4d(torch.cat((hx5, hx4), 1))
        hx4dup = _upsample_like(hx4d, hx3)

        hx3d = self.rebnconv3d(torch.cat((hx4dup, hx3), 1))
        hx3dup = _upsample_like(hx3d, hx2)

        hx2d = self.rebnconv2d(torch.cat((hx3dup, hx2), 1))
        hx2dup = _upsample_like(hx2d, hx1)

        hx1d = self.rebnconv1d(torch.cat((hx2dup, hx1), 1))

        return hx1d + hxin


# Figure 4, stage 4
### RSU-4 ###
class RSU4(nn.Module):  # UNet04DRES(nn.Module):

    def __init__(self, in_ch=3, mid_ch=12, out_ch=3):
        super(RSU4, self).__init__()

        self.rebnconvin = REBNCONV(in_ch, out_ch, dirate=1)

        self.rebnconv1 = REBNCONV(out_ch, mid_ch, dirate=1)
        self.pool1 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv2 = REBNCONV(mid_ch, mid_ch, dirate=1)
        self.pool2 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.rebnconv3 = REBNCONV(mid_ch, mid_ch, dirate=1)

        self.rebnconv4 = REBNCONV(mid_ch, mid_ch, dirate=2)

        self.rebnconv3d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv2d = REBNCONV(mid_ch * 2, mid_ch, dirate=1)
        self.rebnconv1d = REBNCONV(mid_ch * 2, out_ch, dirate=1)

    def forward(self, x):
        hx = x

        hxin = self.rebnconvin(hx)

        hx1 = self.rebnconv1(hxin)
        hx = self.pool1(hx1)

        hx2 = self.rebnconv2(hx)
        hx = self.pool2(hx2)

        hx3 = self.rebnconv3(hx)

        hx4 = self.rebnconv4(hx3)

        hx3d = self.rebnconv3d(torch.cat((hx4, hx3), 1))
        hx3dup = _upsample_like(hx3d, hx2)

        hx2d = self.rebnconv2d(torch.cat((hx3dup, hx2), 1))
        hx2dup = _upsample_like(hx2d, hx1)

        hx1d = self.rebnconv1d(torch.cat((hx2dup, hx1), 1))

        return hx1d + hxin


# Figure 4, stages 5 and 6
### RSU-4F ###
class RSU4F(nn.Module):  # UNet04FRES(nn.Module):

    def __init__(self, in_ch=3, mid_ch=12, out_ch=3):
        super(RSU4F, self).__init__()

        self.rebnconvin = REBNCONV(in_ch, out_ch, dirate=1)

        self.rebnconv1 = REBNCONV(out_ch, mid_ch, dirate=1)
        self.rebnconv2 = REBNCONV(mid_ch, mid_ch, dirate=2)
        self.rebnconv3 = REBNCONV(mid_ch, mid_ch, dirate=4)

        self.rebnconv4 = REBNCONV(mid_ch, mid_ch, dirate=8)

        self.rebnconv3d = REBNCONV(mid_ch * 2, mid_ch, dirate=4)
        self.rebnconv2d = REBNCONV(mid_ch * 2, mid_ch, dirate=2)
        self.rebnconv1d = REBNCONV(mid_ch * 2, out_ch, dirate=1)

    def forward(self, x):
        hx = x

        hxin = self.rebnconvin(hx)

        hx1 = self.rebnconv1(hxin)
        hx2 = self.rebnconv2(hx1)
        hx3 = self.rebnconv3(hx2)

        hx4 = self.rebnconv4(hx3)

        hx3d = self.rebnconv3d(torch.cat((hx4, hx3), 1))
        hx2d = self.rebnconv2d(torch.cat((hx3d, hx2), 1))
        hx1d = self.rebnconv1d(torch.cat((hx2d, hx1), 1))

        return hx1d + hxin


# Full model, ~40 million parameters. Compared with the small model below, the
# network width (kernels per layer) grows by factors of 2, 4, 8 with depth,
# which is why the file is so much larger.
##### U^2-Net ####
class U2NET(nn.Module):
    def __init__(self, in_ch=3, out_ch=1):
        super(U2NET, self).__init__()

        self.stage1 = RSU7(in_ch, 32, 64)
        self.pool12 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage2 = RSU6(64, 32, 128)
        self.pool23 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage3 = RSU5(128, 64, 256)
        self.pool34 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage4 = RSU4(256, 128, 512)
        self.pool45 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage5 = RSU4F(512, 256, 512)
        self.pool56 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage6 = RSU4F(512, 256, 512)

        # decoder
        self.stage5d = RSU4F(1024, 256, 512)
        self.stage4d = RSU4(1024, 128, 256)
        self.stage3d = RSU5(512, 64, 128)
        self.stage2d = RSU6(256, 32, 64)
        self.stage1d = RSU7(128, 16, 64)

        self.side1 = nn.Conv2d(64, out_ch, 3, padding=1)
        self.side2 = nn.Conv2d(64, out_ch, 3, padding=1)
        self.side3 = nn.Conv2d(128, out_ch, 3, padding=1)
        self.side4 = nn.Conv2d(256, out_ch, 3, padding=1)
        self.side5 = nn.Conv2d(512, out_ch, 3, padding=1)
        self.side6 = nn.Conv2d(512, out_ch, 3, padding=1)

        self.outconv = nn.Conv2d(6 * out_ch, out_ch, 1)

    def forward(self, x):
        hx = x  # torch.Size([1, 3, 320, 320])  note: in the paper the training input is 1*3*288*288; shapes below assume 320*320
        # print('hx.shape = ', hx.shape)

        # stage 1 (En_1)
        hx1 = self.stage1(hx)  # torch.Size([1, 64, 320, 320])
        hx = self.pool12(hx1)  # torch.Size([1, 64, 160, 160])

        # stage 2 (En_2)
        hx2 = self.stage2(hx)  # torch.Size([1, 128, 160, 160])
        hx = self.pool23(hx2)  # torch.Size([1, 128, 80, 80])

        # stage 3 (En_3)
        hx3 = self.stage3(hx)  # torch.Size([1, 256, 80, 80])
        hx = self.pool34(hx3)  # torch.Size([1, 256, 40, 40])

        # stage 4 (En_4)
        hx4 = self.stage4(hx)  # torch.Size([1, 512, 40, 40])
        hx = self.pool45(hx4)  # torch.Size([1, 512, 20, 20])

        # stage 5 (En_5)
        hx5 = self.stage5(hx)  # torch.Size([1, 512, 20, 20])
        hx = self.pool56(hx5)  # torch.Size([1, 512, 10, 10])

        # stage 6 (En_6)
        hx6 = self.stage6(hx)  # torch.Size([1, 512, 10, 10])
        hx6up = _upsample_like(hx6, hx5)  # torch.Size([1, 512, 20, 20])

        # -------------------- decoder --------------------
        # De_5
        hx5d = self.stage5d(torch.cat((hx6up, hx5), 1))  # torch.Size([1, 512, 20, 20])
        hx5dup = _upsample_like(hx5d, hx4)  # torch.Size([1, 512, 40, 40])
        # De_4
        hx4d = self.stage4d(torch.cat((hx5dup, hx4), 1))  # torch.Size([1, 256, 40, 40])
        hx4dup = _upsample_like(hx4d, hx3)  # torch.Size([1, 256, 80, 80])
        # De_3
        hx3d = self.stage3d(torch.cat((hx4dup, hx3), 1))  # torch.Size([1, 128, 80, 80])
        hx3dup = _upsample_like(hx3d, hx2)  # torch.Size([1, 128, 160, 160])
        # De_2
        hx2d = self.stage2d(torch.cat((hx3dup, hx2), 1))  # torch.Size([1, 64, 160, 160])
        hx2dup = _upsample_like(hx2d, hx1)  # torch.Size([1, 64, 320, 320])
        # De_1
        hx1d = self.stage1d(torch.cat((hx2dup, hx1), 1))  # torch.Size([1, 64, 320, 320])

        # side output
        # already at full resolution, no upsampling needed
        d1 = self.side1(hx1d)  # torch.Size([1, 1, 320, 320])
        # 2x upsampling
        d2 = self.side2(hx2d)  # torch.Size([1, 1, 160, 160])
        d2 = _upsample_like(d2, d1)  # torch.Size([1, 1, 320, 320])
        # 4x upsampling
        d3 = self.side3(hx3d)  # torch.Size([1, 1, 80, 80])
        d3 = _upsample_like(d3, d1)  # torch.Size([1, 1, 320, 320])
        # 8x upsampling
        d4 = self.side4(hx4d)  # torch.Size([1, 1, 40, 40])
        d4 = _upsample_like(d4, d1)  # torch.Size([1, 1, 320, 320])
        # 16x upsampling
        d5 = self.side5(hx5d)  # torch.Size([1, 1, 20, 20])
        d5 = _upsample_like(d5, d1)  # torch.Size([1, 1, 320, 320])
        # 32x upsampling
        d6 = self.side6(hx6)  # torch.Size([1, 1, 10, 10])
        d6 = _upsample_like(d6, d1)  # torch.Size([1, 1, 320, 320])
        # concat + 1x1 convolution
        d0 = self.outconv(torch.cat((d1, d2, d3, d4, d5, d6), 1))  # torch.Size([1, 1, 320, 320])
        # F.sigmoid is deprecated; torch.sigmoid() is the modern equivalent
        return F.sigmoid(d0), F.sigmoid(d1), F.sigmoid(d2), F.sigmoid(d3), F.sigmoid(d4), F.sigmoid(d5), F.sigmoid(d6)


# Small model, ~1 million parameters
### U^2-Net small ###
class U2NETP(nn.Module):

    def __init__(self, in_ch=3, out_ch=1):
        super(U2NETP, self).__init__()

        self.stage1 = RSU7(in_ch, 16, 64)
        self.pool12 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage2 = RSU6(64, 16, 64)
        self.pool23 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage3 = RSU5(64, 16, 64)
        self.pool34 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage4 = RSU4(64, 16, 64)
        self.pool45 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage5 = RSU4F(64, 16, 64)
        self.pool56 = nn.MaxPool2d(2, stride=2, ceil_mode=True)

        self.stage6 = RSU4F(64, 16, 64)

        # decoder
        self.stage5d = RSU4F(128, 16, 64)
        self.stage4d = RSU4(128, 16, 64)
        self.stage3d = RSU5(128, 16, 64)
        self.stage2d = RSU6(128, 16, 64)
        self.stage1d = RSU7(128, 16, 64)

        self.side1 = nn.Conv2d(64, out_ch, 3, padding=1)
        self.side2 = nn.Conv2d(64, out_ch, 3, padding=1)
        self.side3 = nn.Conv2d(64, out_ch, 3, padding=1)
        self.side4 = nn.Conv2d(64, out_ch, 3, padding=1)
        self.side5 = nn.Conv2d(64, out_ch, 3, padding=1)
        self.side6 = nn.Conv2d(64, out_ch, 3, padding=1)

        self.outconv = nn.Conv2d(6 * out_ch, out_ch, 1)

    def forward(self, x):
        hx = x

        # stage 1
        hx1 = self.stage1(hx)
        hx = self.pool12(hx1)

        # stage 2
        hx2 = self.stage2(hx)
        hx = self.pool23(hx2)

        # stage 3
        hx3 = self.stage3(hx)
        hx = self.pool34(hx3)

        # stage 4
        hx4 = self.stage4(hx)
        hx = self.pool45(hx4)

        # stage 5
        hx5 = self.stage5(hx)
        hx = self.pool56(hx5)

        # stage 6
        hx6 = self.stage6(hx)
        hx6up = _upsample_like(hx6, hx5)

        # decoder
        hx5d = self.stage5d(torch.cat((hx6up, hx5), 1))
        hx5dup = _upsample_like(hx5d, hx4)

        hx4d = self.stage4d(torch.cat((hx5dup, hx4), 1))
        hx4dup = _upsample_like(hx4d, hx3)

        hx3d = self.stage3d(torch.cat((hx4dup, hx3), 1))
        hx3dup = _upsample_like(hx3d, hx2)

        hx2d = self.stage2d(torch.cat((hx3dup, hx2), 1))
        hx2dup = _upsample_like(hx2d, hx1)

        hx1d = self.stage1d(torch.cat((hx2dup, hx1), 1))

        # side output
        d1 = self.side1(hx1d)

        d2 = self.side2(hx2d)
        d2 = _upsample_like(d2, d1)

        d3 = self.side3(hx3d)
        d3 = _upsample_like(d3, d1)

        d4 = self.side4(hx4d)
        d4 = _upsample_like(d4, d1)

        d5 = self.side5(hx5d)
        d5 = _upsample_like(d5, d1)

        d6 = self.side6(hx6)
        d6 = _upsample_like(d6, d1)

        d0 = self.outconv(torch.cat((d1, d2, d3, d4, d5, d6), 1))

        return F.sigmoid(d0), F.sigmoid(d1), F.sigmoid(d2), F.sigmoid(d3), F.sigmoid(d4), F.sigmoid(d5), F.sigmoid(d6)

    4. LibTorch Deployment

    Model export script (Python):

    export_u2net.py

    (Only the CPU export is shown here. In LibTorch the same traced CPU model works for both CPU and GPU, because both the module and the input tensors can be moved to the GPU.)

import os
import torch
from model import U2NET  # full size version 173.6 MB


def main():
    model_name = 'u2net'
    model_dir = os.path.join(os.getcwd(), 'saved_models', model_name + '_human_seg', model_name + '_human_seg.pth')

    if model_name == 'u2net':
        print("...load U2NET---173.6 MB")
        net = U2NET(3, 1)

    net.load_state_dict(torch.load(model_dir, map_location=torch.device('cpu')))
    net.eval()

    # --------- serialize (trace) the model ---------
    # example = torch.zeros(1, 3, 512, 512).to(device='cuda')
    example = torch.zeros(1, 3, 512, 512)
    torch_script_module = torch.jit.trace(net, example)
    torch_script_module.save('human2-cpu.pt')
    print('over')


if __name__ == "__main__":
    main()
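Before moving to C++, it is worth sanity-checking the traced module back in Python (a quick sketch, assuming the human2-cpu.pt file saved above):

import torch

m = torch.jit.load('human2-cpu.pt')
m.eval()
with torch.no_grad():
    outs = m(torch.zeros(1, 3, 512, 512))
print(len(outs), outs[0].shape)  # expect 7 maps; the first is the fused output [1, 1, 512, 512]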

    Deployment code:

    Configuration file Config.yaml:

%YAML:1.0
# note: 1. when changing a file name, keep the quotation marks; plain variables do not need them
#       2. image resolution >
#       3. comments in this file must be on their own line
#       4. all data read or saved by the project defaults to dir: "D://Data//"

# data directory
dir: "D:\\Data\\"

# source image
srcImgFile: "img_1589.png"


# ****************************************************************** deep learning ***********************************************************************
# style-transfer model file name
styleModelFile: "D:\\U-2-Net-master\\human1-gpu.pt"

    Configuration code: Config.h, Config.cpp

#ifndef CONFIG_H
#define CONFIG_H

#include<opencv2/opencv.hpp>
#include<iostream>

class Config
{
public:
    Config(const std::string& yamlFile);
    ~Config();

    template<typename T>
    T get(const std::string& key)
    {
        return T(this->m_fileStorage[key]);
    }

private:
    std::string m_yamlFile;
    cv::FileStorage m_fileStorage;
};

#endif // !Config_H
     1 #include "Config.h"
     2 
     3 Config::Config(const std::string& yamlFile):
     4     m_yamlFile(yamlFile)
     5 {
     6     this->m_fileStorage.open(this->m_yamlFile, cv::FileStorage::READ);
     7     if (!this->m_fileStorage.isOpened())
     8     {
     9         std::cerr << "open default.yaml failurely!" << std::endl;
    10         system("pause");
    11     }
    12 }
    13 
    14 Config::~Config()
    15 {
    16 }
    View Code

    Human semantic segmentation: U2Net_Human.cpp. This threw errors again; see my post "A Summary of Common libtorch Errors on Windows" (cnblogs post 14687275).

      1 #include<opencv2/opencv.hpp>
      2 #include<torch/torch.h>
      3 #include<torch/script.h>
      4 #include"Config.h"
      5 
      6 torch::Tensor normPRED(torch::Tensor d)
      7 {
      8     at::Tensor ma, mi;
      9     torch::Tensor dn;
     10     ma = torch::max(d);
     11     mi = torch::min(d);
     12     dn = (d - mi) / (ma - mi);
     13     return dn;
     14 }
     15 
     16 void  bgr_u2net(cv::Mat& image_src, cv::Mat& result, torch::jit::Module& model)
     17 {
     18     auto device = torch::Device("cuda");
     19     //   auto image_bgr = cv::imread("bg11.png");
     20     //    auto xt = cv::imread("xt2.jpg");
     21     cv::Mat  image_src1 = image_src.clone();
     22     cv::resize(image_src, image_src, cv::Size(320, 320));
     23     cv::cvtColor(image_src, image_src, cv::COLOR_RGB2BGR);
     24     //    cv::cvtColor(image_src,image_src,cv::COLOR_BGR2RGB);
     25 
     26     torch::Tensor tensor_image_src = torch::from_blob(image_src.data, { image_src.rows, image_src.cols,3 }, torch::kByte);
     27     //    torch::Tensor tensor_image_bgr = torch::from_blob(image_bgr.data, {image_bgr.rows, image_bgr.cols,3},torch::kByte);
     28     torch::Tensor tensor_bgr = torch::from_blob(image_src1.data, { image_src1.rows, image_src1.cols,3 }, torch::kByte);
     29     tensor_image_src = tensor_image_src.permute({ 2,0,1 });
     30     tensor_image_src = tensor_image_src.toType(torch::kFloat);
     31     tensor_image_src = tensor_image_src.div(255);
     32     tensor_image_src = tensor_image_src.unsqueeze(0);
     33     //    tensor_image_bgr = tensor_image_bgr.permute({2,0,1});
     34     //    tensor_image_bgr = tensor_image_bgr.toType(torch::kFloat);
     35     //    tensor_image_bgr = tensor_image_bgr.div(255);
     36     //    tensor_image_bgr = tensor_image_bgr.unsqueeze(0);
     37     tensor_bgr = tensor_bgr.permute({ 2,0,1 });
     38     tensor_bgr = tensor_bgr.toType(torch::kFloat);
     39     tensor_bgr = tensor_bgr.div(255);
     40     tensor_bgr = tensor_bgr.unsqueeze(0);
     41     //    cv::imshow("image",tensor_image_bgr)
     42 
     43     auto src = tensor_image_src.to(device);
     44     //    auto bgr =   tensor_image_bgr.to(device);
     45     auto src_copy = tensor_bgr.to(device);
     46 
     47     auto outputs = model.forward({ src }).toTuple()->elements();
     48 
     49     auto pred = outputs[0].toTensor();
     50 
     51 
     52     //    pha = normPRED_(pha);
     53     //    auto fgr = outputs[1].toTensor();
     54     //    auto res_tensor = (pred * src + (1-pred)* torch::ones_like(src));
     55     //    double endtime=(double)(end-start)/CLOCKS_PER_SEC;
     56     //    std::cout<<"time:"<<endtime<<std::endl;
     57     //    auto res_tensor = (pred * src + (1-pred)*torch::tensor({120/255, 255/255, 155/255}).to(device).view({1,3,1,1}));
     58     auto res_tensor = (pred * torch::ones_like(src));
     59     res_tensor = normPRED(res_tensor);
     60     res_tensor = res_tensor.squeeze(0).detach();
     61     res_tensor = res_tensor.mul(255).clamp(0, 255).to(torch::kU8);
     62     res_tensor = res_tensor.to(torch::kCPU);
     63     //    cv::Mat result( image_bgr.rows,image_bgr.cols, CV_32FC3,fgr.data_ptr());
     64     cv::Mat resultImg(res_tensor.size(1), res_tensor.size(2), CV_8UC3);
     65     std::memcpy((void*)resultImg.data, res_tensor.data_ptr(), sizeof(torch::kU8) * res_tensor.numel());
     66     //    result=resultImg.clone();
     67     //    cv::cvtColor(result,result,cv::COLOR_BGR2RGB);
     68 
     69     cv::resize(resultImg, resultImg, cv::Size(image_src1.cols, image_src1.rows), cv::INTER_LINEAR);
     70     //   cv:: Mat element = getStructuringElement(cv::MORPH_RECT, cv::Size(15,15));
     71     //    cv::dilate(resultImg, resultImg, element);
     72     //    cv::threshold(resultImg, resultImg, 130, 255, cv::THRESH_BINARY);
     73     //    cv::imwrite("pha.jpg", resultImg);
     74     torch::Tensor tensor_result = torch::from_blob(resultImg.data, { resultImg.rows, resultImg.cols,3 }, torch::kByte);
     75     tensor_result = tensor_result.permute({ 2,0,1 });
     76     tensor_result = tensor_result.toType(torch::kFloat);
     77     tensor_result = tensor_result.div(255);
     78     tensor_result = tensor_result.unsqueeze(0);
     79     //    torch::Tensor  c=(tensor_result>220/255);
     80 
     81     //    tensor_result>200/255;
     82     ;
     83     //    tensor_result[tensor_result>=200/255]=1;
     84     //    res_tensor = (c * tensor_bgr -c* torch::ones_like(tensor_bgr)+torch::ones_like(tensor_bgr) );
     85     res_tensor = (tensor_result * tensor_bgr + (1 - tensor_result) * torch::ones_like(tensor_bgr));
     86     //    res_tensor = (tensor_result * tensor_bgr +(1-tensor_result)* tensor_image_bgr );
     87     res_tensor = res_tensor.squeeze(0).detach();
     88     res_tensor = res_tensor.mul(255).clamp(0, 255).to(torch::kU8);
     89     res_tensor = res_tensor.to(torch::kCPU);
     90     //    cv::Mat result( image_bgr.rows,image_bgr.cols, CV_32FC3,fgr.data_ptr());
     91     cv::Mat resultImg1(res_tensor.size(1), res_tensor.size(2), CV_8UC3);
     92     std::memcpy((void*)resultImg1.data, res_tensor.data_ptr(), sizeof(torch::kU8) * res_tensor.numel());
     93     result = resultImg1.clone();
     94 
     95 
     96 }
     97 
     98 int main()
     99 {
    100     // load srcImg
    101     Config cfg("Config.yaml");
    102     cv::Mat srcImg = cv::imread(cfg.get<std::string>("srcImgFile"), -1);
    103     cv::Mat srcImg_;
    104     cv::resize(srcImg, srcImg_, cv::Size(512, 512));
    105 
    106     std::string str = cfg.get<std::string>("styleModelFile");
    107 
    108     // load model of cpu
    109     torch::jit::script::Module styleModule;
    110     // load style model
    111     auto device_type = at::kCPU;
    112     if (torch::cuda::is_available()) {
    113         std::cout << "gpu" << std::endl;
    114         device_type = at::kCUDA;
    115     }
    116     try
    117     {
    118         styleModule = torch::jit::load(str);
    119         styleModule.to(device_type);
    120     }
    121     catch (const c10::Error& e)
    122     {
    123         std::cerr << "errir code: -2, error loading the model\n";
    124         return -1;
    125     }
    126     cv::Mat dstImg;
    127     bgr_u2net(srcImg_, dstImg, styleModule);
    128 
    129     cv::imshow("dstImg", dstImg);
    130     cv::waitKey(0);
    131 
    132     return 1;
    133 }
    View Code

    An updated U2Net_Human.cpp; apparently I'm still not fluent with libtorch.

      1 #include<opencv2/opencv.hpp>
      2 #include<torch/torch.h>
      3 #include<torch/script.h>
      4 #include"Config.h"
      5 
      6 torch::Tensor normPRED(torch::Tensor d) 
      7 {
      8     at::Tensor ma, mi;
      9     torch::Tensor dn;
     10     ma = torch::max(d);
     11     mi = torch::min(d);
     12     dn = (d - mi) / (ma - mi);
     13     return dn;
     14 }
     15 
     16 void  bgr_u2net(cv::Mat& image_src, cv::Mat& result, torch::jit::Module& model) 
     17 {
     18     auto device = torch::Device("cuda");
     19   
     20     cv::Mat  image_src1 = image_src.clone();
     21     cv::resize(image_src, image_src, cv::Size(320, 320));
     22     //cv::cvtColor(image_src, image_src, cv::COLOR_RGB2BGR);
     23     cv::cvtColor(image_src,image_src,cv::COLOR_BGR2RGB);
     24     
     25     torch::Tensor tensor_image_src = torch::from_blob(image_src.data, { image_src.rows, image_src.cols, 3 }, torch::kByte);
     26     //  torch::Tensor tensor_image_bgr = torch::from_blob(image_bgr.data, {image_bgr.rows, image_bgr.cols,3},torch::kByte);
     27     torch::Tensor tensor_bgr = torch::from_blob(image_src1.data, { image_src1.rows, image_src1.cols,3 }, torch::kByte);
     28     tensor_image_src = tensor_image_src.permute({ 2,0,1 }); // RGB -> BGR互换,有点多余
     29     tensor_image_src = tensor_image_src.toType(torch::kFloat);
     30     tensor_image_src = tensor_image_src.div(255);
     31     // [3, 320, 320] 
     32     tensor_image_src = tensor_image_src.unsqueeze(0); // 拿掉第一个维度
     33     // [1, 3, 320, 320]
     34     std::cout << tensor_image_src.sizes() << std::endl;
     35 
     36     tensor_bgr = tensor_bgr.permute({ 2,0,1 });
     37     tensor_bgr = tensor_bgr.toType(torch::kFloat);
     38     tensor_bgr = tensor_bgr.div(255);
     39     tensor_bgr = tensor_bgr.unsqueeze(0);
     40 
     41     auto src = tensor_image_src.to(device);
     42     //    auto bgr =   tensor_image_bgr.to(device);
     43     //auto src_copy = tensor_bgr.to(device);
     44 
     45     auto outputs = model.forward({ src }).toTuple()->elements();
     46 
     47     auto pred = outputs[0].toTensor();
     48     
     49     auto res_tensor = (pred * torch::ones_like(src));
     50     
     51     std::cout << torch::ones_like(src).sizes() << std::endl;
     52     std::cout << src.sizes() << std::endl;
     53     
     54     res_tensor = normPRED(res_tensor);
     55     res_tensor = res_tensor.squeeze(0).detach();
     56     res_tensor = res_tensor.mul(255).clamp(0, 255).to(torch::kU8);
     57     res_tensor = res_tensor.to(torch::kCPU);
     58     //    cv::Mat result( image_bgr.rows,image_bgr.cols, CV_32FC3,fgr.data_ptr());
     59     cv::Mat resultImg(res_tensor.size(1), res_tensor.size(2), CV_8UC3);
     60     std::memcpy((void*)resultImg.data, res_tensor.data_ptr(), sizeof(torch::kU8) * res_tensor.numel());
     61     //    result=resultImg.clone();
     62     //    cv::cvtColor(result,result,cv::COLOR_BGR2RGB);
     63 
     64     cv::resize(resultImg, resultImg, cv::Size(image_src1.cols, image_src1.rows), cv::INTER_LINEAR);
     65     //   cv:: Mat element = getStructuringElement(cv::MORPH_RECT, cv::Size(15,15));
     66     //    cv::dilate(resultImg, resultImg, element);
     67     //    cv::threshold(resultImg, resultImg, 130, 255, cv::THRESH_BINARY);
     68     //    cv::imwrite("pha.jpg", resultImg);
     69     torch::Tensor tensor_result = torch::from_blob(resultImg.data, { resultImg.rows, resultImg.cols,3 }, torch::kByte);
     70     tensor_result = tensor_result.permute({ 2,0,1 });
     71     tensor_result = tensor_result.toType(torch::kFloat);
     72     tensor_result = tensor_result.div(255);
     73     tensor_result = tensor_result.unsqueeze(0);
     74     //    torch::Tensor  c=(tensor_result>220/255);
     75 
     76     //    tensor_result>200/255;
     77     ;
     78     //    tensor_result[tensor_result>=200/255]=1;
     79     //    res_tensor = (c * tensor_bgr -c* torch::ones_like(tensor_bgr)+torch::ones_like(tensor_bgr) );
     80     res_tensor = (tensor_result * tensor_bgr + (1 - tensor_result) * torch::ones_like(tensor_bgr));
     81     //    res_tensor = (tensor_result * tensor_bgr +(1-tensor_result)* tensor_image_bgr );
     82     res_tensor = res_tensor.squeeze(0).detach();
     83     res_tensor = res_tensor.mul(255).clamp(0, 255).to(torch::kU8);
     84     res_tensor = res_tensor.to(torch::kCPU);
     85     //    cv::Mat result( image_bgr.rows,image_bgr.cols, CV_32FC3,fgr.data_ptr());
     86     cv::Mat resultImg1(res_tensor.size(1), res_tensor.size(2), CV_8UC3);
     87     std::memcpy((void*)resultImg1.data, res_tensor.data_ptr(), sizeof(torch::kU8) * res_tensor.numel());
     88     result = resultImg1.clone();
     89 
     90 
     91 }
     92 
     93 int main()
     94 {
     95     // load srcImg
     96     Config cfg("Config.yaml");
     97     cv::Mat srcImg = cv::imread(cfg.get<std::string>("srcImgFile"), -1);
     98     cv::Mat srcImg_;
     99     cv::resize(srcImg, srcImg_, cv::Size(512, 512));
    100     if (srcImg_.channels() == 4)
    101     {
    102         cv::cvtColor(srcImg_, srcImg_, cv::COLOR_BGRA2BGR);
    103     }
    104 
    105     std::string str = cfg.get<std::string>("styleModelFile");
    106 
    107     // load model of cpu
    108     torch::jit::script::Module styleModule;
    109     // load style model
    110     auto device_type = at::kCPU;
    111     if (torch::cuda::is_available()) {
    112         std::cout << "gpu" << std::endl;
    113         device_type = at::kCUDA;
    114     }
    115     try
    116     {
    117         styleModule = torch::jit::load(str);
    118         styleModule.to(device_type);
    119     }
    120     catch (const c10::Error& e)
    121     {
    122         std::cerr << "errir code: -2, error loading the model\n";
    123         return -1;
    124     }
    125     cv::Mat dstImg;
    126     bgr_u2net(srcImg_, dstImg, styleModule);
    127 
    128     cv::imshow("dstImg", dstImg);
    129     cv::waitKey(0);
    130 
    131     return 1;
    132 }
    View Code

    5. Performance Analysis

    6. Problem Log

    6.1 Errors when running u2net_train.py:

    1. OMP: Error

    Fix: add the following at the very top of the file:

    import os
    os.environ['KMP_DUPLICATE_LIB_OK'] = 'True' # OMP:Error

    2. Out of GPU memory: RuntimeError: CUDA out of memory.

    batch_size_train = 12  # change 12 to 1 (or the largest value your GPU can handle)

    3. error: The "freeze_support()" line can be omitted if the program is not going to be frozen

    if __name__ == '__main__':  # add this guard line above the epoch loop, as shown
        for epoch in range(0, epoch_num):
            ......
    References:
    [1] Portrait drawing: https://www.cvpy.net/studio/cv/func/DeepLearning/sketch/sketch/page/
