• Learning Notes 20: Image Semantic Segmentation


    An Intuitive Description of Image Semantic Segmentation

    Image semantic segmentation is pixel-level image recognition: every pixel in the image is labeled with the object class it belongs to.

    Goal: the input is typically an RGB image (height*width*3) or a grayscale image (height*width*1), and the output is a segmentation map in which every pixel carries its class label (height*width*1).
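
    As a concrete illustration (a minimal sketch with made-up shapes and values, not part of the original notes), for the binary portrait task used later a 256*256 RGB input and its label map could look like this:

    import numpy as np
    
    # hypothetical 256x256 RGB input image: height * width * 3
    rgb_image = np.zeros((256, 256, 3), dtype=np.uint8)
    
    # corresponding segmentation map: height * width, one integer class label per pixel
    # (0 = background, 1 = person in the binary portrait case below)
    label_map = np.zeros((256, 256), dtype=np.int64)
    label_map[100:200, 80:180] = 1  # mark a made-up rectangular region as "person"
    
    print(rgb_image.shape, label_map.shape)  # (256, 256, 3) (256, 256)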

    U-Net Architecture

    • The left half of U-Net consists of convolution layers

    • The right half consists of upsampling layers

    • In the convolution half, the output produced just before each pooling layer is concatenated with the output of the corresponding upsampling layer

    • The first half performs feature extraction and the second half performs upsampling. In the literature this structure is also called an encoder-decoder architecture.

    • The upsampling path fuses outputs from the feature-extraction path, which effectively merges features at multiple scales. Taking the last upsampling step as an example, its features come both from the output of the first convolution block (same-scale features) and from the upsampled output (larger-scale features); see the small sketch below.
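
    A minimal sketch of that fusion step (hypothetical tensor shapes, not from the original notes): the encoder feature saved before a pooling layer and the upsampled decoder feature share the same spatial size, so they can be concatenated along the channel dimension before the next convolution.

    import torch
    
    # encoder feature map saved before a pooling layer: (batch, channels, height, width)
    encoder_feat = torch.randn(1, 64, 256, 256)
    # decoder feature map after the corresponding upsampling step
    decoder_feat = torch.randn(1, 64, 256, 256)
    
    # multi-scale fusion: concatenate along the channel dimension
    fused = torch.cat([encoder_feat, decoder_feat], dim=1)
    print(fused.shape)  # torch.Size([1, 128, 256, 256])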

    Collecting the Paths of the Original Images and Segmentation Maps

    import glob
    
    all_pics = glob.glob(r'E:\HKdataset\HKdataset\training\*.png')  # every .png in the training folder
    images = [p for p in all_pics if 'matte' not in p]              # original photos
    annotations = [p for p in all_pics if 'matte' in p]             # segmentation maps (matte files)
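
    A quick sanity check (this assumes the HK portrait dataset's naming scheme, where each original xxxxx.png has a matching xxxxx_matte.png, so the two filtered lists pair up index by index):

    # both lists should have the same length and pair up by file name
    print(len(images), len(annotations))
    print(images[:2])
    print(annotations[:2])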
    

    Building the Dataset

    import numpy as np
    import torch
    from PIL import Image
    from torch.utils import data
    from torchvision import transforms
    
    # shuffle images and annotations with the same permutation so the pairs stay aligned
    np.random.seed(2021)
    index = np.random.permutation(len(images))
    images = np.array(images)[index]
    anno = np.array(annotations)[index]
    
    all_test_pics = glob.glob(r'E:\HKdataset\HKdataset\testing\*.png')
    test_images = [p for p in all_test_pics if 'matte' not in p]
    test_anno = [p for p in all_test_pics if 'matte' in p]
    
    transform = transforms.Compose([
                        transforms.Resize((256, 256)),
                        transforms.ToTensor(),
    ])
    
    class Portrait_dataset(data.Dataset):
        def __init__(self, img_paths, anno_paths):
            self.imgs = img_paths
            self.annos = anno_paths
            
        def __getitem__(self, index):
            img = self.imgs[index]
            anno = self.annos[index]
            pil_img = Image.open(img)    
            img_tensor = transform(pil_img)
            pil_anno = Image.open(anno)    
            anno_tensor = transform(pil_anno)
            anno_tensor = torch.squeeze(anno_tensor).type(torch.long) # drop the singleton channel dimension
            anno_tensor[anno_tensor > 0] = 1 # binarize the mask so every pixel is either 0 or 1
            return img_tensor, anno_tensor
        
        def __len__(self):
            return len(self.imgs)
    
    train_dataset = Portrait_dataset(images, anno)
    test_dataset = Portrait_dataset(test_images, test_anno)
    train_dl = data.DataLoader(train_dataset, batch_size=4, shuffle=True)
    test_dl = data.DataLoader(test_dataset, batch_size=4)
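
    To confirm the dataset returns what the model expects, one can pull a single batch (a small check, not in the original notes); with the 256*256 resize and batch_size=4 the shapes should be:

    imgs_batch, annos_batch = next(iter(train_dl))
    print(imgs_batch.shape)   # torch.Size([4, 3, 256, 256]) -- float image tensors
    print(annos_batch.shape)  # torch.Size([4, 256, 256])    -- long tensors with values 0 and 1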
    

    Defining the Model

    Downsampling Block

    A downsampling block consists of one pooling layer plus two convolution layers.

    The first convolution changes the channel count from in_channels to out_channels, and the second keeps it at out_channels.

    from torch import nn
    
    class Downsample(nn.Module):
        def __init__(self, in_channels, out_channels):
            super(Downsample, self).__init__()
            self.conv_relu = nn.Sequential(
                                nn.Conv2d(in_channels, out_channels, 
                                          kernel_size=3, padding=1),
                                nn.ReLU(inplace=True),
                                nn.Conv2d(out_channels, out_channels, 
                                          kernel_size=3, padding=1),
                                nn.ReLU(inplace=True)
                )
            self.pool = nn.MaxPool2d(kernel_size=2)
        def forward(self, x, is_pool=True):
            if is_pool:
                x = self.pool(x)
            x = self.conv_relu(x)
            return x
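
    A quick shape check of this block (hypothetical input sizes, just to illustrate its behavior): with is_pool=False the spatial size is preserved, while with pooling it is halved before the convolutions.

    block = Downsample(3, 64)
    x = torch.randn(1, 3, 256, 256)
    print(block(x, is_pool=False).shape)  # torch.Size([1, 64, 256, 256])
    print(block(x, is_pool=True).shape)   # torch.Size([1, 64, 128, 128])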
    

    Upsampling Block

    An upsampling block consists of two convolution layers plus one upsampling layer; the upsampling is implemented with a transposed convolution.

    The first convolution reduces the channel count from 2 * channels to channels, and the second keeps it at channels.

    The transposed convolution then halves the channel count again.

    class Upsample(nn.Module):
        def __init__(self, channels):
            super(Upsample, self).__init__()
            self.conv_relu = nn.Sequential(
                                nn.Conv2d(2*channels, channels, 
                                          kernel_size=3, padding=1),
                                nn.ReLU(inplace=True),
                                nn.Conv2d(channels, channels,  
                                          kernel_size=3, padding=1),
                                nn.ReLU(inplace=True)
                )
            self.upconv_relu = nn.Sequential(
                                   nn.ConvTranspose2d(channels, 
                                                      channels//2, 
                                                      kernel_size=3,
                                                      stride=2,
                                                      padding=1,
                                                      output_padding=1),
                                   nn.ReLU(inplace=True)
                )
            
        def forward(self, x):
            x = self.conv_relu(x)
            x = self.upconv_relu(x)
            return x
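
    A similar check for the upsampling block (hypothetical shapes): Upsample(512) expects a concatenated 1024-channel input and returns a 256-channel feature map with doubled spatial size.

    up_block = Upsample(512)
    x = torch.randn(1, 1024, 32, 32)
    print(up_block(x).shape)  # torch.Size([1, 256, 64, 64])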
    

    The Model

    The model is composed of 5 downsampling blocks, 1 standalone upsampling layer, 3 upsampling blocks, 2 convolution layers (3*3), and a final 1*1 convolution that produces the output.

    class Net(nn.Module):
        def __init__(self):
            super(Net, self).__init__()
            self.down1 = Downsample(3, 64)
            self.down2 = Downsample(64, 128)
            self.down3 = Downsample(128, 256)
            self.down4 = Downsample(256, 512)
            self.down5 = Downsample(512, 1024)
            
            self.up = nn.Sequential(
                                   nn.ConvTranspose2d(1024, 
                                                      512, 
                                                      kernel_size=3,
                                                      stride=2,
                                                      padding=1,
                                                      output_padding=1),
                                   nn.ReLU(inplace=True)
                )
            
            self.up1 = Upsample(512)
            self.up2 = Upsample(256)
            self.up3 = Upsample(128)
            
            self.conv_2 = Downsample(128, 64)
            self.last = nn.Conv2d(64, 2, kernel_size=1)
    
        def forward(self, x):
            x1 = self.down1(x, is_pool=False)
            x2 = self.down2(x1)
            x3 = self.down3(x2)
            x4 = self.down4(x3)
            x5 = self.down5(x4)
            
            x5 = self.up(x5)
            
            x5 = torch.cat([x4, x5], dim=1)           # 32*32*1024
            x5 = self.up1(x5)                         # 64*64*256
            x5 = torch.cat([x3, x5], dim=1)           # 64*64*512  
            x5 = self.up2(x5)                         # 128*128*128
            x5 = torch.cat([x2, x5], dim=1)           # 128*128*256
            x5 = self.up3(x5)                         # 256*256*64
            x5 = torch.cat([x1, x5], dim=1)           # 256*256*128
            
            x5 = self.conv_2(x5, is_pool=False)       # 256*256*64
            
            x5 = self.last(x5)                        # 256*256*2
            return x5
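
    A final end-to-end sanity check (a minimal sketch, not from the original post): the network maps a 256*256 RGB batch to a 2-channel score map of the same spatial size, which can then be trained against the long-typed 0/1 masks with nn.CrossEntropyLoss.

    model = Net()
    dummy = torch.randn(2, 3, 256, 256)
    out = model(dummy)
    print(out.shape)  # torch.Size([2, 2, 256, 256])
    
    # the 2-channel score map pairs with the 0/1 masks through cross entropy, e.g.:
    # loss_fn = nn.CrossEntropyLoss()
    # loss = loss_fn(out, annos_batch)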
    