• A Collection of Common PyTorch Code Snippets



    Checking basic PyTorch information

    Required packages

    import collections
    import os
    import shutil
    import tqdm
    import numpy as np
    import PIL.Image
    import torch
    import torchvision
    

    Check the PyTorch version

    torch.__version__               # PyTorch version
    torch.version.cuda              # Corresponding CUDA version
    torch.backends.cudnn.version()  # Corresponding cuDNN version
    torch.cuda.get_device_name(0)   # GPU type
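    The GPU query above raises an error on a machine without CUDA. A minimal sketch of a guarded variant follows; the `info` dictionary is a hypothetical container introduced only for illustration.

```python
import torch

# Collect version information without failing on CPU-only machines.
info = {
    'torch': torch.__version__,
    'cuda': torch.version.cuda,  # None for CPU-only builds
    'cudnn': (torch.backends.cudnn.version()
              if torch.backends.cudnn.is_available() else None),
    'gpu': (torch.cuda.get_device_name(0)
            if torch.cuda.is_available() else None),
}
for key, value in info.items():
    print('%-6s %s' % (key, value))
```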
    

    Enable cuDNN benchmark mode

    torch.backends.cudnn.benchmark = True
    

    Print the TensorFlow version

    import tensorflow as tf
    print(tf.__version__)
    

    Set the random seed

    import random

    def seed_torch(seed=1029):
        random.seed(seed)
        # Disables hash randomization for reproducibility. Python reads this
        # variable at startup, so it only affects processes launched afterwards.
        os.environ['PYTHONHASHSEED'] = str(seed)
        np.random.seed(seed)
        torch.manual_seed(seed)
        torch.cuda.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # If you are using multiple GPUs.
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True
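    One way to check that seeding works is to seed, draw, seed again, and compare the draws. The sketch below runs on CPU; `seed_everything` and `draw` are hypothetical helper names, and the cuDNN flags are left out because they only matter on GPU.

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed=1029):
    """Hypothetical CPU-safe counterpart of seed_torch."""
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # Silently ignored without CUDA.


def draw():
    """Draw one sample from each random number generator."""
    return random.random(), np.random.rand(), torch.rand(1).item()


seed_everything(1029)
first = draw()
seed_everything(1029)
second = draw()
print(first == second)  # Identical draws after re-seeding
```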
    

    Suppress TensorFlow log messages

    import os
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # Set this before importing tensorflow
    

    Ignore warnings

    import warnings
    warnings.filterwarnings("ignore")
    

    Tensor operations

    Basic tensor information

    tensor.type()   # Data type
    tensor.size()   # Shape of the tensor. It is a subclass of Python tuple
    tensor.dim()    # Number of dimensions.
    

    Saving images with PyTorch

    from torchvision.utils import save_image
    save_image(tensor, filename, padding=0)
    

    Saving images as an animated GIF

    import os
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    import matplotlib.animation as animation
    
    dir_ = './out'
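    file_list = sorted(os.listdir(dir_))  # Sort so frames appear in a stable order
    img_list = []
    for file_ in file_list:
        file_path = os.path.join(dir_, file_)
        img_ = cv2.imread(file_path)
        # cv2 loads images as BGR, while matplotlib expects RGB.
        img_list.append(cv2.cvtColor(img_, cv2.COLOR_BGR2RGB))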
    
    #%%capture
    fig = plt.figure(figsize=(8,8))
    plt.axis("off")
    ims = [[plt.imshow(i, animated=True)] for i in img_list]
    ani = animation.ArtistAnimation(fig, ims, interval=500, repeat_delay=1000, blit=True)
    # animation.ArtistAnimation:  #https://matplotlib.org/3.3.1/api/_as_gen/matplotlib.animation.ArtistAnimation.html
    ani.save("test.gif",writer='pillow')
    

    Converting between torch.Tensor and np.ndarray

    # torch.Tensor -> np.ndarray.
    ndarray = tensor.cpu().numpy()
    
    # np.ndarray -> torch.Tensor.
    tensor = torch.from_numpy(ndarray).float()
    tensor = torch.from_numpy(ndarray.copy()).float()  # If ndarray has negative stride
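    Two details of these conversions are easy to miss: `torch.from_numpy` shares memory with the source array, and an array view with a negative stride needs a copy first. The following is a small sketch with made-up data; the variable names are illustrative only.

```python
import numpy as np
import torch

ndarray = np.arange(6, dtype=np.float32)

# from_numpy shares memory: writing through the tensor changes the array.
shared = torch.from_numpy(ndarray)
shared[0] = 100.0
print(ndarray[0])

# A reversed view has a negative stride, which from_numpy is expected to reject.
reversed_view = ndarray[::-1]
try:
    torch.from_numpy(reversed_view)
    negative_stride_ok = True
except Exception:
    negative_stride_ok = False
print(negative_stride_ok)

# Copying first yields a contiguous array that converts cleanly.
tensor = torch.from_numpy(reversed_view.copy())
print(tensor)
```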
    

    Converting between torch.Tensor and PIL.Image

    # torch.Tensor -> PIL.Image.
    # PIL images use H×W×D order, whereas PyTorch tensors use N×D×H×W order with
    # values in [0, 1], so a transpose and a rescaling are needed.
    image = torchvision.transforms.functional.to_pil_image(tensor)  # Handles both steps
    
    # PIL.Image -> torch.Tensor.
    tensor = torchvision.transforms.functional.to_tensor(PIL.Image.open(path)) 
    

    Converting between np.ndarray and PIL.Image

    # np.ndarray -> PIL.Image.
    image = PIL.Image.fromarray(ndarray.astype('uint8')).convert('RGB')
    
    # PIL.Image -> np.ndarray.
    ndarray = np.asarray(PIL.Image.open(path))
    

    Extracting the value from a single-element tensor

    value = tensor.item()
    

    Copying tensors

    # Operation                 |  New/Shared memory | Still in computation graph |
    tensor.clone()            # |        New         |          Yes               |
    tensor.detach()           # |      Shared        |          No                |
    tensor.detach().clone()   # |        New         |          No                |
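    The table can be checked directly by comparing data pointers and `requires_grad` flags. A minimal sketch with an illustrative three-element tensor:

```python
import torch

tensor = torch.ones(3, requires_grad=True)

cloned = tensor.clone()
detached = tensor.detach()
detached_clone = tensor.detach().clone()

# Shared memory shows up as an identical underlying data pointer.
shares_memory = {
    'clone': cloned.data_ptr() == tensor.data_ptr(),
    'detach': detached.data_ptr() == tensor.data_ptr(),
    'detach_clone': detached_clone.data_ptr() == tensor.data_ptr(),
}
print(shares_memory)

# Only clone() stays attached to the computation graph.
print(cloned.requires_grad, detached.requires_grad, detached_clone.requires_grad)
```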
    

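    Concatenating tensors

    Note the difference between torch.cat and torch.stack: torch.cat concatenates along an existing dimension, whereas torch.stack adds a new dimension. For example, given three 10×5 tensors, torch.cat along dim=0 yields a 30×5 tensor, while torch.stack yields a 3×10×5 tensor.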

    tensor = torch.cat(list_of_tensors, dim=0)
    tensor = torch.stack(list_of_tensors, dim=0)
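    The shapes from the example in the text can be confirmed with three zero tensors (the tensor contents are arbitrary here):

```python
import torch

list_of_tensors = [torch.zeros(10, 5) for _ in range(3)]

concatenated = torch.cat(list_of_tensors, dim=0)   # Joins along an existing dim
stacked = torch.stack(list_of_tensors, dim=0)      # Inserts a new leading dim

print(concatenated.shape)  # torch.Size([30, 5])
print(stacked.shape)       # torch.Size([3, 10, 5])
```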
    

    Model definition

    Counting total model parameters

    model = Net()  # Net is your own nn.Module subclass
    # Trainable parameters only.
    pytorch_total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print('Total Params: %d' % pytorch_total_params)

    # All parameters, including frozen ones.
    num_parameters = sum(torch.numel(parameter) for parameter in model.parameters())
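    The two counts differ only when some parameters are frozen. A small sketch, assuming a toy two-layer model whose first layer is frozen purely for illustration:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))

# Freeze the first layer so that the two counts diverge.
for parameter in model[0].parameters():
    parameter.requires_grad = False

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(total)      # 67: (10*5 + 5) + (5*2 + 2)
print(trainable)  # 12: only the second Linear layer
```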
    

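    A custom Flatten module that subclasses nn.Module

    import torch.nn as nn

    class Flatten(nn.Module):
        def __init__(self):
            super(Flatten, self).__init__()

        def forward(self, input):
            # Keep the batch dimension and flatten everything else.
            return input.view(input.size(0), -1)

    net = nn.Sequential(
        # nn.Conv2d requires kernel_size; 3 is an assumed value.
        nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1),
        nn.MaxPool2d(2, 2),
        Flatten(),  # The custom nn.Module subclass defined above
        nn.Linear(xxx, xx))  # xxx, xx: placeholders for in/out feature counts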
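    To see how the placeholder sizes would be filled in, here is a sketch under assumed values: 28×28 single-channel inputs, a 3×3 convolution, and 10 output classes. None of these numbers come from the original. Recent PyTorch versions also ship a built-in `nn.Flatten` that does the same job.

```python
import torch
import torch.nn as nn


class Flatten(nn.Module):
    def forward(self, input):
        # Keep the batch dimension and flatten everything else.
        return input.view(input.size(0), -1)


net = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, stride=1, padding=1),  # -> (N, 16, 28, 28)
    nn.MaxPool2d(2, 2),                                     # -> (N, 16, 14, 14)
    Flatten(),                                              # -> (N, 3136)
    nn.Linear(16 * 14 * 14, 10),                            # -> (N, 10)
)

output = net(torch.zeros(2, 1, 28, 28))
print(output.shape)  # torch.Size([2, 10])
```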
    

    Model weight initialization

    # Common practice for initialization.
    for layer in model.modules():
        if isinstance(layer, torch.nn.Conv2d):
            torch.nn.init.kaiming_normal_(layer.weight, mode='fan_out',
                                          nonlinearity='relu')
            if layer.bias is not None:
                torch.nn.init.constant_(layer.bias, val=0.0)
        elif isinstance(layer, torch.nn.BatchNorm2d):
            torch.nn.init.constant_(layer.weight, val=1.0)
            torch.nn.init.constant_(layer.bias, val=0.0)
        elif isinstance(layer, torch.nn.Linear):
            torch.nn.init.xavier_normal_(layer.weight)
            if layer.bias is not None:
                torch.nn.init.constant_(layer.bias, val=0.0)
    
    # Initialization with given tensor.
    layer.weight = torch.nn.Parameter(tensor)
    

    Model training

    Common preprocessing for training and validation data

    train_transform = torchvision.transforms.Compose([
        torchvision.transforms.RandomResizedCrop(size=224,
                                                 scale=(0.08, 1.0)),
        torchvision.transforms.RandomHorizontalFlip(),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                         std=(0.229, 0.224, 0.225)),
    ])
    val_transform = torchvision.transforms.Compose([
        torchvision.transforms.Resize(224),
        torchvision.transforms.CenterCrop(224),
        torchvision.transforms.ToTensor(),
        torchvision.transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                         std=(0.229, 0.224, 0.225)),
    ])
    

    Basic training loop

    for t in range(80):
        for images, labels in tqdm.tqdm(train_loader, desc='Epoch %3d' % (t + 1)):
            images, labels = images.cuda(), labels.cuda()
            scores = model(images)
            loss = loss_function(scores, labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
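    The skeleton above assumes that `model`, `train_loader`, `loss_function`, and `optimizer` already exist. The following CPU-only sketch fills them in with synthetic data and a linear classifier; all of these choices, including 20 epochs and a learning rate of 0.1, are assumptions made for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic, linearly separable data (purely illustrative).
features = torch.randn(64, 4)
targets = (features[:, 0] > 0).long()
train_loader = DataLoader(TensorDataset(features, targets),
                          batch_size=16, shuffle=True)

model = torch.nn.Linear(4, 2)
loss_function = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

epoch_losses = []
for t in range(20):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        scores = model(images)
        loss = loss_function(scores, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)
    # Average loss over all samples in this epoch.
    epoch_losses.append(running_loss / len(train_loader.dataset))

print('first epoch %.3f, last epoch %.3f' % (epoch_losses[0], epoch_losses[-1]))
```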
    

    Label smoothing

    for images, labels in train_loader:
        images, labels = images.cuda(), labels.cuda()
        N = labels.size(0)
        # C is the number of classes.
        smoothed_labels = torch.full(size=(N, C), fill_value=0.1 / (C - 1)).cuda()
        smoothed_labels.scatter_(dim=1, index=torch.unsqueeze(labels, dim=1), value=0.9)
    
        score = model(images)
        log_prob = torch.nn.functional.log_softmax(score, dim=1)
        loss = -torch.sum(log_prob * smoothed_labels) / N
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
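    The smoothed target construction can be inspected on its own. In this sketch N, C, and the labels are made-up values; each row should put 0.9 on the true class and spread the remaining 0.1 over the other C - 1 classes.

```python
import torch

N, C = 4, 5  # Batch size and number of classes (illustrative values)
labels = torch.tensor([0, 2, 4, 1])

smoothed_labels = torch.full(size=(N, C), fill_value=0.1 / (C - 1))
smoothed_labels.scatter_(dim=1, index=torch.unsqueeze(labels, dim=1), value=0.9)

row_sums = smoothed_labels.sum(dim=1)
print(smoothed_labels)
print(row_sums)  # Every row sums to 1 up to floating-point error
```

    Note that the `label_smoothing` argument of `torch.nn.CrossEntropyLoss` in recent PyTorch versions uses a slightly different convention, spreading the smoothing mass over all C classes including the true one.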
    

    Mixup

    # alpha is the Mixup hyperparameter and must be defined beforehand.
    beta_distribution = torch.distributions.beta.Beta(alpha, alpha)
    for images, labels in train_loader:
        images, labels = images.cuda(), labels.cuda()
    
        # Mixup images.
        lambda_ = beta_distribution.sample([]).item()
        index = torch.randperm(images.size(0)).cuda()
        mixed_images = lambda_ * images + (1 - lambda_) * images[index, :]
    
        # Mixup loss.    
        scores = model(mixed_images)
        loss = (lambda_ * loss_function(scores, labels) 
                + (1 - lambda_) * loss_function(scores, labels[index]))
    
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
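    A CPU-only sketch of the mixing step on random data follows. The value alpha = 0.2, the tensor sizes, and the random `scores` standing in for `model(mixed_images)` are all assumptions made for illustration.

```python
import torch

torch.manual_seed(0)

alpha = 0.2  # Assumed hyperparameter value
beta_distribution = torch.distributions.beta.Beta(alpha, alpha)

images = torch.rand(8, 3, 4, 4)          # Fake image batch with values in [0, 1)
labels = torch.randint(0, 10, (8,))      # Fake labels for 10 classes

# Mixup images: a convex combination of the batch and a shuffled copy.
lambda_ = beta_distribution.sample([]).item()
index = torch.randperm(images.size(0))
mixed_images = lambda_ * images + (1 - lambda_) * images[index, :]

# Mixup loss: the same convex combination applied to the two losses.
scores = torch.randn(8, 10)              # Stand-in for model(mixed_images)
loss_function = torch.nn.CrossEntropyLoss()
loss = (lambda_ * loss_function(scores, labels)
        + (1 - lambda_) * loss_function(scores, labels[index]))
print(lambda_, loss.item())
```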
    

    Getting the current learning rate

    # If there is one global learning rate (which is the common case).
    lr = next(iter(optimizer.param_groups))['lr']
    
    # If there are multiple learning rates for different layers.
    all_lr = []
    for param_group in optimizer.param_groups:
        all_lr.append(param_group['lr'])
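    Multiple learning rates arise when the optimizer is built from several parameter groups. A minimal sketch, assuming a toy two-layer model where only the first layer overrides the default rate:

```python
import torch

model = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.Linear(8, 2))

optimizer = torch.optim.SGD([
    {'params': model[0].parameters(), 'lr': 0.01},  # Per-group override
    {'params': model[1].parameters()},              # Falls back to the default lr
], lr=0.1)

all_lr = [param_group['lr'] for param_group in optimizer.param_groups]
print(all_lr)  # [0.01, 0.1]
```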
    

    Saving and loading checkpoints

    # Save checkpoint.
    is_best = current_acc > best_acc
    best_acc = max(best_acc, current_acc)
    checkpoint = {
        'best_acc': best_acc,    
        'epoch': t + 1,
        'model': model.state_dict(),
        'optimizer': optimizer.state_dict(),
    }
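    model_path = os.path.join('model', 'checkpoint.pth.tar')
    torch.save(checkpoint, model_path)
    if is_best:
        # Keep a separate copy of the best checkpoint (the file name is an assumption).
        shutil.copy(model_path, os.path.join('model', 'best_checkpoint.pth.tar'))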
    
    # Load checkpoint.
    if resume:
        model_path = os.path.join('model', 'checkpoint.pth.tar')
        assert os.path.isfile(model_path)
        checkpoint = torch.load(model_path)
        best_acc = checkpoint['best_acc']
        start_epoch = checkpoint['epoch']
        model.load_state_dict(checkpoint['model'])
        optimizer.load_state_dict(checkpoint['optimizer'])
        print('Load checkpoint at epoch %d.' % start_epoch)
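    A save-and-restore round trip can be tried without touching the working directory. This sketch uses a temporary directory and a toy linear model; the metric values are invented, and `map_location='cpu'` is added so the file also loads on a machine without a GPU.

```python
import os
import tempfile

import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

checkpoint = {
    'best_acc': 0.5,   # Invented value for illustration
    'epoch': 3,
    'model': model.state_dict(),
    'optimizer': optimizer.state_dict(),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, 'checkpoint.pth.tar')
    torch.save(checkpoint, model_path)
    restored = torch.load(model_path, map_location='cpu')

# Restore into freshly constructed objects.
new_model = torch.nn.Linear(4, 2)
new_optimizer = torch.optim.SGD(new_model.parameters(), lr=0.1)
new_model.load_state_dict(restored['model'])
new_optimizer.load_state_dict(restored['optimizer'])

weights_match = torch.equal(new_model.weight, model.weight)
print(weights_match, restored['epoch'])
```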
    

    Computing accuracy, precision, and recall

    # data['label'] and data['prediction'] are groundtruth label and prediction 
    # for each image, respectively.
    accuracy = np.mean(data['label'] == data['prediction']) * 100
    
    # Compute precision and recall for each class.
    for c in range(num_classes):
        tp = np.dot((data['label'] == c).astype(int),
                    (data['prediction'] == c).astype(int))
        tp_fp = np.sum(data['prediction'] == c)
        tp_fn = np.sum(data['label'] == c)
        precision = tp / tp_fp * 100
        recall = tp / tp_fn * 100
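    The same computation on a tiny hand-made example with three classes. The labels and predictions are invented, and the division guards are an addition for classes that never occur.

```python
import numpy as np

num_classes = 3
data = {
    'label':      np.array([0, 0, 1, 1, 2, 2]),
    'prediction': np.array([0, 1, 1, 1, 2, 0]),
}

accuracy = np.mean(data['label'] == data['prediction']) * 100

precision, recall = [], []
for c in range(num_classes):
    tp = np.sum((data['label'] == c) & (data['prediction'] == c))
    tp_fp = np.sum(data['prediction'] == c)   # Everything predicted as c
    tp_fn = np.sum(data['label'] == c)        # Everything truly of class c
    # Guard against classes that are never predicted or never present.
    precision.append(tp / tp_fp * 100 if tp_fp > 0 else 0.0)
    recall.append(tp / tp_fn * 100 if tp_fn > 0 else 0.0)

print(accuracy)   # 4 of 6 correct
print(precision)  # Per-class precision in percent
print(recall)     # Per-class recall in percent
```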
    

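    Other notes

    Model definition

    • It is recommended to define layers with parameters and pooling layers using torch.nn modules, and to call activation functions directly from torch.nn.functional. The difference is that a torch.nn module calls torch.nn.functional under the hood, but it also holds the layer's parameters and handles the two network states, training and evaluation. When using torch.nn.functional, pay attention to the network state, for example:

      def forward(self, x):
          ...
          x = torch.nn.functional.dropout(x, p=0.5, training=self.training)


    • Switch the network state with model.train() or model.eval() before calling model(x).

    • Wrap code that does not need gradients in with torch.no_grad(). The difference between model.eval() and torch.no_grad() is this: model.eval() switches the network to evaluation mode, in which layers such as BN and dropout compute differently than during training. torch.no_grad() turns off PyTorch's autograd tracking, which reduces memory use and speeds up computation, but loss.backward() cannot be called on the results.

    • The input to torch.nn.CrossEntropyLoss should not pass through Softmax. torch.nn.CrossEntropyLoss is equivalent to torch.nn.functional.log_softmax followed by torch.nn.NLLLoss.

    • Call optimizer.zero_grad() before loss.backward() to clear accumulated gradients. optimizer.zero_grad() and model.zero_grad() have the same effect when the optimizer holds all of the model's parameters.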
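    Three of the points above can be verified in a few lines: the CrossEntropyLoss equivalence, dropout acting as the identity in evaluation mode, and torch.no_grad() cutting results off from autograd. The tensors below are random stand-ins, not real model outputs.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
scores = torch.randn(4, 3, requires_grad=True)  # Raw scores, no Softmax applied
labels = torch.tensor([0, 2, 1, 2])

# CrossEntropyLoss equals log_softmax followed by NLLLoss.
loss_ce = torch.nn.CrossEntropyLoss()(scores, labels)
loss_nll = torch.nn.NLLLoss()(F.log_softmax(scores, dim=1), labels)
losses_match = torch.allclose(loss_ce, loss_nll)

# In eval mode, dropout passes its input through unchanged.
dropout = torch.nn.Dropout(p=0.5)
dropout.eval()
x = torch.ones(1000)
eval_is_identity = torch.equal(dropout(x), x)

# eval() alone does not disable autograd; no_grad() does.
tracked = (scores * 2).requires_grad
with torch.no_grad():
    untracked = (scores * 2).requires_grad

print(losses_match, eval_is_identity, tracked, untracked)
```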

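    PyTorch performance and debugging

    • Set pin_memory=True in torch.utils.data.DataLoader where possible; for very small datasets such as MNIST, pin_memory=False can actually be slightly faster. The best num_workers value has to be found experimentally.
    • Use del to free intermediate variables you no longer need and save GPU memory.
    • In-place operations can save GPU memory.
    • Reduce data transfer between CPU and GPU. For example, if you want the loss and accuracy of every mini-batch in an epoch, accumulating them on the GPU and transferring them to the CPU once at the end of the epoch is faster than one GPU-to-CPU transfer per mini-batch.
    • Half-precision floats via half() give some speedup, with the actual gain depending on the GPU model. Watch for stability problems caused by the reduced numerical precision.
    • Use assert tensor.size() == (N, D, H, W) often as a debugging aid to make sure tensor dimensions match what you expect.
    • Apart from the labels y, avoid one-dimensional tensors where possible and use n×1 two-dimensional tensors instead, which avoids some surprising results from one-dimensional tensor computations.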
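    The last two tips can be illustrated together. In this sketch, subtracting a one-dimensional tensor from an n×1 tensor silently broadcasts to n×n, and a shape assertion catches the mistake; the sizes are arbitrary.

```python
import torch

N, D, H, W = 2, 3, 4, 4
tensor = torch.zeros(N, D, H, W)
assert tensor.size() == (N, D, H, W)  # torch.Size compares equal to a tuple

n = 4
y = torch.arange(n, dtype=torch.float32)   # Shape (n,)
pred = torch.zeros(n, 1)                   # Shape (n, 1)

# Broadcasting (n, 1) against (n,) gives (n, n), which is rarely intended.
diff_wrong = pred - y
# Reshaping y to (n, 1) keeps the element-wise meaning.
diff_right = pred - y.unsqueeze(1)

print(diff_wrong.shape)  # torch.Size([4, 4])
print(diff_right.shape)  # torch.Size([4, 1])
```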
  • Original article: https://www.cnblogs.com/lwp-nicol/p/15428567.html