======================================================
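The exact form of the original preprocessing code is discussed in https://github.com/pytorch/vision/issues/3657.
The issue gives the following concrete preprocessing code as a guess at the original method: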
import torch
from torchvision import datasets, transforms as T

transform = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
dataset = datasets.ImageNet(".", split="train", transform=transform)

means = []
variances = []
# `subset` is not defined here; it stands for a helper that yields
# image tensors from some subset of the dataset.
for img in subset(dataset):
    means.append(torch.mean(img))
    variances.append(torch.std(img) ** 2)

# Average the per-image statistics; the std is the root of the mean variance.
mean = torch.mean(torch.stack(means), axis=0)
std = torch.sqrt(torch.mean(torch.stack(variances), axis=0))
Reply:
According to the reply, the original computation used this form; parts of it are explained in more detail in
https://github.com/pytorch/vision/pull/1965:
Key point:
We know that they were calculated them on a random subset of the train split of the ImageNet2012 dataset. Which images were used or even the sample size as well as the used transformation are unfortunately lost.
The author also offers a guess at why the reproduced results differ from the original values:
In #1439 my calculated stds differed significantly from the values we used. This resulted from the fact that I previously used sqrt(mean([var(img) for img in dataset])) while we probably used mean([std(img) for img in dataset]). You can find the script I've used for all calculations here.
The code the author used in the previous reproduction attempt:
sqrt(mean([var(img) for img in dataset]))
whereas the original values were probably computed with:
mean([std(img) for img in dataset])
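The two aggregations above give different numbers whenever the per-image standard deviations are not all equal: the root of the mean variance is a root-mean-square of the stds, which is never smaller than their arithmetic mean. A minimal sketch with made-up per-image values (the list `per_image_stds` is hypothetical, not taken from ImageNet):

```python
import math
import statistics

# Hypothetical per-image standard deviations (illustrative values only).
per_image_stds = [0.10, 0.20, 0.30, 0.40]
per_image_vars = [s ** 2 for s in per_image_stds]

# Previous reproduction: sqrt(mean([var(img) for img in dataset]))
sqrt_mean_var = math.sqrt(statistics.fmean(per_image_vars))

# Probable original: mean([std(img) for img in dataset])
mean_std = statistics.fmean(per_image_stds)

print(f"sqrt(mean(var)) = {sqrt_mean_var:.4f}")
print(f"mean(std)       = {mean_std:.4f}")
```

This only illustrates why the two formulas can disagree noticeably; it does not show which one produced the published values.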
The author then provides a new script for the computation:
https://gist.github.com/pmeier/f5e05285cd5987027a98854a5d155e27
============================================================