PyTorch DataLoader extremely slow on the first epoch
While checking how fast a PyTorch DataLoader loads data, I found that the first pass over the dataset is very slow.
For example:
import time
from torch.utils.data import DataLoader

# data_set is the Dataset under test (defined elsewhere)
data_loader = DataLoader(dataset=data_set, batch_size=64, num_workers=2, shuffle=True, pin_memory=False, drop_last=True)

TT = []  # elapsed seconds for each pass over the loader
for i in range(3):
    S = time.time()
    for index, (input, target) in enumerate(data_loader):
        print(index)
    E = time.time()
    T = E - S
    TT.append(T)
print(TT)  # [75.70432996749878, 5.695326089859009, 5.47631311416626]
The first pass took about 75 s; each later pass took about 5 s.
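To narrow down where the extra time goes, it may help to record the time until the first batch arrives separately from the total time per epoch. The sketch below is only an illustration: `time_epochs` and `fake_loader` are hypothetical names, not part of PyTorch, and the helper assumes nothing more than that the loader can be iterated repeatedly, which holds for a `DataLoader`.

```python
import time


def time_epochs(loader, num_epochs=3):
    """Return one (first_batch_seconds, epoch_seconds) tuple per epoch.

    `loader` can be a DataLoader or any object that can be iterated repeatedly.
    """
    results = []
    for _ in range(num_epochs):
        start = time.perf_counter()
        first_batch = None  # stays None if the loader yields nothing
        for index, _batch in enumerate(loader):
            if index == 0:
                # Time until the first batch of this epoch is available.
                first_batch = time.perf_counter() - start
        total = time.perf_counter() - start
        results.append((first_batch, total))
    return results


if __name__ == "__main__":
    # Stand-in for a real DataLoader so the sketch runs without a dataset.
    fake_loader = [list(range(64)) for _ in range(10)]
    for epoch, (first, total) in enumerate(time_epochs(fake_loader)):
        print(f"epoch {epoch}: first batch {first:.4f}s, total {total:.4f}s")
```

With the real loader, `time_epochs(data_loader)` should show whether the first epoch is slow only up to its first batch or slow across all of its batches.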
PyTorch's DataLoader is too slow when processing a large dataset.