Richard Zhang 2018

Abstract

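The widely used PSNR and SSIM metrics fail to account for human perception. Recently, however, the deep learning community has found that features of the VGG network trained on ImageNet classification have been remarkably useful as a training loss for image synthesis.
(VGG is a convolutional neural network for image recognition proposed by the Visual Geometry Group at the University of Oxford; VGG16 is the variant with 16 weight layers and VGG19 the variant with 19 weight layers.)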

We find that deep features outperform all previous metrics by large margins on our dataset.

More surprisingly, this result is not restricted to ImageNet-trained VGG features, but holds across different deep architectures and levels of supervision (supervised, self-supervised, or even unsupervised).

Our results suggest that perceptual similarity is an emergent property (one that arises naturally rather than being designed in) shared across deep visual representations.

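Timeline:

The DL community found that deep features of a VGG network trained on ImageNet are highly useful as a training loss for image synthesis.

$⟹$ Inspired by this, the authors find that on their dataset deep features outperform all previous metrics by a large margin.

$⟹$ They further find that this result holds across different deep architectures and levels of supervision.

$⟹$ The study concludes that perceptual similarity is an emergent property shared across deep visual representations.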

1. Motivation

The very notion of visual similarity is often subjective, aiming to mimic human visual perception.

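Traditional algorithms do not handle this task well. Many algorithms have been proposed to measure, to some degree, how similar two images are, such as SSIM, MSSIM, FSIM, and HDR-VDP.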

Building on findings from the CV community, we find that internal activations of networks trained for high-level classification tasks, even across network architectures and with no further calibration, do indeed correspond to human perceptual judgments.

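The study shows that deep neural networks have an inherent advantage over simple hand-fitted algorithms on this task, but with randomly initialized weights their performance drops sharply.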
- Our contributions are as follows
1. We introduce a large-scale, highly varied, perceptual similarity dataset, containing 484k human judgments.
2. We show that deep features, trained on supervised, self-supervised, and unsupervised objectives alike, model low-level perceptual similarity surprisingly well, outperforming previous, widely-used metrics.
3. Only trained networks achieve good performance; untrained networks do much worse.
4. With our data, we can improve performance by “calibrating” feature responses from a pre-trained network.
- Prior work on datasets
  
  Our study uses a large set of distortions and real algorithm outputs. It contains both traditional distortions and CNN-based algorithm outputs. We also collect judgments on outputs from real algorithms for the tasks of superresolution, frame interpolation, and image deblurring, which is especially important as these are the real-world use cases for a perceptual metric.
  
  Our dataset is focused on perceptual similarity, rather than quality assessment.
  
  Additionally, it is collected on patches as opposed to full images, in the wild, with a different experimental design.
- Prior work on deep networks and human judgments
  
  Covers related work by other researchers.
2. Berkeley-Adobe Perceptual Patch Similarity (BAPPS) Dataset

2.1. Distortions
- Traditional distortions
- CNN-based distortions
- Distorted image patches from real algorithms
  - Superresolution
  - Frame interpolation
  - Video deblurring
  - Colorization
2.2. Psychophysical Similarity Measurements
- 2AFC similarity judgments
- Just noticeable differences (JND)
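
In a 2AFC (two-alternative forced choice) trial, a person sees a reference patch and two distorted versions, x0 and x1, and picks the one that looks closer to the reference. A metric can then be scored by how often it agrees with the human votes. The sketch below is one plausible way to compute such an agreement score, not the authors' evaluation code: the function name `two_afc_score`, its arguments, and the half-credit rule for ties are assumptions made for illustration, and the numbers are made up.

```python
def two_afc_score(d0, d1, human_pref_x1):
    """Average agreement between a metric and human 2AFC votes.

    d0, d1:         metric distances from the reference to x0 and to x1
    human_pref_x1:  fraction of human raters who judged x1 closer (0..1)
    """
    credits = []
    for dist0, dist1, p in zip(d0, d1, human_pref_x1):
        if dist1 < dist0:
            credits.append(p)        # metric picks x1: credit = share of humans who agree
        elif dist0 < dist1:
            credits.append(1.0 - p)  # metric picks x0
        else:
            credits.append(0.5)      # tie: half credit (an assumption of this sketch)
    return sum(credits) / len(credits)


# Three illustrative triplets (made-up values)
d0 = [0.2, 0.5, 0.3]
d1 = [0.4, 0.1, 0.3]
human_pref_x1 = [0.2, 0.8, 0.6]
score = two_afc_score(d0, d1, human_pref_x1)
```

Because human raters do not always agree with each other, even a perfect metric scores below 1.0 under this scheme whenever the votes on a triplet are split.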
3. Deep Feature Spaces
- Network architectures
  
  We evaluate the SqueezeNet, AlexNet, and VGG architectures. SqueezeNet was designed to be extremely lightweight in size, with similar classification performance to AlexNet. For it, we use the first Conv layer and subsequent “fire” modules.
  
  We additionally evaluate self-supervised methods.
- Network activations to distance
  
  ? ? ?
- Training on our data
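
The "Network activations to distance" step is left open in these notes. As best recalled from the paper, the idea is roughly: take activations from several layers for both patches, unit-normalize them along the channel dimension, scale each channel by a weight, take the squared L2 difference, average over spatial positions, and sum over layers. The NumPy sketch below follows that recipe under stated assumptions: the function names are hypothetical, random arrays stand in for real network activations, and all-ones weights stand in for the learned "calibration" weights that training on the data would provide.

```python
import numpy as np


def unit_normalize(feat, eps=1e-10):
    """Scale the channel vector at every spatial position to unit L2 norm.

    feat has shape (C, H, W).
    """
    norm = np.sqrt(np.sum(feat ** 2, axis=0, keepdims=True))
    return feat / (norm + eps)


def perceptual_distance(feats_x, feats_x0, weights=None):
    """Sum over layers of the spatially averaged, channel-weighted squared L2 difference.

    feats_x, feats_x0: lists of (C_l, H_l, W_l) arrays, one per layer
    weights:           optional list of (C_l,) arrays; defaults to all ones
    """
    total = 0.0
    for layer, (fx, fx0) in enumerate(zip(feats_x, feats_x0)):
        diff = unit_normalize(fx) - unit_normalize(fx0)
        w = np.ones(fx.shape[0]) if weights is None else weights[layer]
        # Squared L2 norm over channels at each spatial position
        per_position = np.sum((w[:, None, None] * diff) ** 2, axis=0)
        total += per_position.mean()  # average over H_l x W_l
    return float(total)


# Stand-in activations for two layers (random, for illustration only)
rng = np.random.default_rng(0)
feats_a = [rng.standard_normal((4, 8, 8)), rng.standard_normal((8, 4, 4))]
feats_b = [f + 0.1 * rng.standard_normal(f.shape) for f in feats_a]

d_same = perceptual_distance(feats_a, feats_a)
d_diff = perceptual_distance(feats_a, feats_b)
```

With all-ones weights, each position contributes the squared distance between two unit vectors, which equals 2 − 2·cos of the angle between them, so the uncalibrated distance behaves like a cosine distance in feature space.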
4. Experiments

4.1. Evaluations
- How well do low-level metrics and classification networks perform?
- Does the network have to be trained on classification?
- Do metrics correlate across different perceptual tasks?
- Can we train a metric on traditional and CNN-based distortions?
- Does training on traditional and CNN-based distortions transfer to real-world scenarios?
- Where do deep metrics and low-level metrics disagree?
5. Conclusions
- Acknowledgements
Original post: https://www.cnblogs.com/larkiisready/p/11681616.html
