• Paper reading notes: A Latent Transformer for Disentangled Face Editing in Images and Videos


    Paper title: A Latent Transformer for Disentangled Face Editing in Images and Videos

    I. Introduction and related work (some key sentences noted down)

    (1) Studies have shown that moving a latent code along certain directions in the latent space of a generative model changes the visual attributes of the corresponding generated image.

    (2) Firstly, successful manipulations can only be achieved in well disentangled and linearized latent spaces.

    (3) Manipulating facial attributes with linear transformations is very limiting.

    (4) The state-of-the-art image generator whose latent space real images are projected into: StyleGAN

    (5) The transformation network generates disentangled, identity-preserving and controllable attribute editing results on real images.

    (6) Related work on disentangled representations

    • One optimization-based method, Image2StyleGAN++, carried out local editing along with global semantic edits on images by applying masked interpolation on the activation features of StyleGAN (? not sure what this means)
    • Collins et al. performed a k-means clustering on the activations of StyleGAN and detected a disentanglement of semantic objects, which enables further local semantic editing on the generated image.
    • For high level semantic edits, Ganalyze [13] learned a manifold in the latent space of BigGAN [5] to generate images of different memorability.
    • InterFaceGAN [35] proposed to learn a hyper-plane for a binary classification in the latent space, which one can use to manipulate the target facial attribute by simple interpolation. Following their work, StyleSpace [42] carried out a quantitative study on the latent spaces of StyleGAN [21] and realized a highly localized and disentangled control of the visual attributes.
    • StyleFlow [3] achieved conditional exploration of the latent space by training conditional normalizing flows.
    • There are many more; see the related work section of the paper for details.
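To make the hyper-plane idea from the InterFaceGAN bullet concrete, here is a minimal NumPy sketch. It is not the InterFaceGAN code: the 8-D toy latent space, the synthetic labels, and the plain logistic-regression fit are all assumptions chosen for illustration. It only shows the general pattern of learning a linear boundary and moving a code along its normal.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for latent codes: 200 points in an 8-D "latent space".
# The binary attribute depends on a hidden direction (assumption for the demo).
dim = 8
true_dir = np.zeros(dim)
true_dir[0] = 1.0
codes = rng.normal(size=(200, dim))
labels = (codes @ true_dir > 0).astype(float)

# Fit a linear classifier (logistic regression, plain gradient descent).
weights = np.zeros(dim)
bias = 0.0
for _ in range(500):
    probs = 1.0 / (1.0 + np.exp(-(codes @ weights + bias)))
    grad = probs - labels
    weights -= 0.5 * codes.T @ grad / len(codes)
    bias -= 0.5 * grad.mean()

# The unit normal of the learned hyper-plane serves as the editing direction.
normal = weights / np.linalg.norm(weights)

def edit(code, alpha):
    """Move a latent code along the hyper-plane normal by alpha."""
    return code + alpha * normal

def score(code):
    """Classifier logit (positive means the attribute is predicted present)."""
    return float(code @ weights + bias)
```

Calling `edit(codes[0], 3.0)` raises the logit of that code and `edit(codes[0], -3.0)` lowers it. Note that the direction is the same for every code, which is the linear assumption the notes in (2) and (3) describe as a limitation.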

    II. Contributions

    We propose a latent transformation network for facial attribute editing, achieving disentangled and controllable manipulations on real images with good identity preservation. 

    Our method can carry out efficient sequential attribute editing on real images. 

    We introduce a pipeline to generalize the face editing to videos and generate realistic and stable manipulations on high resolution videos.

    III. Method

    1. We propose a framework to edit faces in real images and videos via the latent space of StyleGAN.

    2. Assume there are n attributes a in total; a separate latent transformer is trained for each attribute.

    3. To predict attributes from the latent code, a latent classifier C is used; C is pre-trained.

    Latent Classifier: To predict attributes on the manipulated latent codes, we train an attribute classifier C on the "latent code - label" pairs.

    The classifier consists of three fully connected layers with ReLU activations in between. C is fixed during the training of the latent transformer.
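A minimal sketch of such a classifier's forward pass, written in NumPy. The layer sizes, the number of attributes, the random initialization, and the sigmoid on the output are assumptions for illustration; the notes only state "three fully connected layers with ReLU activations in between".

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes are illustrative assumptions, not values from the paper.
LATENT_DIM = 512      # flattened latent code size (a W+ code would be larger)
HIDDEN_DIM = 64
NUM_ATTRIBUTES = 5

def init_layer(n_in, n_out):
    """Small random weights and a zero bias for one fully connected layer."""
    return rng.normal(scale=0.01, size=(n_in, n_out)), np.zeros(n_out)

params = [
    init_layer(LATENT_DIM, HIDDEN_DIM),
    init_layer(HIDDEN_DIM, HIDDEN_DIM),
    init_layer(HIDDEN_DIM, NUM_ATTRIBUTES),
]

def latent_classifier(codes, params):
    """Three fully connected layers with ReLU in between; returns per-attribute probabilities."""
    hidden = codes
    for i, (weight, bias) in enumerate(params):
        hidden = hidden @ weight + bias
        if i < len(params) - 1:           # no ReLU after the last layer
            hidden = np.maximum(hidden, 0.0)
    return 1.0 / (1.0 + np.exp(-hidden))  # sigmoid output: an assumption

codes = rng.normal(size=(4, LATENT_DIM))
probs = latent_classifier(codes, params)  # shape (4, NUM_ATTRIBUTES)
```

"C is fixed" would correspond here to never updating `params` while the latent transformer is being trained; the classifier is only evaluated on the edited codes.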

    The facial attribute classifier is cited from: (Harnessing synthesized abstraction images to improve facial attribute recognition)

    4. Given a latent code w ∈ W+, the latent transformer T generates the direction for a single attribute modification, where the amount of change is controlled by a scaling factor α. The network is expressed with a single layer of linear transformation.
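A minimal sketch of step 4, assuming the edit takes the form T(w, α) = w + α · f(w) with f a single linear layer. That exact form, the toy dimension, the random weights, and the attribute names are assumptions for illustration, not taken from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)

LATENT_DIM = 16   # illustrative; real W+ codes are much larger

class LatentTransformer:
    """T(w, alpha) = w + alpha * f(w), with f a single linear layer (assumed form)."""

    def __init__(self, dim):
        self.weight = rng.normal(scale=0.1, size=(dim, dim))
        self.bias = np.zeros(dim)

    def direction(self, w):
        # Single linear layer: the direction depends on the input code.
        return w @ self.weight + self.bias

    def __call__(self, w, alpha):
        # alpha scales how far the code moves along the predicted direction.
        return w + alpha * self.direction(w)

# One transformer per attribute (attribute names are hypothetical).
transformers = {name: LatentTransformer(LATENT_DIM) for name in ("smile", "age")}

w = rng.normal(size=LATENT_DIM)
w_smile = transformers["smile"](w, alpha=1.0)
# Sequential editing: feed the edited code to the next attribute's transformer.
w_smile_age = transformers["age"](w_smile, alpha=-0.5)
```

With α = 0 the code is returned unchanged, and the sign of α selects whether the attribute is strengthened or weakened. Unlike a fixed global direction, the direction here is computed from w itself.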

    5. Loss function

    IV. Evaluation metrics

    1. Quantitative

    We compare our method quantitatively with GANSpace and InterFaceGAN using three metrics:

    (1) target attribute change rate

    (2) attribute preservation rate

    (3) identity preservation score
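The notes do not give the exact definitions of these three metrics, so the sketch below uses plausible stand-ins: the change rate as the fraction of samples whose target label flips, the preservation rate as the fraction of samples whose other labels all stay the same, and the identity score as the mean cosine similarity of face embeddings. All function names and the toy data are hypothetical.

```python
import numpy as np

def target_change_rate(before, after, target):
    """Fraction of samples whose target attribute label flipped."""
    return float(np.mean(before[:, target] != after[:, target]))

def attribute_preservation_rate(before, after, target):
    """Fraction of samples whose non-target attribute labels all stayed the same."""
    others = [i for i in range(before.shape[1]) if i != target]
    unchanged = np.all(before[:, others] == after[:, others], axis=1)
    return float(np.mean(unchanged))

def identity_preservation_score(emb_before, emb_after):
    """Mean cosine similarity between face embeddings before and after editing."""
    a = emb_before / np.linalg.norm(emb_before, axis=1, keepdims=True)
    b = emb_after / np.linalg.norm(emb_after, axis=1, keepdims=True)
    return float(np.mean(np.sum(a * b, axis=1)))

# Toy data: 4 samples, 3 binary attributes; attribute 0 is the edit target.
before = np.array([[0, 1, 0], [0, 0, 1], [0, 1, 1], [0, 0, 0]])
after = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 0]])
emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]])

change = target_change_rate(before, after, target=0)             # 0.75
preserve = attribute_preservation_rate(before, after, target=0)  # 0.75
identity = identity_preservation_score(emb, emb)                 # 1.0 for identical embeddings
```

In practice the attribute labels would come from a classifier run on the images before and after editing, and the embeddings from a face recognition network; both are left out here.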

    2. Qualitative

  • Original post: https://www.cnblogs.com/h694879357/p/15528988.html