• Transformer Summary


    Contents

    Attention

    • Recurrent Models of Visual Attention [2014 deepmind NIPS]
    • Neural Machine Translation by Jointly Learning to Align and Translate [ICLR 2015]

    OverallSurvey

    • Efficient Transformers: A Survey [paper]
    • A Survey on Visual Transformer [paper]
    • Transformers in Vision: A Survey [paper]

    NLP

    Language

    • Sequence to Sequence Learning with Neural Networks [NIPS 2014] [paper] [code]
    • End-To-End Memory Networks [NIPS 2015] [paper] [code]
    • Attention is all you need [NIPS 2017] [paper] [code]
    • Bidirectional Encoder Representations from Transformers: BERT [paper] [code] [pretrained-models]
    • Reformer: The Efficient Transformer [ICLR2020] [paper] [code]
    • Linformer: Self-Attention with Linear Complexity [AAAI2020] [paper] [code]
    • GPT-3: Language Models are Few-Shot Learners [NIPS 2020] [paper] [code]

    Speech

    • Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation [INTERSPEECH 2020] [paper] [code]

    CV

    Backbone_Classification

    Papers and Codes

    • CoaT: Co-Scale Conv-Attentional Image Transformers [arxiv 2021] [paper] [code]
    • SiT: Self-supervised vIsion Transformer [arxiv 2021] [paper] [code]
    • ViT: An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale [ICLR 2021] [paper] [code]
      • Trained with extra private data; does not generalize well when trained on insufficient amounts of data
    • DeiT: Data-efficient Image Transformers [arxiv2021] [paper] [code]
      • Uses a token-based strategy and builds upon ViT and convolutional models
    • Transformer in Transformer [arxiv 2021] [paper] [code1] [code-official]
    • OmniNet: Omnidirectional Representations from Transformers [arxiv2021] [paper]
    • Gaussian Context Transformer [CVPR 2021] [paper]
    • General Multi-Label Image Classification With Transformers [CVPR 2021] [paper] [code]
    • Scaling Local Self-Attention for Parameter Efficient Visual Backbones [CVPR 2021] [paper]
    • T2T-ViT: Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet [ICCV 2021] [paper] [code]
    • Swin Transformer: Hierarchical Vision Transformer using Shifted Windows [ICCV 2021] [paper] [code]
    • Bias Loss for Mobile Neural Networks [ICCV 2021] [paper] [code]
    • Vision Transformer with Progressive Sampling [ICCV 2021] [paper] [code: https://github.com/yuexy/PS-ViT]
    • Rethinking Spatial Dimensions of Vision Transformers [ICCV 2021] [paper] [code]
    • Rethinking and Improving Relative Position Encoding for Vision Transformer [ICCV 2021] [paper] [code]

    Interesting Repos

    Self-Supervised

    • Emerging Properties in Self-Supervised Vision Transformers [ICCV 2021] [paper] [code]
    • An Empirical Study of Training Self-Supervised Vision Transformers [ICCV 2021] [paper] [code]

    Interpretability and Robustness

    • Transformer Interpretability Beyond Attention Visualization [CVPR 2021] [paper] [code]
    • On the Adversarial Robustness of Visual Transformers [arxiv 2021] [paper]
    • Robustness Verification for Transformers [ICLR 2020] [paper] [code]
    • Pretrained Transformers Improve Out-of-Distribution Robustness [ACL 2020] [paper] [code]

    Detection

    • DETR: End-to-End Object Detection with Transformers [ECCV2020] [paper] [code]
    • Deformable DETR: Deformable Transformers for End-to-End Object Detection [ICLR2021] [paper] [code]
    • End-to-End Object Detection with Adaptive Clustering Transformer [arxiv2020] [paper]
    • UP-DETR: Unsupervised Pre-training for Object Detection with Transformers [arxiv2020] [paper]
    • Rethinking Transformer-based Set Prediction for Object Detection [arxiv2020] [paper] [zhihu]
    • End-to-end Lane Shape Prediction with Transformers [WACV 2021] [paper] [code]
    • ViT-FRCNN: Toward Transformer-Based Object Detection [arxiv2020] [paper]
    • Line Segment Detection Using Transformers [CVPR 2021] [paper] [code]
    • Facial Action Unit Detection With Transformers [CVPR 2021] [paper] [code]
    • Adaptive Image Transformer for One-Shot Object Detection [CVPR 2021] [paper] [code]
    • Self-attention based Text Knowledge Mining for Text Detection [CVPR 2021] [paper] [code]
    • Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions [ICCV 2021] [paper] [code]
    • Group-Free 3D Object Detection via Transformers [ICCV 2021] [paper] [code]
    • Fast Convergence of DETR with Spatially Modulated Co-Attention [ICCV 2021] [paper] [code]

    HOI

    • End-to-End Human Object Interaction Detection with HOI Transformer [CVPR 2021] [paper] [code]
    • HOTR: End-to-End Human-Object Interaction Detection with Transformers [CVPR 2021] [paper] [code]

    Tracking

    • Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking [CVPR 2021] [paper] [code]
    • TransTrack: Multiple-Object Tracking with Transformer [CVPR 2021] [paper] [code]
    • Transformer Tracking [CVPR 2021] [paper] [code]
    • Learning Spatio-Temporal Transformer for Visual Tracking [ICCV 2021] [paper] [code]

    Segmentation

    • SETR: Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers [CVPR 2021] [paper] [code]
    • Trans2Seg: Transparent Object Segmentation with Transformer [arxiv2021] [paper] [code]
    • End-to-End Video Instance Segmentation with Transformers [arxiv2020] [paper] [zhihu]
    • MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers [CVPR 2021] [paper] [official-code] [unofficial-code]
    • Medical Transformer: Gated Axial-Attention for Medical Image Segmentation [arxiv 2020] [paper] [code]
    • SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation [CVPR 2021] [paper] [code]

    Reid

    • Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer [CVPR 2021] [paper] [code]

    Localization

    • LoFTR: Detector-Free Local Feature Matching with Transformers [CVPR 2021] [paper] [code]
    • MIST: Multiple Instance Spatial Transformer [CVPR 2021] [paper] [code]

    Generation

    Inpainting

    • STTN: Learning Joint Spatial-Temporal Transformations for Video Inpainting [ECCV 2020] [paper] [code]

    Image enhancement

    • Pre-Trained Image Processing Transformer [CVPR 2021] [paper]
    • TTSR: Learning Texture Transformer Network for Image Super-Resolution [CVPR2020] [paper] [code]

    Pose Estimation

    • Pose Recognition with Cascade Transformers [CVPR 2021] [paper] [code]
    • TransPose: Towards Explainable Human Pose Estimation by Transformer [arxiv 2020] [paper] [code]
    • Hand-Transformer: Non-Autoregressive Structured Modeling for 3D Hand Pose Estimation [ECCV 2020] [paper]
    • HOT-Net: Non-Autoregressive Transformer for 3D Hand-Object Pose Estimation [ACMMM 2020] [paper]
    • End-to-End Human Pose and Mesh Reconstruction with Transformers [CVPR 2021] [paper] [code]
    • 3D Human Pose Estimation with Spatial and Temporal Transformers [arxiv 2020] [paper] [code]
    • End-to-End Trainable Multi-Instance Pose Estimation with Transformers [arxiv 2020] [paper]

    Face

    • Robust Facial Expression Recognition with Convolutional Visual Transformers [arxiv 2020] [paper]
    • Clusformer: A Transformer Based Clustering Approach to Unsupervised Large-Scale Face and Visual Landmark Recognition [CVPR 2021] [paper] [code]

    Video Understanding

    • Is Space-Time Attention All You Need for Video Understanding? [arxiv 2020] [paper] [code]
    • Temporal-Relational CrossTransformers for Few-Shot Action Recognition [CVPR 2021] [paper] [code]
    • Self-Supervised Video Hashing via Bidirectional Transformers [CVPR 2021] [paper]
    • SSAN: Separable Self-Attention Network for Video Representation Learning [CVPR 2021] [paper]

    Depth-Estimation

    • AdaBins: Depth Estimation using Adaptive Bins [CVPR 2021] [paper] [code]

    Prediction

    • Multimodal Motion Prediction with Stacked Transformers [CVPR 2021] [paper] [code]
    • Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case [paper]
    • Transformer networks for trajectory forecasting [ICPR 2020] [paper] [code]
    • Spatial-Channel Transformer Network for Trajectory Prediction on the Traffic Scenes [arxiv 2021] [paper] [code]
    • Pedestrian Trajectory Prediction using Context-Augmented Transformer Networks [ICRA 2020] [paper] [code]
    • Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction [ECCV 2020] [paper] [code]
    • Hierarchical Multi-Scale Gaussian Transformer for Stock Movement Prediction [paper]
    • Single-Shot Motion Completion with Transformer [arxiv2021] [paper] [code]

    NAS

    PointCloud

    • Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [CVPR 2021] [paper] [code]
    • Point 4D Transformer Networks for Spatio-Temporal Modeling in Point Cloud Videos [CVPR 2021] [paper]

    Fashion

    • Kaleido-BERT: Vision-Language Pre-training on Fashion Domain [CVPR 2021] [paper] [code]

    Medical

    • Lesion-Aware Transformers for Diabetic Retinopathy Grading [CVPR 2021] [paper]

    Cross-Modal

    • Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers [CVPR 2021] [paper]
    • Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning [CVPR2021] [paper] [code]
    • Topological Planning With Transformers for Vision-and-Language Navigation [CVPR 2021] [paper]
    • Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos [CVPR 2021] [paper]
    • VLN BERT: A Recurrent Vision-and-Language BERT for Navigation [CVPR 2021] [paper] [code]
    • Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling [CVPR 2021] [paper] [code]

    Reference

  • Original article: https://www.cnblogs.com/isLinXu/p/16096908.html