    Video Architecture Search

    2019-10-20 06:48:26


    This post is reproduced from the Google AI Blog: https://ai.googleblog.com/2019/10/video-architecture-search.html


    Examples of various EvaNet architectures. Each colored box (large or small) represents a layer with the color of the box indicating its type: 3D conv. (blue), (2+1)D conv. (orange), iTGM (green), max pooling (grey), averaging (purple), and 1x1 conv. (pink). Layers are often grouped to form modules (large boxes). Digits within each box indicate the filter size.
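
    To make the layer types above concrete, here is a minimal sketch, assuming PyTorch, of one of the building blocks EvaNet selects among: the (2+1)D convolution, a spatial 2D convolution followed by a temporal 1D convolution. This is an illustrative example, not the authors' implementation; the class name Conv2Plus1D and the chosen kernel sizes are assumptions.

    # A minimal sketch (not the authors' code) of a (2+1)D convolution block,
    # one of the layer types EvaNet chooses from: a spatial 2D conv followed
    # by a temporal 1D conv, implemented with torch.nn.Conv3d.
    import torch
    import torch.nn as nn

    class Conv2Plus1D(nn.Module):
        def __init__(self, in_ch, out_ch, spatial_k=3, temporal_k=3):
            super().__init__()
            # Spatial convolution: 1 x k x k over a (T, H, W) input volume.
            self.spatial = nn.Conv3d(
                in_ch, out_ch,
                kernel_size=(1, spatial_k, spatial_k),
                padding=(0, spatial_k // 2, spatial_k // 2))
            # Temporal convolution: k x 1 x 1.
            self.temporal = nn.Conv3d(
                out_ch, out_ch,
                kernel_size=(temporal_k, 1, 1),
                padding=(temporal_k // 2, 0, 0))
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):  # x: (N, C, T, H, W)
            return self.relu(self.temporal(self.relu(self.spatial(x))))

    # Example: an 8-frame 112x112 clip.
    clip = torch.randn(1, 3, 8, 112, 112)
    out = Conv2Plus1D(3, 64)(clip)
    print(out.shape)  # torch.Size([1, 64, 8, 112, 112])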

    The representative AssembleNet model evolved using the Moments-in-Time dataset. A node corresponds to a block of spatio-temporal convolutional layers, and each edge specifies their connectivity. Darker edges mean stronger connections. AssembleNet is a family of learnable multi-stream architectures, optimized for the target task.
    A figure comparing AssembleNet with state-of-the-art, hand-designed models on Charades (left) and Moments-in-Time (right) datasets. AssembleNet-50 or AssembleNet-101 has an equivalent number of parameters to a two-stream ResNet-50 or ResNet-101.
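
    To make the node-and-edge description concrete, below is a minimal sketch, assuming PyTorch, of a single fusion node with learnable connection weights over its incoming edges; stronger (darker) edges would correspond to larger weights after training. The class WeightedFusionNode and the sigmoid gating are illustrative assumptions, not the actual AssembleNet code.

    # A hypothetical sketch of the idea in the AssembleNet figure: each node is
    # a block of spatio-temporal conv layers, and each incoming edge carries a
    # learnable connection weight.
    import torch
    import torch.nn as nn

    class WeightedFusionNode(nn.Module):
        """Combines several input streams with learnable (sigmoid-gated) edge weights."""
        def __init__(self, num_inputs, channels):
            super().__init__()
            self.edge_logits = nn.Parameter(torch.zeros(num_inputs))
            self.block = nn.Sequential(
                nn.Conv3d(channels, channels, kernel_size=3, padding=1),
                nn.ReLU(inplace=True))

        def forward(self, inputs):  # inputs: list of (N, C, T, H, W) tensors
            weights = torch.sigmoid(self.edge_logits)
            fused = sum(w * x for w, x in zip(weights, inputs))
            return self.block(fused)

    # Example: fusing two streams of matching shape.
    node = WeightedFusionNode(num_inputs=2, channels=16)
    a = torch.randn(1, 16, 8, 28, 28)
    b = torch.randn(1, 16, 8, 28, 28)
    y = node([a, b])
    print(y.shape)  # torch.Size([1, 16, 8, 28, 28])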

    TinyVideoNet (TVN) architectures are evolved to maximize recognition performance while keeping their computation time within a desired limit. For instance, TVN-1 (top) runs at 37 ms on a CPU and 10 ms on a GPU, while TVN-2 (bottom) runs at 65 ms on a CPU and 13 ms on a GPU.
    CPU runtime of TinyVideoNet models compared to prior models (left), and runtime vs. model accuracy of TinyVideoNets compared to (2+1)D ResNet models (right). Note that TinyVideoNets occupy a part of the time-accuracy space where no other models exist, i.e., they are extremely fast yet still accurate.
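
    The search objective implied by these captions, maximizing accuracy under a runtime budget, can be written as a simple fitness function. The sketch below is a hypothetical illustration of that trade-off; the penalty form and the parameter names (budget_ms, penalty) are assumptions, not the authors' exact formulation.

    # A hedged sketch of a runtime-constrained search objective: candidates
    # within the budget are scored by accuracy alone; candidates over budget
    # are linearly penalized.
    def fitness(accuracy, runtime_ms, budget_ms=100.0, penalty=0.5):
        """Score a candidate architecture under a runtime budget (illustrative)."""
        if runtime_ms <= budget_ms:
            return accuracy
        return accuracy - penalty * (runtime_ms - budget_ms) / budget_ms

    # Example: two candidates with different speed/accuracy trade-offs.
    print(fitness(accuracy=0.72, runtime_ms=37.0))   # under budget: 0.72
    print(fitness(accuracy=0.74, runtime_ms=160.0))  # over budget: penalized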

  • Original post: https://www.cnblogs.com/wangxiaocvpr/p/11706640.html