• MSAA on TBDR GPUs: solutions for Metal, Vulkan, and OpenGL ES


    https://developer.arm.com/solutions/graphics/developer-guides/understanding-render-passes/multi-sample-anti-aliasing

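    Doing MSAA in on-chip local (tile) memory is cheap, but if the multisampled data is not handled properly, 4x MSAA costs 8x extra external-memory bandwidth.

    The calculation is as follows.

    Before the fix: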

    python
    bytesPerFrame4x = 2560 * 1440 * 4 * 4  # width * height * 4 bytes per pixel * 4 samples
    bytesPerFrame1x = 2560 * 1440 * 4 * 1  # width * height * 4 bytes per pixel * 1 sample

    # Additional 4x bandwidth is doubled because the additional samples
    # are written by one pass and then re-read to resolve the final color
    bytesPerFrame = (bytesPerFrame4x * 2) + bytesPerFrame1x
    bytesPerSecond = bytesPerFrame * 60  # 60 frames per second
    # = 7,962,624,000 bytes/s, roughly 7.9 GB/s

    After the fix:


    python
    bytesPerFrame1x = 2560 * 1440 * 4 * 1

    # All additional 4x bandwidth is kept entirely inside the tile memory
    bytesPerSecond = bytesPerFrame1x * 60
    # = 884,736,000 bytes/s, roughly 884 MB/s
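The before and after calculations can be folded into one small helper, which makes it easy to try other resolutions or sample counts. This is only a sketch: the function name `msaa_color_bandwidth` and its parameters are made up for illustration, and it assumes 4 bytes per pixel and counts color traffic only (no depth/stencil).

```python
def msaa_color_bandwidth(width, height, bytes_per_pixel=4, samples=4,
                         fps=60, resolve_on_tile=False):
    """Return external-memory color bandwidth in bytes per second.

    Hypothetical helper mirroring the arithmetic in this post.
    """
    bytes_1x = width * height * bytes_per_pixel
    if resolve_on_tile:
        # Only the resolved 1x image ever leaves tile memory.
        bytes_per_frame = bytes_1x
    else:
        # The multisampled image is written once and re-read for the
        # resolve, then the resolved 1x image is written.
        bytes_ms = bytes_1x * samples
        bytes_per_frame = bytes_ms * 2 + bytes_1x
    return bytes_per_frame * fps


before = msaa_color_bandwidth(2560, 1440, resolve_on_tile=False)
after = msaa_color_bandwidth(2560, 1440, resolve_on_tile=True)
print(f"before: {before / 1e9:.2f} GB/s")
print(f"after:  {after / 1e6:.0f} MB/s")
print(f"ratio:  {before / after:.0f}x")
```

The ratio is 9x in total, that is, 8x extra on top of the unavoidable 1x store.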

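    The fix is to choose load/store actions so that only the 1x (resolved) data is ever stored, for example MTLStoreActionMultisampleResolve on Metal, or a resolve attachment plus VK_ATTACHMENT_STORE_OP_DONT_CARE on the multisampled attachment on Vulkan.

    Both Vulkan and Metal can be handled this way; I wrote about it in an earlier blog post: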

    https://www.cnblogs.com/minggoddess/p/10950349.html

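    On Vulkan, the multisampled attachment must additionally be made memoryless: back it with memory allocated using VK_MEMORY_PROPERTY_LAZILY_ALLOCATED_BIT and construct the VkImage with VK_IMAGE_USAGE_TRANSIENT_ATTACHMENT_BIT.

    On Metal, just set the texture's storage mode to memoryless.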
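To get a feel for what memoryless saves in footprint, here is a rough estimate of the multisampled color attachment that no longer needs backing memory. It reuses the 2560 x 1440, 4 bytes per pixel, 4x MSAA numbers from the bandwidth example; those values are assumptions, and on Vulkan a driver may still commit memory for a lazily allocated image in some situations.

```python
# Assumed values, reused from the bandwidth example in this post.
WIDTH, HEIGHT = 2560, 1440
BYTES_PER_PIXEL = 4  # assumption: 32-bit color
SAMPLES = 4          # 4x MSAA

# Multisampled color attachment: candidate for memoryless storage.
msaa_color_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL * SAMPLES
# Resolved 1x target: still needs real memory.
resolved_color_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL

msaa_color_mib = msaa_color_bytes / (1024 * 1024)
resolved_color_mib = resolved_color_bytes / (1024 * 1024)
print(f"4x MSAA color attachment: {msaa_color_mib:.2f} MiB")
print(f"1x resolved color target: {resolved_color_mib:.2f} MiB")
```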

    For OpenGL ES, use the EXT_multisampled_render_to_texture extension:

    https://www.khronos.org/registry/OpenGL/extensions/EXT/EXT_multisampled_render_to_texture.txt

    This extension introduces functionality to perform multisampled rendering to a color renderable texture, without requiring an explicit resolve of multisample data.

    Some GPU architectures - such as tile-based renderers - are capable of performing multisampled rendering by storing multisample data in internal high-speed memory and downsampling the data when writing out to external memory after rendering has finished. Since per-sample data is never written out to external memory, this approach saves bandwidth and storage space. In this case multisample data gets discarded, however this is acceptable in most cases.

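    The resolve is automatic, so no explicit resolve is needed. Because it happens on-tile, it also saves the extra 3x store bandwidth and the memory footprint.

    The relevant entry points are:

    FramebufferTexture2DMultisampleEXT
    RenderbufferStorageMultisampleEXT

    There is a depth/stencil variant as well.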

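    So for all TBDR GPUs, this approach needs an extension on OpenGL ES. Unity implements it; I will verify the numbers later.

    Memoryless is really a concept that only Metal and Vulkan have.

    For MSAA, the extension above is the direct equivalent: only the 1x store takes place.

    In Unity, the antiAliasing field of the RenderTexture descriptor automatically enables the code path for this extension (glRenderbufferStorageMultisample).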

     ======================

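    For MSAA, Unity automatically uses glRenderbufferStorageMultisample.

    This requires the HasMultisample capability (OpenGL ES 3), or alternatively HasMultiSampleAutoResolve, a capability that corresponds to the following two extensions: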

    kGL_EXT_multisampled_render_to_texture

    kGL_IMG_multisampled_render_to_texture

    (Open question: what does force-clamped mean?)

    kGL_EXT_multisampled_render_to_texture

    glRenderbufferStorageMultisampleEXT

    glFramebufferTexture2DMultisampleEXT

    Mali uses this set (the EXT one above).

    kGL_IMG_multisampled_render_to_texture

    glRenderbufferStorageMultisampleIMG

    glFramebufferTexture2DMultisampleIMG

    kGL_APPLE_framebuffer_multisample

    glRenderbufferStorageMultisampleAPPLE

    glResolveMultisampleFramebufferAPPLE

    With Metal available, this one (APPLE) can be ignored.

    kGL_NV_framebuffer_multisample

    kGL_NV_framebuffer_blit

    glRenderbufferStorageMultisampleNV
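The capability and extension groups above boil down to a selection step: prefer an auto-resolve extension, otherwise fall back to a path that needs an explicit resolve. The sketch below only illustrates that idea. It is not Unity's code: `pick_msaa_path` is a hypothetical name, the priority order is my assumption, and it uses the standard `GL_`-prefixed extension strings rather than the `kGL_` identifiers quoted above.

```python
# Entry points per extension, as listed in the text above.
AUTO_RESOLVE = [
    ("GL_EXT_multisampled_render_to_texture",
     ["glRenderbufferStorageMultisampleEXT",
      "glFramebufferTexture2DMultisampleEXT"]),
    ("GL_IMG_multisampled_render_to_texture",
     ["glRenderbufferStorageMultisampleIMG",
      "glFramebufferTexture2DMultisampleIMG"]),
]
EXPLICIT_RESOLVE = [
    ("GL_APPLE_framebuffer_multisample",
     ["glRenderbufferStorageMultisampleAPPLE",
      "glResolveMultisampleFramebufferAPPLE"]),
    ("GL_NV_framebuffer_multisample",
     ["glRenderbufferStorageMultisampleNV"]),
]


def pick_msaa_path(extensions, is_gles3):
    """Hypothetical helper: return (mode, entry_points), or None without MSAA.

    The priority order here is an assumption, not Unity's actual logic.
    """
    # Auto-resolve keeps the samples on-tile, so try it first.
    for name, funcs in AUTO_RESOLVE:
        if name in extensions:
            return ("auto_resolve", funcs)
    # Core OpenGL ES 3 multisampled renderbuffers need an explicit resolve.
    if is_gles3:
        return ("explicit_resolve", ["glRenderbufferStorageMultisample"])
    for name, funcs in EXPLICIT_RESOLVE:
        if name in extensions:
            return ("explicit_resolve", funcs)
    return None


mali_like = {"GL_EXT_multisampled_render_to_texture"}
print(pick_msaa_path(mali_like, is_gles3=True))
print(pick_msaa_path(set(), is_gles3=True))
```

Only the "auto_resolve" result corresponds to the 1x-store behaviour described earlier; the other paths still write the multisampled data out to memory.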

    -------

    ----------------------------------------------------

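    Below is the profiler data. This part is strange and hard to make sense of.

    With MSAA enabled, memory reads and writes drop sharply. Even if shader busy were the only factor, it would not explain a drop of this size.

    -- Snapdragon 845

    It is related to post-processing and probably has little to do with MSAA.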

  • Original article: https://www.cnblogs.com/minggoddess/p/11240211.html