• [Repost] Vertex Data Compression


    http://www.cnblogs.com/oiramario/archive/2012/09/26/2703277.html

    I read Minmin's article: http://www.klayge.org/2012/09/21/%E5%8E%8B%E7%BC%A9tangent-frame/

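    Back in February and March of this year I worked on the same thing and got as far as storing the handedness in tangent.w, which solved the UV mirroring problem.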
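
    As a minimal sketch of why storing handedness in tangent.w helps: the bitangent no longer has to be stored, because it can be rebuilt from the normal and the tangent, with w = -1 on mirrored UVs. The Vec3/Vec4 types and both helper names are assumptions made for illustration, and cross(normal, tangent) * w is only one common convention, not necessarily the one the plugin uses.

```cpp
#include <cassert>

// Hypothetical minimal vector types, for illustration only.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

static Vec3 Cross(const Vec3& a, const Vec3& b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

static float Dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Returns +1 or -1: the sign that makes cross(n, t) agree with the real
// bitangent b. On mirrored UVs the real bitangent points the other way,
// so the result is -1.
float ComputeHandedness(const Vec3& n, const Vec3& t, const Vec3& b)
{
    return Dot(Cross(n, t), b) < 0.0f ? -1.0f : 1.0f;
}

// Rebuilds the bitangent from the normal and the 4-component tangent.
Vec3 ReconstructBitangent(const Vec3& n, const Vec4& t)
{
    assert(t.w == 1.0f || t.w == -1.0f);
    const Vec3 c = Cross(n, { t.x, t.y, t.z });
    return { c.x * t.w, c.y * t.w, c.z * t.w };
}
```

    The exporter would call something like ComputeHandedness once per vertex, and the shader would do the equivalent of ReconstructBitangent.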

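    I had not realized how deep the subject of vertex data compression goes, so I modified our 3ds Max plugin following the references, and the results exceeded my expectations.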

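    So far I store the normal and the tangent as unsigned char x 4 each, and the texcoord as short x 2. A rough calculation:

    Before: normal = float x 3, tangent = float x 4, texcoord = float x 2 (per UV layer, so the total depends on how many UV layers there are), for a total of 12 + 16 + 8 = 36 bytes.

    After: normal = unsigned char x 4, tangent = unsigned char x 4, texcoord = short x 2, for a total of 4 + 4 + 4 = 12 bytes.

    These attributes shrink from 36 bytes to 12 bytes per vertex, a saving of two thirds. On a model with a little over 20,000 faces, the mesh size dropped from 1388KB to 552KB, about 0.40 times the original size.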
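
    The byte counts above can be checked at compile time. The sketch below assumes one UV layer; the struct and function names are hypothetical and only mirror the attribute lists in the text.

```cpp
#include <cstddef>
#include <cstdint>

// Attributes before compression: normal, tangent, one UV layer.
struct UncompressedAttribs
{
    float normal[3];    // 12 bytes
    float tangent[4];   // 16 bytes, w holds the handedness
    float texcoord[2];  //  8 bytes
};

// The same attributes after compression.
struct PackedAttribs
{
    std::uint8_t normal[4];    // 4 bytes, unsigned char x 4
    std::uint8_t tangent[4];   // 4 bytes, unsigned char x 4
    std::int16_t texcoord[2];  // 4 bytes, short x 2
};

static_assert(sizeof(UncompressedAttribs) == 36, "12 + 16 + 8");
static_assert(sizeof(PackedAttribs) == 12, "4 + 4 + 4");

// Bytes saved on these attributes for a mesh with vertexCount vertices.
constexpr std::size_t BytesSaved(std::size_t vertexCount)
{
    return vertexCount * (sizeof(UncompressedAttribs) - sizeof(PackedAttribs));
}
```

    Each vertex saves 24 bytes on these attributes, so a mesh with 10,000 vertices would save 240,000 bytes.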

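    I have not yet gone as far as the article, which compresses the tangent frame down to just 8 bytes.

    The advantage is that the data volume drops sharply, which improves the vertex cache hit rate; I observed an fps increase of about 5%.

    The drawbacks are that the vertex shader does slightly more work, and that the compression loses some precision.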

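    float f = 0.1234567f;
    // Map [-1, 1] to [0, 255] and to [0, 32767]; the casts truncate.
    unsigned char uc = (unsigned char)((f * 0.5f + 0.5f) * 255);
    short s = (short)((f * 0.5f + 0.5f) * 32767.0f);

    // Map the stored integers back to [-1, 1].
    float unpackuc = uc * 2.0f / 255.0f - 1.0f;
    float unpacks = s * 2.0f / 32767.0f - 1.0f;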


    unpackuc = 0.12156863
    unpacks = 0.12344737
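
    To make the precision loss concrete, the sketch below wraps the same 8-bit mapping in helper functions and measures the worst round-trip error over [-1, 1]. All function names are hypothetical. The rounding variant is an addition of this sketch, not part of the original snippet; rounding to nearest roughly halves the worst-case error compared with truncation.

```cpp
#include <cassert>
#include <cmath>

// Same mapping as the snippet above: [-1, 1] -> [0, 255], truncating.
unsigned char PackUnorm8(float f)
{
    assert(f >= -1.0f && f <= 1.0f);
    return (unsigned char)((f * 0.5f + 0.5f) * 255.0f);
}

// Variant that rounds to nearest (an assumption of this sketch).
unsigned char PackUnorm8Rounded(float f)
{
    assert(f >= -1.0f && f <= 1.0f);
    return (unsigned char)((f * 0.5f + 0.5f) * 255.0f + 0.5f);
}

// Maps the stored byte back to [-1, 1].
float UnpackUnorm8(unsigned char uc)
{
    return uc * 2.0f / 255.0f - 1.0f;
}

// Scans [-1, 1] in steps of 0.0001 and returns the largest round-trip error.
template <typename PackFn>
float MaxRoundTripError(PackFn pack)
{
    float worst = 0.0f;
    for (int i = 0; i <= 20000; ++i)
    {
        const float f = i / 10000.0f - 1.0f;
        const float err = std::fabs(UnpackUnorm8(pack(f)) - f);
        if (err > worst) worst = err;
    }
    return worst;
}
```

    With truncation the error stays below one quantization step (2/255, about 0.0078); with rounding it stays below half a step (1/255, about 0.0039).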

    References:

    http://www.humus.name/Articles/Persson_CreatingVastGameWorlds.pdf 

    http://www.crytek.com/download/izfrey_siggraph2011.pdf 

    http://fabiensanglard.net/dEngine/index.php 

    http://oddeffects.blogspot.com/2010/09/optimizing-vertex-formats.html

    Notes:

    When declaring the vertex elements, use UBYTE4 or SHORT4.

    D3DVERTEXELEMENT9 declExt[] = {
     // stream, offset, type, method, usage, usageIndex
     { 0, 0, D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0 },
     { 0, 12, D3DDECLTYPE_UBYTE4, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_NORMAL, 0 },
     // 2d uv
     { 0, 16, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 0 },
     { 0, 24, D3DDECLTYPE_SHORT2N, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 1 },
     // tangent
     { 0, 28, D3DDECLTYPE_UBYTE4, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 2 },
     D3DDECL_END()
    };
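
    The offsets in the declaration have to match the layout of the vertex data in memory. One way to guard against mistakes is a CPU-side struct with compile-time checks. PackedVertex and its field names are hypothetical, and the sketch deliberately avoids the Direct3D headers so that it compiles anywhere.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical CPU-side vertex struct mirroring the declaration above.
struct PackedVertex
{
    float        position[3];  // offset  0, D3DDECLTYPE_FLOAT3
    std::uint8_t normal[4];    // offset 12, D3DDECLTYPE_UBYTE4
    float        uv0[2];       // offset 16, D3DDECLTYPE_FLOAT2
    std::int16_t uv1[2];       // offset 24, D3DDECLTYPE_SHORT2N
    std::uint8_t tangent[4];   // offset 28, D3DDECLTYPE_UBYTE4
};

// Each offset must equal the second field of the matching D3DVERTEXELEMENT9.
static_assert(offsetof(PackedVertex, position) == 0,  "position offset");
static_assert(offsetof(PackedVertex, normal)   == 12, "normal offset");
static_assert(offsetof(PackedVertex, uv0)      == 16, "uv0 offset");
static_assert(offsetof(PackedVertex, uv1)      == 24, "uv1 offset");
static_assert(offsetof(PackedVertex, tangent)  == 28, "tangent offset");

// The struct size is the stride of stream 0.
static_assert(sizeof(PackedVertex) == 32, "stream 0 stride");
```

    If a field is added or reordered, the build fails instead of the mesh rendering with scrambled attributes.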

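    In the shader, however, declare the input directly as float4; the GPU converts the data to floating point automatically. Note that D3DDECLTYPE_UBYTE4 arrives unnormalized (0 to 255), while D3DDECLTYPE_SHORT2N arrives already divided by 32767, so the shader still has to rescale the value to [-1, 1] itself.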

    float4 normal : NORMAL;

    Or:
    float4 normal : BLENDINDICES;

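    Some graphics cards do not support UBYTE4 input with the NORMAL semantic; in that case try passing it as BLENDINDICES, which is also the most common use of UBYTE4.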

    ================================================

    Pack the normals into the w value of the position of each vertex, then you should be able to do something similar to this to read it back, and then you just need to convert it back to a normal vector in the shader (multiply by 2, then subtract 1).

    To pack the normal into a float you should be able to use something like this (not tested and should probably use the proper casts instead of C style casts, and the normal needs to be normalized):

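    #include <cstring>  // std::memcpy

    // Vector3 is assumed to be the engine's 3-component float vector type.
    float PackNormal(const Vector3& normal)
    {
       // Use 127.99999f instead of 128 so that a component of 1.0 maps to 255 rather than 256, which would spill into the next byte
       unsigned int packed = (unsigned int)((normal.x + 1.0f) * 127.99999f);
       packed += (unsigned int)((normal.y + 1.0f) * 127.99999f) << 8;
       packed += (unsigned int)((normal.z + 1.0f) * 127.99999f) << 16;

       // Reinterpret the bits as a float; memcpy avoids the aliasing problem of *((float*)(&packed))
       float result;
       std::memcpy(&result, &packed, sizeof(result));
       return result;
    }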
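
    To check the packing on the CPU, a round trip can be sketched as follows. Vector3 here is a stand-in struct and UnpackNormal is a hypothetical helper that is not part of the quoted post; it scales each byte to [0, 1] and then applies the "multiply by 2, then subtract 1" step. A 24-bit integer reinterpreted this way is a very small or denormal float, so whether it survives the trip through a given GPU is not something this CPU-only sketch can confirm.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");

// Stand-in for the engine's vector type (an assumption of this sketch).
struct Vector3 { float x, y, z; };

// Same packing as above: one byte per component, bits stored in a float.
float PackNormal(const Vector3& n)
{
    assert(n.x >= -1.0f && n.x <= 1.0f);
    assert(n.y >= -1.0f && n.y <= 1.0f);
    assert(n.z >= -1.0f && n.z <= 1.0f);
    std::uint32_t packed = (std::uint32_t)((n.x + 1.0f) * 127.99999f);
    packed += (std::uint32_t)((n.y + 1.0f) * 127.99999f) << 8;
    packed += (std::uint32_t)((n.z + 1.0f) * 127.99999f) << 16;
    float result;
    std::memcpy(&result, &packed, sizeof(result));
    return result;
}

// Hypothetical inverse: pull out each byte, scale to [0, 1], then * 2 - 1.
Vector3 UnpackNormal(float packedAsFloat)
{
    std::uint32_t packed;
    std::memcpy(&packed, &packedAsFloat, sizeof(packed));
    const float x = (packed & 0xFFu) / 255.0f;
    const float y = ((packed >> 8) & 0xFFu) / 255.0f;
    const float z = ((packed >> 16) & 0xFFu) / 255.0f;
    return { x * 2.0f - 1.0f, y * 2.0f - 1.0f, z * 2.0f - 1.0f };
}
```

    Components of exactly 1 and -1 survive the round trip unchanged; everything in between comes back with an error of less than 0.01.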
  • Original address: https://www.cnblogs.com/pulas/p/3380681.html
Copyright © 2020-2023  润新知