Desktop
Good performance is critical to the success of many games. Below are some simple guidelines for maximizing the speed of your game's graphical rendering.
Optimizing Meshes
You only pay a rendering cost for objects that have a Mesh Renderer attached and are within the view frustum. Empty GameObjects in the scene, and objects that are outside the view of every camera, incur no rendering cost.
Modern graphics cards are very good at handling large numbers of polygons, but there is a significant overhead for each batch (i.e. mesh) that you submit to the graphics card. As a result, a 100-triangle object is about as expensive to render as a 1500-triangle object. The "sweet spot" for optimal rendering performance is somewhere around 1500-4000 triangles per mesh.
Usually, the best way to improve rendering performance is to combine objects so that each mesh has around 1500 or more triangles and uses only one Material for the entire mesh. It is important to understand that combining two objects which don't share a material gives you no performance increase at all; to combine effectively, the resulting mesh must use a single material. The most common reason for having multiple materials is that two meshes don't share the same textures, so to optimize rendering performance you should ensure that any objects you combine share the same textures.
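The effect of sharing materials can be illustrated with a small counting sketch. It assumes a deliberately simplified model in which every mesh with a single material costs exactly one batch; the object and material names are hypothetical, and this is not how Unity itself does its bookkeeping.

```python
from collections import defaultdict

def batches_after_combining(objects):
    """Count batches once all objects sharing a material are merged into one mesh.

    objects: list of (object_name, material_name) pairs.
    Simplifying assumption: one mesh with one material costs one batch.
    """
    by_material = defaultdict(list)
    for name, material in objects:
        by_material[material].append(name)
    return len(by_material)

# Hypothetical scene: three objects share "wood", one uses "metal".
scene = [
    ("crate_a", "wood"),
    ("crate_b", "wood"),
    ("barrel", "wood"),
    ("lamp", "metal"),
]

print("batches before combining:", len(scene))
print("batches after combining:", batches_after_combining(scene))
```

In this model, objects with different materials stay in separate batches however they are grouped, which is why combining them gains nothing.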
However, when using many pixel lights in the Forward rendering path, there are situations where combining objects may not make sense, as explained below.
Pixel Lights in the Forward Rendering Path
Note: this applies only to the Forward rendering path.
If you use pixel lighting then each mesh has to be rendered as many times as there are pixel lights illuminating it. If you combine two meshes that are very far apart, the effective size of the combined object increases. All pixel lights that illuminate any part of the combined object are taken into account during rendering, so the number of rendering passes can increase. Generally, the number of passes needed to render the combined object is the sum of the passes for each of the separate objects, so nothing is gained by combining. For this reason, you should not combine meshes that are far enough apart to be affected by different sets of pixel lights.
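The pass-counting argument above can be checked with a short sketch. It assumes the simplified rule stated in the text (one rendering of a mesh per pixel light that touches it) and uses hypothetical light names; real renderers add further details that are ignored here.

```python
def draw_submissions(light_sets):
    """Compare draw submissions for separate meshes against one combined mesh.

    light_sets: one set of pixel-light names per mesh.
    Simplifying assumption: a mesh is rendered once per pixel light touching it.
    Returns (separate, combined).
    """
    separate = sum(len(lights) for lights in light_sets)
    # The combined mesh is lit by every light that touches any of its parts.
    combined = len(set().union(*light_sets))
    return separate, combined

# Two nearby meshes lit by the same two lights: combining halves the work.
print(draw_submissions([{"lamp_a", "lamp_b"}, {"lamp_a", "lamp_b"}]))

# Two distant meshes lit by different lights: combining gains nothing.
print(draw_submissions([{"lamp_a"}, {"lamp_b"}]))
```

In the second case each pass also covers the whole, larger combined mesh, so combining can only make matters worse.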
During rendering, Unity finds all lights surrounding a mesh and calculates which of those lights affect it most. The Quality Settings determine how many of the lights end up as pixel lights and how many as vertex lights. Each light calculates its importance based on how far away it is from the mesh and how intense its illumination is. Furthermore, some lights are more important than others purely because of the game context. For this reason, every light has a Render Mode setting which can be set to Important or Not Important; lights marked as Not Important typically have a lower rendering overhead.
As an example, consider a driving game where the player's car is driving in the dark with its headlights switched on. The headlights are likely to be the most visually significant light sources in the game, so their Render Mode would probably be set to Important. On the other hand, there may be other lights in the game that are less important (other cars' rear lights, say) and which gain little visually from being pixel lights. The Render Mode for such lights can safely be set to Not Important so as to avoid wasting rendering capacity where it gives little benefit.
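The selection described above can be sketched roughly as follows. The scoring formula (intensity divided by squared distance), the "Auto" mode for lights with no explicit setting, and all light names are assumptions made for illustration; the text only says that importance depends on distance and intensity, and Unity's actual heuristic is not given here.

```python
def split_lights(lights, pixel_light_count):
    """Split lights into pixel lights and vertex lights for one mesh.

    lights: list of dicts with keys name, intensity, distance, mode, where mode
    is "Important", "NotImportant" or "Auto" (hypothetical labels).
    pixel_light_count: budget of pixel lights, as set by a quality setting.
    """
    important = [l for l in lights if l["mode"] == "Important"]
    auto = [l for l in lights if l["mode"] == "Auto"]
    # Assumed importance score: brighter and closer lights rank higher.
    auto.sort(key=lambda l: l["intensity"] / l["distance"] ** 2, reverse=True)
    free_slots = max(0, pixel_light_count - len(important))
    pixel = important + auto[:free_slots]
    pixel_names = {l["name"] for l in pixel}
    vertex = [l["name"] for l in lights if l["name"] not in pixel_names]
    return [l["name"] for l in pixel], vertex

lights = [
    {"name": "headlights", "intensity": 2.0, "distance": 5.0, "mode": "Important"},
    {"name": "street_lamp", "intensity": 1.0, "distance": 10.0, "mode": "Auto"},
    {"name": "tail_lights", "intensity": 0.5, "distance": 3.0, "mode": "NotImportant"},
    {"name": "shop_sign", "intensity": 0.5, "distance": 20.0, "mode": "Auto"},
]

pixel, vertex = split_lights(lights, pixel_light_count=2)
print("pixel lights:", pixel)
print("vertex lights:", vertex)
```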
Per-Layer Cull Distances
In some games, it may be appropriate to cull small objects more aggressively than large ones in order to reduce the number of draw calls. For example, small rocks and debris could be made invisible at long distances while large buildings would still be visible. To accomplish this, you can put small objects into a separate layer and set up per-layer cull distances using the Camera.layerCullDistances script function.
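The following sketch models the idea in plain Python rather than in a Unity script. It relies on two properties of Camera.layerCullDistances: it is an array of 32 floats, one per layer, and a value of 0 means "use the camera's far plane". The layer index and distances are hypothetical.

```python
NUM_LAYERS = 32  # one cull distance slot per layer

def make_layer_cull_distances(overrides):
    """Build a 32-entry distance list; 0.0 means 'use the far plane'."""
    distances = [0.0] * NUM_LAYERS
    for layer, distance in overrides.items():
        distances[layer] = float(distance)
    return distances

def is_culled(layer, distance_to_camera, cull_distances, far_plane):
    """Return True if an object on this layer is too far away to be drawn."""
    limit = cull_distances[layer] or far_plane
    return distance_to_camera > limit

SMALL_PROPS_LAYER = 8  # hypothetical layer holding rocks and debris
distances = make_layer_cull_distances({SMALL_PROPS_LAYER: 50.0})

# A rock 120 units away is culled; a building on layer 0 at the same distance is not.
print(is_culled(SMALL_PROPS_LAYER, 120.0, distances, far_plane=1000.0))
print(is_culled(0, 120.0, distances, far_plane=1000.0))
```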
Shadows
If you are deploying for Desktop platforms then you should be careful when using shadows, because they can add a lot of rendering overhead to your game if not used correctly. For further details, see the Shadows page.
Note: Shadows are not currently supported on iOS or Android devices.
iOS
Useful background on iOS optimization can be found on the iOS hardware page.
Alpha-Testing
Unlike desktop machines, iOS devices incur a high performance overhead for alpha-testing (or use of the discard and clip operations in pixel shaders). You should replace alpha-test shaders with alpha-blend shaders if at all possible. Where alpha-testing cannot be avoided, you should keep the overall number of visible alpha-tested pixels to a minimum.
Vertex Performance
Generally you should aim to have no more than 40,000 vertices visible per frame when targeting iPhone 3GS or newer devices. You should keep the vertex count below 10,000 for older devices equipped with the MBX GPU, such as the original iPhone, iPhone 3G and iPod Touch 1st and 2nd Generation.
Lighting Performance
Per-pixel dynamic lighting adds significant rendering overhead to every affected pixel and can lead to objects being rendered in multiple passes. Avoid having more than one Pixel Light illuminating any single object, and use directional lights as far as possible. Note that a Pixel Light is one whose Render Mode option is set to Important.
Per-vertex dynamic lighting can add significant cost to vertex transformations. Try to avoid situations where multiple lights illuminate any given object. For static objects, baked lighting is much more efficient.
Optimize Model Geometry
When optimizing the geometry of a model, there are two basic rules:
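Note that the actual number of vertices that the graphics hardware has to process is usually not the same as the number reported by a 3D application. Modeling applications usually display the geometric vertex count, i.e. the number of distinct corner points that make up a model.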
For a graphics card, however, some geometric vertices need to be split into two or more logical vertices for rendering purposes. A vertex must be split if it has multiple normals, UV coordinates or vertex colors. Consequently, the vertex count in Unity is invariably a lot higher than the count given by the 3D application.
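A flat-shaded cube is a convenient worked example of vertex splitting. The sketch below is an illustration rather than Unity's importer logic: it builds the 24 face corners of a cube and counts distinct positions (what a modeling tool reports) against distinct position/normal pairs (what the hardware must process).

```python
from itertools import product

def cube_face_corners():
    """Return (position, normal) for each of the 4 corners of the 6 cube faces."""
    corners = []
    for axis in range(3):
        for sign in (-1, 1):
            # Every corner of a flat-shaded face shares the face normal.
            normal = tuple(sign if i == axis else 0 for i in range(3))
            other_axes = [i for i in range(3) if i != axis]
            for a, b in product((-1, 1), repeat=2):
                position = [0, 0, 0]
                position[axis] = sign
                position[other_axes[0]] = a
                position[other_axes[1]] = b
                corners.append((tuple(position), normal))
    return corners

corners = cube_face_corners()
geometric_vertices = len({position for position, _ in corners})
render_vertices = len(set(corners))

print("geometric vertices:", geometric_vertices)  # 8 corner points
print("render vertices:", render_vertices)        # 24 after splitting on normals
```

Each corner point sits on three faces with three different normals, so it is split into three logical vertices.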
Texture Compression
Using iOS's native PVRT compression formats will decrease the size of your textures (resulting in faster load times and a smaller memory footprint) and can also dramatically increase rendering performance. Compressed textures use only a fraction of the memory bandwidth needed for uncompressed 32-bit RGBA textures. A comparison of uncompressed vs compressed texture performance can be found in the iOS Hardware Guide.
Some images are prone to visual artifacts in the alpha channels of PVRT-compressed textures. In such cases, you might need to tweak the PVRT compression parameters directly in your imaging software. You can do that by installing the PVR export plugin or by using PVRTexTool from Imagination Tech, the creators of the PVRT format. The resulting compressed image file, with a .pvr extension, will be imported by the Unity editor directly and the specified compression parameters will be preserved.
If PVRT-compressed textures do not give good enough visual quality, or you need especially crisp imaging (as you might for GUI textures, say), then you should consider using 16-bit textures instead of 32-bit RGBA textures. By doing so, you will reduce the memory bandwidth by half.
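The size differences are easy to work out by hand. The sketch below assumes a hypothetical 1024x1024 texture without mip maps and uses the two standard PVRTC modes of 4 and 2 bits per pixel; the numbers are arithmetic, not measured performance.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Size in bytes of a single texture level at the given bit depth."""
    return width * height * bits_per_pixel // 8

formats = {
    "RGBA 32-bit": 32,
    "RGBA 16-bit": 16,
    "PVRTC 4 bpp": 4,
    "PVRTC 2 bpp": 2,
}

for name, bpp in formats.items():
    size = texture_bytes(1024, 1024, bpp)
    print(f"{name:12s} {size:>9d} bytes ({size / 1024 / 1024:.2f} MiB)")
```

The 16-bit format is exactly half the size of 32-bit RGBA, which is where the halving of memory bandwidth comes from.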
Tips for writing high-performance shaders
The GPUs on iOS devices have fully supported pixel and vertex shaders since the iPhone 3GS. However, the performance is nowhere near what you would get from a desktop machine, so you should not expect desktop shaders to port to iOS unchanged. Typically, shaders need to be hand-optimized to reduce calculations and texture reads in order to get good performance.
Complex mathematical operations
Transcendental mathematical functions (such as pow, exp, log, cos, sin, tan, etc.) tax the GPU greatly, so a good rule of thumb is to have no more than one such operation per fragment. Consider using lookup textures as an alternative where applicable.
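The lookup-texture idea can be demonstrated on the CPU. In this sketch a 256-entry table stands in for a one-dimensional texture and linear interpolation stands in for bilinear filtering; the table size and the exponent are arbitrary choices made for illustration.

```python
def build_pow_lut(exponent, size=256):
    """Precompute x ** exponent for evenly spaced x in [0, 1]."""
    return [(i / (size - 1)) ** exponent for i in range(size)]

def sample_lut(lut, x):
    """Look up x in [0, 1] with linear interpolation between neighbouring entries."""
    x = min(max(x, 0.0), 1.0)
    position = x * (len(lut) - 1)
    low = int(position)
    high = min(low + 1, len(lut) - 1)
    t = position - low
    return lut[low] * (1.0 - t) + lut[high] * t

lut = build_pow_lut(8)
for x in (0.25, 0.5, 0.9):
    print(x, sample_lut(lut, x), x ** 8)
```

The table is built once, so each lookup replaces a pow call with one filtered read, at the cost of a small interpolation error and an extra texture fetch.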
It is not advisable to write your own normalize, dot or inversesqrt operations, however. If you use the built-in ones then the driver will generate much better code for you.
Bear in mind also that the discard operation will make your fragments slower.
Floating point operations
You should always specify the precision of floating point variables when writing custom shaders. It is critical to pick the smallest possible floating point format in order to get the best performance.
If the shader is written in GLSL ES, the floating point precision is specified with the highp, mediump and lowp precision qualifiers.
If the shader is written in CG, or it is a surface shader, the precision is specified with the float, half and fixed types.
For more details about shader performance, please read the Shader Performance page.
Hardware documentation
Take your time to study Apple's documentation on hardware and best practices for writing shaders. Note, however, that we would suggest being more aggressive with floating point precision hints.
Bake Lighting into Lightmaps
Bake your scene's static lighting into textures using Unity's built-in Lightmapper. The process of generating a lightmapped environment takes only a little longer than just placing a light in the scene in Unity, but:
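If a number of objects rendered by the same camera use the same material, Unity iOS is able to apply a variety of internal optimizations, such as: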
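All these optimizations will save you precious CPU cycles. Therefore, putting in the extra work to combine textures into a single atlas and to make objects use the same material will always pay off. Do it!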
Simple Checklist to make Your Game Faster
Android
Lighting Performance
Per-pixel dynamic lighting adds significant cost to every affected pixel and can lead to objects being rendered in multiple passes. Avoid having more than one Pixel Light affecting any single object, and prefer that light to be a directional light. Note that a Pixel Light is a light whose Render Mode setting is set to Important.
Per-vertex dynamic lighting can add significant cost to vertex transformations. Avoid having multiple lights affect a single object, and bake lighting for static objects.
Optimize Model Geometry
When optimizing the geometry of a model, there are two basic rules:
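Note that the actual number of vertices that the graphics hardware has to process is usually not the same as the number reported by a 3D application. Modeling applications usually display the geometric vertex count, i.e. the number of distinct corner points that make up a model.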
For a graphics card, however, some vertices have to be split into separate ones. If a vertex has multiple normals (it is on a "hard edge"), multiple UV coordinates or multiple vertex colors, it has to be split. So the vertex count you see in Unity is almost always different from the one displayed in the 3D application.
Texture Compression
All Android devices with support for OpenGL ES 2.0 also support the ETC1 compression format; you are therefore encouraged to use ETC1 as the preferred texture format whenever possible. Using compressed textures is important not only because it decreases the size of your textures (resulting in faster load times and a smaller memory footprint), but also because it can increase your rendering performance dramatically. Compressed textures require only a fraction of the memory bandwidth needed for full 32-bit RGBA textures.
If you are targeting a specific graphics architecture, such as the Nvidia Tegra or Qualcomm Snapdragon, it may be worth considering the proprietary compression formats available on those architectures. The Android Market also allows filtering based on supported texture compression format, meaning that a distribution archive (.apk) with, for example, DXT-compressed textures can be prevented from being downloaded to a device which doesn't support that format.
Enable Mip Maps
As a rule of thumb, always have Generate Mip Maps enabled. In the same way that Texture Compression can help limit the amount of texture data transferred when the GPU is rendering, a mip-mapped texture enables the GPU to use a lower-resolution texture for smaller triangles. The only exception to this rule is when a texel (texture pixel) is known to map 1:1 to the rendered screen pixel, as with UI elements or in a pure 2D game.
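To see what enabling mip maps costs in memory, the levels of a mip chain can be enumerated directly. The sketch below assumes the usual scheme in which each level halves both dimensions down to 1x1, and uses a hypothetical 256x256 texture.

```python
def mip_chain(width, height):
    """List the (width, height) of every mip level, from full size down to 1x1."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels

levels = mip_chain(256, 256)
base_texels = 256 * 256
total_texels = sum(w * h for w, h in levels)

print("levels:", len(levels))
print("total texels:", total_texels)
print("overhead vs. base level: %.1f%%" % (100.0 * (total_texels - base_texels) / base_texels))
```

The whole chain adds roughly a third to the texture's size, while distant or small triangles read from much smaller levels.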
Tips for writing well-performing shaders
Although all Android OpenGL ES 2.0 GPUs fully support pixel and vertex shaders, do not expect to grab a desktop shader with complex per-pixel functionality and run it on an Android device at 30 frames per second. Most often, shaders have to be hand-optimized, with calculations and texture reads kept to a minimum, in order to achieve good frame rates.
Complex arithmetic operations
Arithmetic operations such as pow, exp, log, cos, sin, tan, etc. tax the GPU heavily. The rule of thumb is to have no more than one such operation per fragment. Lookup textures can sometimes be a better alternative.
Do NOT try to roll your own normalize, dot or inversesqrt operations, however. Always use the built-in ones; this way the driver will generate much better code for you.
Keep in mind that the discard operation will make your fragments slower.
Floating point operations
Always specify the precision of floating point variables when writing custom shaders. It is crucial to pick the smallest possible format in order to achieve the best performance.
If the shader is written in GLSL ES, the precision is specified with the highp, mediump and lowp precision qualifiers.
If the shader is written in CG, or it is a surface shader, the precision is specified with the float, half and fixed types.
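For more details about shader performance, please read the Shader Performance page. The quoted performance figures are based on the PowerVR graphics architecture, available in devices such as the Samsung Nexus S. Other hardware architectures may benefit more or less from reduced register precision.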
Bake Lighting into Lightmaps
Bake your scene's static lighting into textures using Unity's built-in Lightmapper. The process of generating a lightmapped environment takes only a little longer than just placing a light in the scene in Unity, but:
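If a number of objects rendered by the same camera use the same material, Unity Android is able to apply a variety of internal optimizations, such as: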
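All these optimizations will save you precious CPU cycles. Therefore, putting in the extra work to combine textures into a single atlas and to make objects use the same material will always pay off. Do it!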
Simple Checklist to make Your Game Faster