“Unity Mobile Game Performance Optimization Basics” starts from the fundamentals of Unity mobile game optimization, analyzes with examples some of the most common performance problems seen in Unity mobile game projects in recent years, and shows how to use UWA’s performance detection tools to identify and address them. The series covers the basic logic of performance optimization, the UWA performance detection tools, and common performance problems, with the aim of giving Unity developers more efficient R&D methods and practical experience.
This article is the second part of the series. It covers resource memory, Mono heap memory, and other common aspects of game memory control in 13 sections, including texture, mesh, animation, audio, and material resources, other resource memory, and Mono heap memory.
The first part of the series, “Unity Mobile Game Performance Optimization Series: Introduction” can be reviewed here.
1.1 Concept Explanation
First of all, before discussing various parameters related to memory and formulating standards, we need to clarify the actual meaning of various memory parameters that often appear in the statistics of various performance tools.
On Android, the PSS (Proportional Set Size) memory that we most commonly see and care about is the amount of RAM actually used by a process, that is, the physical memory it occupies. The higher the PSS peak of a game process, and the larger its share of the total physical memory of the device, the more likely the process is to be killed by the system (a crash).
Within PSS memory, apart from the Unused part, we generally care about Reserved Total memory, Lua memory, and the memory allocated by native code, plug-ins, system caches, and third-party libraries themselves. Reserved Total usually makes up a large proportion, so its size and trend are the main objects of statistics in UWA performance analysis tools (for projects that use Lua, UWA also provides dedicated Lua test reports to count Lua memory, which will be mentioned below).
Reserved Total and Used Total represent the overall memory allocated and the overall memory used by the Unity engine, respectively. Generally speaking, the engine does not request memory from the operating system each time it needs some. Instead, it first obtains a certain amount of contiguous memory and then hands it out internally. When the free memory runs out, the engine requests another block of contiguous memory from the system.
Note: For most platforms, Reserved Total memory = Reserved Unity memory + GFX memory + FMOD memory + Mono memory
(1) Reserved Unity Memory
Reserved Unity and Used Unity represent the memory allocated and used within the modules of the Unity engine itself, including the memory of each Manager, of serialized information, and of some resources.
Through in-depth analysis of a large number of projects, UWA found that the main reasons for the large memory allocation of Reserved Unity are as follows:
Memory usage of serialized information: There are many types of serialized information in the Unity engine, of which the most common one with large memory usage is SerializedFile. The memory allocation of the serialized information is mainly caused by the project loading the AssetBundle file through a specific API (WWW.LoadFromCacheOrDownload, CreateFromFile, etc.).
Resource memory usage: This mainly covers Mesh, AnimationClip, RenderTexture, and other resources. For Mesh resources that do not have the “Read/Write Enabled” option turned on, the memory is counted in the GFX memory used by the GPU. Once the option is turned on, the mesh data is also kept in Reserved Unity so that the project can edit and modify Mesh data in real time at runtime. Likewise, if the R&D team turns on the “Read/Write Enabled” option for texture resources (off by default), the texture data is also kept in Reserved Unity, resulting in a larger memory footprint.
(2) GFX memory
GFX memory is the amount of memory allocation reported by the underlying graphics driver, which also controls it. Generally speaking, this memory is mainly determined by the amount of rendering-related resources, including texture resources, Mesh resources, and the parts of Shader resources that are uploaded to the GPU, as well as the memory allocated by the libraries that parse these resources.
(3) Managed heap memory
Managed heap memory represents the amount of managed heap memory allocated by code while the project is running. For projects that use Mono for code compilation, the managed heap is mainly allocated and managed by Mono; for projects that use IL2CPP, it is mainly allocated and managed by Unity itself.
1.2 Memory Parameter Standard
Now that we understand the meaning of the various memory parameters, we know that the key to avoiding game crashes is to control the PSS memory peak. The bulk of PSS memory lies in the resource memory and Mono heap memory within Reserved Total. For projects using Lua, Lua memory also deserves attention.
According to UWA’s experience, the risk of crashes is low only when the PSS memory peak is kept below 0.5-0.6 times the total hardware memory. For example, on a 2GB device the PSS memory should be kept below 1GB, and on a 3GB device below 1.5GB.
For most projects, the PSS memory is about 200MB-300MB higher than the Reserved Total, so the Reserved Total of the 2GB device should be controlled below 700MB, and the 3GB device should be controlled below 1GB.
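The budget arithmetic above can be sketched as a small helper. This is only an illustration: the function name, the default ratio of 0.5, and the 300MB overhead are assumptions taken from the ranges quoted in the text, not a UWA tool or formula.

```python
def memory_budgets_mb(total_ram_mb, pss_ratio=0.5, pss_overhead_mb=300):
    """Return (pss_budget, reserved_total_budget) in MB for a device.

    pss_ratio: fraction of physical RAM the PSS peak should stay under (0.5-0.6 in the text).
    pss_overhead_mb: assumed gap between PSS and Reserved Total (200-300MB in the text).
    """
    pss_budget = total_ram_mb * pss_ratio
    reserved_budget = pss_budget - pss_overhead_mb
    return pss_budget, reserved_budget


for ram_gb in (2, 3):
    pss, reserved = memory_budgets_mb(ram_gb * 1024)
    print(f"{ram_gb}GB device: PSS < {pss:.0f}MB, Reserved Total < {reserved:.0f}MB")
```

With these assumptions the 2GB case lands close to the 700MB recommended above. For the 3GB case the plain subtraction gives a looser number than the 1GB recommended in the text, so the article's figure should be read as the more conservative target.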
UWA also believes that Mono heap memory needs particular attention. In many projects, beyond the problem of high residency itself or the risk of leaks, the size of the Mono heap also affects GC time. UWA considers it best to keep it below 80MB.
The following table lists the recommended standards for each kind of resource memory provided by UWA, which are fairly strict. Developers still need to adjust them to the actual situation of their own projects. For example, if a 2D project uses almost no mesh resources, the standards for other resources can be relaxed.
More detailed standards can be viewed directly in the UWA online product.
Once the memory standard has been set according to the actual situation of the project, it is generally necessary to negotiate further with the art and design teams, agree on reasonable art specification parameters, and write them into a document.
After the specification is set, regularly check whether all the art resources in the project meet the specification, and modify and update it in time. In the process of checking whether the art is compliant, you can use the callback function provided by Unity to write automation tools to improve efficiency. You can refer to “The Practice of Automating Standard Unity Resources”.
If the resources cannot be batch processed into high, medium, and low versions, the artist needs to create different resources for each image quality level.
The problems described in this section are not specific to one type of resource memory. To avoid repetition, they are discussed together here.
2.1 Suspected redundant resources
In the detailed resource list (hereinafter the resource list) of a UWA GOT Online Resource mode report, we often see that the peak quantity of a resource is greater than 1 and marked in red. The peak quantity is a very important indicator of resource usage: it is the maximum number of copies of the same resource present in the same frame. In theory it should not exceed 1. A resource whose peak quantity is greater than 1 is marked in red in the list and called a suspected redundant resource.
In general, this problem comes from how AssetBundle resources are built: some shared resources (such as Texture or Mesh) are packed into several different AssetBundle files without dependency packaging, so when these AssetBundles are loaded, multiple copies of the same resource appear in memory. This is resource redundancy, and it is recommended to detect and fix it strictly.
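As a rough illustration of the check described above, the sketch below looks for assets that were packed into more than one bundle. The bundle layout and file names are hypothetical, and the input is a plain dictionary rather than anything read from Unity or from a UWA report.

```python
from collections import defaultdict


def find_redundant_assets(bundle_contents):
    """Map each asset packed into more than one bundle to the bundles that contain it."""
    owners = defaultdict(list)
    for bundle, assets in bundle_contents.items():
        for asset in assets:
            owners[asset].append(bundle)
    return {asset: sorted(bundles) for asset, bundles in owners.items() if len(bundles) > 1}


# Hypothetical build layout: the shared atlas was copied into two bundles
# instead of being placed in its own bundle that both depend on.
bundles = {
    "hero.ab": ["hero.prefab", "shared_atlas.png"],
    "enemy.ab": ["enemy.prefab", "shared_atlas.png"],
    "ui.ab": ["main_menu.prefab"],
}
duplicates = find_redundant_assets(bundles)
print(duplicates)
```

Any asset reported here is a candidate for moving into a separate shared bundle so that it is loaded only once.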
2.2 Unnamed Resources
In the resource list, you may sometimes find resources named N/A. Generally, these are resources created with new in code and never given a name. It is recommended to name them through the .name property to facilitate resource statistics and management. N/A resources with serious redundancy or a large individual memory footprint deserve particular attention and strict checking.
2.3 Resident resources occupy a large amount of memory
Reading the resource list together with the life cycle curve of each resource, you may find that some large resources stay in memory once loaded and are not unloaded until the end of the test, which can lead to higher resource memory usage and a higher peak later in the game. It is recommended to check whether these resources really need to stay resident. If a resource is no longer needed, check why it was not unloaded when the scene was switched; for resources that persist for a long time within a single scene, consider unloading them manually.
Whether a resource should stay resident is a trade-off between memory pressure and CPU time. Put simply, if the memory pressure of the project is high and the CPU cost of scene switching is low, consider changing the cache strategy: unload the resources that the next scene does not use when switching scenes, and reload them when they are needed again.
3.1 Texture Format
Unreasonable texture format settings are usually one of the main reasons texture resources occupy a large amount of memory. Even in projects that have established art resource standards and changed texture formats in bulk, reports still often show a large number of textures in formats such as RGBA32, ARGB32, RGBA Half, and RGB24. Textures in these formats not only take up a lot of memory, but also lead to larger game packages, longer loading times, and higher texture bandwidth.
There are two main causes. First, the art naming is not standardized, so the texture is not picked up by the import callback function, or a texture created in code never has its format set. Second, the target format is not supported by the hardware or by the texture resource itself, so the texture is parsed as an uncompressed format.
In the first case, after finding the problematic resource in the resource list, you need to go back to the project and fix it manually. In the second case, the hardware-supported texture formats recommended by UWA are mainly ASTC and ETC2.
Among them, the ETC2 format requires the corresponding texture resolution to be a multiple of 4. When the corresponding texture has Mipmap enabled, it is strictly required that its resolution be a power of 2. Otherwise, the texture will be parsed into an uncompressed format.
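The ETC2 size rule stated above is easy to check automatically. The following is a minimal sketch of that rule only; the function names are made up for illustration, and it does not call any Unity API.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def etc2_compatible(width, height, mipmaps=False):
    """Apply the rule from the text: both sides must be multiples of 4,
    and must also be powers of 2 when Mipmap is enabled."""
    if width % 4 != 0 or height % 4 != 0:
        return False
    if mipmaps:
        return is_power_of_two(width) and is_power_of_two(height)
    return True


print(etc2_compatible(1000, 500))                 # multiple of 4, no Mipmap
print(etc2_compatible(1000, 500, mipmaps=True))   # not a power of 2
print(etc2_compatible(1023, 512))                 # not a multiple of 4
```

A check like this could run over the width and height columns of an exported resource list to flag textures that are likely to fall back to an uncompressed format.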
The resolution of a texture resource (the width and height parameters in the resource list) is another main reason for excessive memory usage. Generally speaking, the higher the resolution, the larger the memory footprint. Textures with large resolutions (typically ≥ 1024) matter most. On mobile platforms, players can hardly see the difference that very fine detail makes with the naked eye, so a high resolution often means unnecessary waste.
Using textures of different resolutions on different device tiers is a practical and easy-to-apply grading strategy, and this holds for atlas textures too. In particular, Unity provides a Variant function for SpriteAtlas, which can quickly copy an original atlas and reduce the resolution of the variant atlas according to the Scale parameter for use on lower tiers.
3.3 Read/Write Enabled
As mentioned above, the memory footprint of texture resources is calculated in GFX memory, that is, the part that is passed to the GPU. And the texture resource with the Read/Write Enabled option turned on will also reserve memory on the CPU side, which doubles the memory usage of the resource.
Both the UWA GOT Online Resource mode report resource list and the local resource detection report can directly show which textures have the Read/Write Enabled option enabled. In fact, resources that do not need to be modified at runtime do not need to enable the Read/Write Enabled option. Developers should investigate and disable unnecessary settings to reduce memory overhead.
When Mipmap is enabled for a texture, its memory usage increases to about 1.33 times that of the original data. For 3D objects such as terrain, scene objects, or characters, it is recommended to enable Mipmap on their textures, which reduces bandwidth at runtime. Note, however, that the Mipmap page of the on-device test report tracks, over the course of the game, the screen coverage of each Mipmap level for textures with Mipmap enabled. If 3D objects in the scene use the 1/2, or even the 1/4 or 1/8, Mipmap level over large areas, the texture resolution used by those objects is too high and is being wasted; lower-resolution textures can be used instead.
However, if it is a 2D project or UI interface resource, it is recommended to turn off the Mipmap function of the corresponding texture to avoid unnecessary memory overhead.
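To make the effect of format, Mipmap, and Read/Write Enabled concrete, here is a rough estimator. The bits-per-pixel values are the nominal rates of each format (ASTC blocks are always 128 bits); the dictionary keys are shorthand labels chosen for this sketch, not Unity enum names, and real textures may differ slightly because of block padding.

```python
# Nominal bits per pixel for a few formats mentioned in this section.
BITS_PER_PIXEL = {
    "RGBA32": 32,
    "RGB24": 24,
    "ETC2_RGBA8": 8,
    "ASTC_4x4": 128 / 16,
    "ASTC_6x6": 128 / 36,
}


def texture_memory_mb(width, height, fmt, mipmaps=False, read_write=False):
    """Estimate texture memory in MB using the multipliers quoted in the text."""
    size = width * height * BITS_PER_PIXEL[fmt] / 8
    if mipmaps:
        size *= 4 / 3   # the full Mipmap chain adds about one third
    if read_write:
        size *= 2       # an extra CPU-side copy doubles the footprint
    return size / (1024 * 1024)


for fmt in BITS_PER_PIXEL:
    print(fmt, round(texture_memory_mb(1024, 1024, fmt), 2), "MB")
```

Under these assumptions a 1024*1024 RGBA32 texture costs four times as much as the same texture in an 8-bit-per-pixel compressed format, before Mipmap or Read/Write Enabled are even considered.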
3.5 Anisotropic and Trilinear Filtering
Enabling anisotropic filtering on textures improves the display of surfaces such as the ground, but it increases GPU rendering bandwidth. The reason is that when a texture is sampled, the data is read from the cache first; on a miss, it has to be fetched from system memory, which is farther from the GPU, so more clock cycles are spent. With anisotropic filtering on, the number of sampling points increases, so the probability of a cache miss rises and bandwidth grows with it. Anisotropic filtering of texture resources can be turned off through scripts in the engine; for textures that need it, the number of samples can be set from 1 to 16 in the engine, and the value should be kept as low as possible.
When a texture is set to trilinear filtering, it is blended between adjacent Mipmap levels, and GPU rendering bandwidth increases compared with bilinear filtering. Trilinear interpolation uses 8 sampling points (bilinear uses 4), which also raises the probability of a cache miss and therefore bandwidth. Trilinear filtering should be avoided as much as possible.
3.6 Atlas Production
Poorly planned atlas production is another problem that often occurs in projects. Atlas textures with a high peak quantity sometimes appear in the resource list, but they are not necessarily redundant. In one common case, so many small images are packed into the same atlas that a texture at the maximum resolution set for the atlas (such as 2048*2048) cannot hold them all, and the resource generates additional texture pages to pack them. As long as the game depends on one small image in any page, the whole resource, that is, every page under it, is loaded into memory, which causes unnecessary waste. It is therefore generally recommended to keep an atlas within 2-3 pages.
Even when this relatively extreme situation does not occur, many projects still suffer from “one stroke affecting the whole body”: only one or a few small images in the atlas are used, yet the entire texture, which occupies a large amount of memory, is loaded.
For this reason, when building atlases it is very important to group the small images strictly by usage scenario and category. Choosing an appropriate resolution, so that the texture is not left partly empty and wasted, is also something developers need to pay attention to.
3.7 The case of using TextMeshPro
TextMeshPro can provide better performance and convenient functions for UI components, making it favored by many developers. But the TMP font atlas texture (texture with SDF Atlas in the name, in Alpha 8 format) generated using TMP also has some pitfalls worth noting.
(1) Sometimes the font resource list shows that a .ttf font file corresponding to a TMP atlas texture is in memory. This indicates that the TMP font atlas is a dynamic font. Once development is complete and the characters used in the game have been added to the atlas texture of the dynamic font, consider switching the dynamic TMP font to a static one and removing the dependency on the .ttf file. The corresponding font resource will then no longer appear in memory. However, this method is not recommended if the font is also used for user input.
(2) The resolution of the font atlas texture is large. In this case, check in the engine whether the characters actually fill the atlas texture and whether the texture was generated sensibly. For dynamic TMP, if the atlas is not full, for example if less than 3/4 of the texture is occupied, consider turning on the Multi Atlas Textures option and setting a smaller texture size. For example, one 4096*4096 texture can become three 2048*2048 textures, saving 32MB-3*8MB=8MB of space.
(3) There are TMP-related resources (LiberationSans SDF Atlas, EmojiOne) in the resource list. They are the default settings of TMP. You can remove the dependence on these default resources in Project Settings-TextMesh Pro Settings, and they will not appear in memory.
Since Multi Atlas Textures is an option of dynamic TMP, (1) and (2) cannot be used at the same time, and can be selected according to the actual situation of the project.
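The saving quoted in (2) can be reproduced with a few lines of arithmetic. The 32MB and 8MB figures in the text correspond to 2 bytes per pixel; the sketch below simply assumes that rate (one plausible reading is a 1-byte Alpha 8 texture that also keeps a CPU-side copy, but that interpretation is an assumption, not something the text states).

```python
def atlas_mb(size, bytes_per_pixel=2):
    """Memory of a square atlas in MB; bytes_per_pixel=2 is assumed to match the text's figures."""
    return size * size * bytes_per_pixel / (1024 * 1024)


single = atlas_mb(4096)        # one 4096*4096 atlas
split = 3 * atlas_mb(2048)     # three 2048*2048 atlases
saved = single - split
print(f"single: {single}MB, split: {split}MB, saved: {saved}MB")
```

The split only pays off when the original atlas is at most about three quarters full; a fourth 2048*2048 page would bring the total back to the size of the single 4096*4096 texture.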
4.1 Number of Vertices and Triangles
Mesh resources with too many vertices and triangles not only use a lot of memory but are also unfavorable for culling. For these meshes, one option is to simplify the mesh, reduce the number of vertices and triangles, and make a low-poly version for grading on low-end and mid-range devices. For complex terrain and buildings, another option is to split them into several repeatable small meshes and reassemble them. As long as batching is handled well, the number of triangles rendered on screen can be reduced at the cost of a little culling time.
4.2 Vertex Attributes
If there is no unified art asset standard and meshes are not processed on import, there is a high chance that the meshes in the project contain a lot of “redundant” vertex data, meaning data that the Shader does not need during rendering. For example, if the mesh data contains Position, UV, Normal, Color, and Tangent, but the Shader used for rendering only needs Position, UV, and Normal, then Color and Tangent are “redundant” data and waste memory. Note also that if even one small mesh carries an extra vertex attribute, the Combined Mesh it is merged into will carry that attribute as well.
A relatively simple remedy is to turn on the “Optimize Mesh Data” option, located in Other Settings of Player Settings. When it is checked, the engine traverses all mesh data at build time and removes the “redundant” data, reducing its size. However, the R&D team should pay extra attention to meshes whose Material is changed at runtime. If a GameObject needs to switch to a more complex material at runtime that reads more vertex attributes, it is recommended to mount these materials on the corresponding Prefab before building, so that the engine does not remove mesh data that will be used at runtime.
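The waste from unused vertex attributes can be estimated per mesh. The byte sizes below are typical uncompressed layouts (three or four 32-bit floats, a 32-bit color, a two-float UV) and are assumptions for illustration; actual sizes depend on import settings and vertex compression.

```python
# Assumed bytes per vertex for each attribute (typical uncompressed layouts).
ATTRIBUTE_BYTES = {"Position": 12, "Normal": 12, "Tangent": 16, "Color": 4, "UV": 8}


def wasted_bytes(vertex_count, mesh_attributes, shader_attributes):
    """Bytes spent on attributes the mesh stores but the Shader never reads."""
    unused = [a for a in mesh_attributes if a not in shader_attributes]
    return vertex_count * sum(ATTRIBUTE_BYTES[a] for a in unused)


mesh = ["Position", "UV", "Normal", "Color", "Tangent"]
shader = ["Position", "UV", "Normal"]
wasted = wasted_bytes(10_000, mesh, shader)
print(f"{wasted} bytes wasted on Color and Tangent")
```

In this example 20 of the 52 bytes stored per vertex are never read, which is the kind of saving “Optimize Mesh Data” is meant to recover.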
4.3 Read/Write Enabled
In the resource list, a large number of mesh resources whose vertex attributes are not displayed as -1 (or “-“) are often counted. Vertex attribute information can only be collected by the UWA report when Read/Write is enabled for the mesh resource. At this point, the vertex attribute is not displayed as -1, and the memory usage of the mesh will increase. Generally speaking, meshes that do not need to be modified on the CPU side do not need to enable Read/Write. The Read/Write properties of these meshes can be modified through the API in the editor, or directly in the Inspector window for meshes in FBX.
Generally speaking, animation resources with a memory footprint greater than 200KB and short duration can be considered as animation resources with a large memory footprint, and there is a certain room for optimization. The optimization methods for animation resources are:
(1) Change the Animation Type to Generic. Unlike the Legacy type, Generic uses Mecanim, the newer Unity animation system, and its overall performance is much better; using the old animation system is generally not recommended. The third type, Humanoid, is also provided by the new animation system as a special workflow for humanoid characters. Its advantage is flexible reuse, but it places requirements on the bones of the model (that is, humanoid bones), so choose according to the needs of the project.
(2) Change Anim. Compression to Optimal. Optimal lets Unity automatically choose the best curve representation among several algorithms, so it occupies the least storage space. Keyframe Reduction is a more stable and conservative algorithm and is less likely to affect how the animation looks.
(3) Turn off the Resample Curves option. The official documentation says that enabling this option improves performance, but according to “Practice of Automation Specification of Unity Resources”, that improvement shows up in playback rather than loading and is minimal, while the option may cause animations to play incorrectly. Combined with experimental data, it is therefore recommended to turn this option off in most cases.
(4) Consider using the API to strip the Scale curves of animation resources and to reduce the precision of the animation data. For the practice of reducing animation precision, refer to “Unity Animation File Optimization Research”.
All four methods can effectively reduce the memory usage of animation resources, but (2) and (4) in theory lose some animation accuracy, although the loss may not be noticeable. It is recommended that the R&D team test them and optimize memory usage as far as possible while making sure the animations still look correct.
For BGM with long duration and some conventional audio resources with short duration but large memory, there is a certain room for optimization. The optimization methods for audio resources are:
(1) Turn on Force To Mono. The audio is then mixed down to mono rather than simply losing one channel, which greatly reduces audio memory with little impact on how it sounds.
(2) Modify its Load Type to Compressed In Memory or Streaming. Compressed In Memory is suitable for most regular audio, while Streaming is suitable for background music that is often long and memory-intensive.
(3) For the audio of Compressed In Memory, modify its compression format (Compression Format) to a format with a higher compression rate, such as Vorbis, MP3;
(4) For audio in Vorbis and MP3 compression formats, you can continue to lower its Quality parameter to further compress its memory.
All of the above methods can effectively reduce the audio resource memory (Streaming can be stably reduced to about 200KB), but it will cause a certain time cost or reduce the sound quality, which can be selected as appropriate.
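To see why long BGM is expensive when it is held in memory as uncompressed PCM, and what Force To Mono saves in that case, consider the following sketch. The sample rate, bit depth, and three-minute duration are assumed example values, and the function is illustrative arithmetic rather than a Unity API.

```python
def pcm_bytes(duration_s, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Size of uncompressed 16-bit PCM audio held in memory."""
    return int(duration_s * sample_rate * channels * bytes_per_sample)


stereo = pcm_bytes(180)              # a 3-minute stereo track
mono = pcm_bytes(180, channels=1)    # the same track after Force To Mono
print(f"stereo: {stereo / 1024 / 1024:.1f}MB, mono: {mono / 1024 / 1024:.1f}MB")
```

A track of this length costs roughly 30MB as stereo PCM, which is why the text recommends Streaming for background music and Compressed In Memory for most other clips.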
Material resources themselves occupy little memory, so we generally pay more attention to optimizing their number, because too many of them increase the time cost of the Resources.UnloadUnusedAssets API, which will be mentioned later.
Too many material resources usually means too many redundant Material resources of the Instance type. Generally speaking, this happens because code accesses and modifies parameters through MeshRenderer.material, so the Unity engine instantiates a new Material to achieve the effect, creating redundancy in memory. It is recommended to use MaterialPropertyBlock instead. For specific operations and examples, see the article “Using MaterialPropertyBlock to Replace Material Property Operations”. However, this method is not applicable under URP because it breaks the SRP Batcher. Suspected redundancy among non-Instance material resources also needs to be watched and optimized; it is handled as described earlier and is not repeated here.
Beyond their number, material resources often involve problems related to texture sampling and Shader use that waste extra memory and GPU performance. The more noteworthy ones have been added as detection rules to the UWA local resource detection report.
For materials that sample a solid-color texture, the texture sample can be replaced with a color parameter, saving one texture sample. For materials with an empty texture slot, Unity samples a built-in texture instead, but the result is a constant, so the sample is still wasted. For materials that contain unused texture samples, Unity keeps the textures saved on the material: even if the Shader is replaced, the textures it originally depended on are not removed. This can create false dependencies that bring unneeded textures into the package and waste memory.
8.1 Rendering Resolution
Some RT resources in the resource list reflect the current rendering resolution of the project. For projects with high pressure on the GPU and rendering modules, reducing the rendering resolution on low-end and mid-range devices is an intuitive and effective grading strategy. Low-end devices can generally avoid the native device resolution and render at 0.8-0.9 times it; many teams even choose 0.7 times or 720P.
If the resolution of some other RT resources is too high, attention should also be paid, especially the resources with a resolution higher than 2048*2048. It should be investigated whether it is necessary to use such a fine RT and consider the effect of lower resolution on low-end machines.
The resource list shows the AA (anti-aliasing) multiple of each RT resource. Enabling multi-sample AA multiplies RT memory usage and puts pressure on the GPU. It is recommended to check whether AA is really necessary; on mid-range and low-end devices in particular, consider turning it off.
Note that 2x AA does not take effect on some Huawei device models: the performance cost is incurred, but no anti-aliasing is actually achieved.
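The combined effect of rendering resolution and AA on RT memory can be approximated as follows. This sketch assumes a 4-byte-per-pixel color format and that memory scales linearly with the AA sample count; both are simplifications, and real drivers may allocate additional resolve or depth buffers.

```python
def render_texture_mb(width, height, scale=1.0, bytes_per_pixel=4, msaa_samples=1):
    """Approximate memory of one color RT in MB under the stated assumptions."""
    w, h = round(width * scale), round(height * scale)
    return w * h * bytes_per_pixel * msaa_samples / (1024 * 1024)


full = render_texture_mb(1920, 1080)
scaled = render_texture_mb(1920, 1080, scale=0.8)
msaa4 = render_texture_mb(1920, 1080, msaa_samples=4)
print(f"native: {full:.2f}MB, 0.8x: {scaled:.2f}MB, native with 4x AA: {msaa4:.2f}MB")
```

Because the scale applies to both width and height, rendering at 0.8 times the native resolution leaves only 64% of the pixels, while 4x AA under this model costs four times the memory of the same RT without AA.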
Some common post-processing RTs (such as Bloom and Blur) start sampling at 1/2 of the rendering resolution. Consider starting at 1/4 and reducing the number of downsampling passes, which saves memory and reduces the rendering pressure caused by post-processing.
From the perspective of performance optimization, it is even better to completely turn off all kinds of post-processing on low-end and mid-range models. More details about the common post-processing effects will be further discussed in the GPU section below.
8.4 RT under URP
When using URP, two additional RT resources, _CameraColorTexture and _CameraDepthAttachment, are added to memory as rendering targets. When the CopyDepth and CopyColor settings of the URP camera are turned on, _CameraDepthTexture and _CameraOpaqueTexture are also generated as intermediate RTs. When these two RTs appear in the resource list, check whether CopyDepth and CopyColor are really used; if not, turn them off to avoid unnecessary waste.
Unity 2019.4.20 is a turning point in how Shader memory is counted. Before this version, Shader memory was mainly counted under ShaderLab; from this version on, it is mainly counted on the Shader resource itself.
For projects on versions earlier than Unity 2019.4.20, viewing ShaderLab memory requires Take Sample in the Unity Profiler. Whether it is the Shader resource itself or ShaderLab whose memory usage is too high, the remedy is to control the number of Shaders and the number of variants.
9.2 Number of variants
A large number of variants is the main reason a Shader resource occupies too much memory and too much package size. Over the iterations of a project, some keywords may have been deprecated or never actually used, which multiplies the number of variants; or the Shader is complex and some keyword combinations will never be used, which makes many variants redundant.
For these situations, Unity provides a callback function that can strip the variants related to unused keywords or keyword combinations when AssetBundles or the project are built.
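The way variants multiply can be shown with a short calculation: each group of mutually exclusive keywords contributes one choice per variant, so the total is the product of the group sizes. The keyword names below are hypothetical, and the function only counts combinations; it does not model how Unity actually strips variants.

```python
from math import prod


def variant_count(keyword_groups):
    """Number of variants generated by independent groups of mutually exclusive keywords."""
    return prod(len(group) for group in keyword_groups)


# Hypothetical keyword groups; "_" stands for the keyword-off option.
groups = [
    ["_", "OLD_FOG_ON"],                         # deprecated feature that was never removed
    ["_", "SOFT_SHADOWS_ON"],
    ["QUALITY_LOW", "QUALITY_MID", "QUALITY_HIGH"],
]
print(variant_count(groups))        # with the deprecated keyword group
print(variant_count(groups[1:]))    # after removing it
```

Removing a single two-option group halves the variant count, which is why cleaning up deprecated keywords is usually the first step before stripping specific combinations.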
Shader redundancy needs special attention. It not only increases memory but may also cause repeated parsing, that is, unnecessary and time-consuming Shader.Parse and Shader.CreateGPUProgram calls at runtime.
9.4 Standard Shader
Look for Standard and ParticleSystem/Standard Unlit in the resource list. These two Shaders have a large number of variants, so their loading time is very high and their memory usage is large; using them directly in the project is not recommended. They usually appear because some 3D objects in imported FBX models, or some 3D objects generated by Unity, use the built-in Default Material and therefore depend on the Standard Shader. It is recommended to investigate and simplify.
If the memory of a single font resource occupies more than 10MB, it can be considered that the memory of the font resource is too large. Consider using the FontPruner font reduction tool or other font reduction tools to slim down fonts and reduce memory usage.
We also need to watch for an excessive number of fonts in the project. Each Font corresponds to a Font Texture, so more font resources also mean more Font Textures and therefore more memory.
Combining the resource list with the particle system curve, the number of particles in the memory of many projects will be much higher than the actual number of particles playing.
In this case, on the one hand, check whether there are particle resources that were deprecated during iteration but never deleted, or components that were tested during production but never released; on the other hand, check whether unnecessary particles are being cached.
Mono Heap Memory
The UWA GOT Online Mono mode report provides two main functions, detailed heap memory allocation and heap memory leak analysis, to help developers analyze heap memory problems in their projects.
12.1 Sustain/Peak Allocation Stack
On the detailed heap memory allocation page, you can inspect the call stack of the functions that allocate the most heap memory. We mainly focus on two forms of heap memory allocation.
The first is a single high heap allocation. This kind of peak usually comes from the large allocation caused by loading configuration tables early in the game, and the developer needs to check whether it is reasonable based on the specific stack information. A heap allocation peak that appears later, while the game is running, deserves attention.
The second is persistently high heap memory allocation. If the project allocates a significant amount of heap memory every frame or every few frames, it should be investigated. Continuous high heap memory allocation increases GC frequency, which causes frequent freezes in the game. You can use the stack to check which child nodes are continuously allocating heap memory.
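The two patterns can be told apart mechanically from per-frame allocation data. The sketch below is only an illustration in Python: the thresholds, the minimum run length, and the sample numbers are assumptions chosen for the example, not values used by UWA.

```python
def find_peaks(alloc_per_frame, peak_bytes):
    """Frame indices whose single-frame allocation reaches peak_bytes."""
    return [i for i, b in enumerate(alloc_per_frame) if b >= peak_bytes]


def find_sustained_runs(alloc_per_frame, sustained_bytes, min_frames):
    """(start, end) frame ranges, end exclusive, in which every frame allocates
    at least sustained_bytes, lasting min_frames frames or more."""
    runs, start = [], None
    # The trailing -1 is a sentinel that closes a run reaching the last frame.
    for i, b in enumerate(list(alloc_per_frame) + [-1]):
        if b >= sustained_bytes and start is None:
            start = i
        elif b < sustained_bytes and start is not None:
            if i - start >= min_frames:
                runs.append((start, i))
            start = None
    return runs


# Made-up bytes allocated per frame: a load-time spike, then steady allocation.
alloc = [2_000_000, 100, 100, 5_000, 6_000, 5_500, 7_000, 100]
print(find_peaks(alloc, peak_bytes=1_000_000))
print(find_sustained_runs(alloc, sustained_bytes=4_000, min_frames=3))
```

With this sample, frame 0 is reported as a peak and frames 3 to 6 as a sustained run, which are the two cases worth inspecting in the stack.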
12.2 Leakage Analysis
On the leak analysis page, you can check how much heap memory each function in the project keeps resident. Select two sampling frames in the chart, one earlier and one later, to compare them; the stack then shows how resident heap memory changed and which allocations account for most of the increase.
This analysis is useful in two ways. On the one hand, it helps avoid the leak risk that comes with continuously rising heap memory; on the other hand, optimizing functions with high resident memory and releasing that memory in time reduces the cost of a single GC. We generally recommend running the GOT Online Mono mode test for as long as possible, for example 1 hour; otherwise leaks are often hard to expose.
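The two-frame comparison amounts to a per-function difference between two snapshots. A minimal sketch in Python, assuming each snapshot has been exported as a mapping from function name to resident bytes (the export format, function names, and numbers here are hypothetical):

```python
def residency_growth(before, after, top=3):
    """Return the functions whose resident heap grew most between two snapshots.

    before/after map a function name to resident heap bytes at that sample.
    """
    growth = {
        name: after.get(name, 0) - before.get(name, 0)
        for name in set(before) | set(after)
    }
    grown = [(name, delta) for name, delta in growth.items() if delta > 0]
    # Largest growth first; ties broken by name for a stable order.
    grown.sort(key=lambda item: (-item[1], item[0]))
    return grown[:top]


# Made-up sample data for illustration only.
snapshot_early = {"ConfigLoader.Load": 4_000_000, "UIManager.Open": 1_200_000}
snapshot_late = {
    "ConfigLoader.Load": 4_000_000,
    "UIManager.Open": 3_700_000,
    "ChatLog.Append": 900_000,
}

for name, delta in residency_growth(snapshot_early, snapshot_late):
    print(f"{name}: +{delta / 1_000_000:.1f} MB")
```

Functions that keep growing across several such comparisons over a long test are the ones to examine first.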
UWA GOT Online Lua mode provides performance tests for the Lua scripting language.
The function names that appear here use the format “function name@filename:line number”.
The Lua file name, line number, and function name provided by the report can be used to locate CPU bottleneck functions and the specific cause of CPU time peaks. Lua functions are named in the format X@Y:Z, where X is the function name (if it cannot be obtained, X defaults to unknown), Y is the file in which the function is defined, and Z is the line number at which the function is defined. Note that when a Lua script runs as bytecode, this value is always 0, so it is recommended to test with Lua source code whenever possible.
For memory allocated by Lua, the line graph in the report uses the maximum value within every 30 frames as a data point. The trend of the line chart gives developers a general picture of heap memory allocation while the project is running; a drop in heap memory means that a GC has occurred. The detailed memory allocation and leak analysis functions in the Lua report are similar to those in the Mono mode report.
Another important feature of the Lua mode report is Mono Object Reference Statistics.
In principle, an object pool is maintained in the Unity Mono virtual machine to link Unity Objects with Lua objects. When a Unity Object in the scene is Destroyed, it no longer exists in the scene, but because the Lua layer still holds a userdata reference to it, the object pool cannot release the Unity Object. If the object references Texture, Mesh, or other resources, this causes a leak. The related references in the Lua layer then need to be set to nil; once they are dereferenced, the Unity Object can be reclaimed at the next GC. The purpose of this feature is to help developers track down such leak risks.
The report provides a Mono object reference histogram, in which the black part represents the number of objects that have not been Destroyed. Due to the influence of Lua-side GC, some Destroyed objects will always be present. What matters is whether their number stays stable; if it keeps rising, it needs attention.
After selecting a frame in the histogram, a list of Mono object types for that frame is displayed:
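If report entries are post-processed in scripts, the X@Y:Z format is easy to split. The Python sketch below is an assumption about how one might do it, not part of the UWA tooling; the sample labels are invented.

```python
from typing import NamedTuple


class LuaFunctionRef(NamedTuple):
    function: str  # X: function name, or "unknown" if it could not be obtained
    file: str      # Y: file in which the function is defined
    line: int      # Z: line number of the definition (0 when run as bytecode)


def parse_lua_function(label: str) -> LuaFunctionRef:
    """Split 'X@Y:Z' into function name, file, and line number."""
    function, _, location = label.partition("@")
    # Split on the last colon so that a colon inside the path does not break parsing.
    file, _, line = location.rpartition(":")
    return LuaFunctionRef(function, file, int(line))


ref = parse_lua_function("update@Assets/Lua/battle.lua:42")
print(ref.function, ref.file, ref.line)
```

A parsed line number of 0 is a hint that the script was run as bytecode, in which case the test should be repeated with Lua source.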
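The way the line graph is built can be sketched as follows. This is a Python illustration with made-up per-frame values, assuming the simplest reading of the description above: one data point per 30-frame window, and a GC wherever a point is lower than the previous one.

```python
def downsample_max(per_frame, window=30):
    """One data point per window: the maximum value seen in that window."""
    return [max(per_frame[i:i + window]) for i in range(0, len(per_frame), window)]


def gc_points(points):
    """Indices of data points lower than the previous one (a likely GC)."""
    return [i for i in range(1, len(points)) if points[i] < points[i - 1]]


# Made-up per-frame Lua heap sizes in KB: steady growth, then a collection.
frames = list(range(1000, 1060)) + list(range(400, 430))
points = downsample_max(frames)
print(points)             # [1029, 1059, 429]
print(gc_points(points))  # [2]
```

Because each point is a window maximum, a short dip inside a window is not visible on the chart; only drops that persist across windows show up.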
- Object Type: the specific type of the Unity Object;
- The Number of Objects: the number of objects of this type;
- The Number of Destroyed Objects: the number of objects of this type that have been destroyed while the Lua layer still holds references to them. Pay attention to this number; if it is large, there is a risk of leaking C# heap memory.
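Checking whether the Destroyed count is stable or keeps rising can be automated once the per-frame lists are collected. The Python sketch below assumes the lists have been gathered as one dictionary per sampled frame; the data layout, the `min_growth` parameter, and the sample values are hypothetical.

```python
def rising_destroyed_types(samples, min_growth=1):
    """Types whose Destroyed-but-referenced count never drops and grows overall.

    samples is a list of {object type: destroyed count} dicts, oldest first.
    """
    suspects = []
    all_types = sorted({t for sample in samples for t in sample})
    for obj_type in all_types:
        counts = [sample.get(obj_type, 0) for sample in samples]
        never_drops = all(b >= a for a, b in zip(counts, counts[1:]))
        if never_drops and counts[-1] - counts[0] >= min_growth:
            suspects.append(obj_type)
    return suspects


# Made-up counts: GameObject fluctuates with Lua GC, Texture2D only rises.
samples = [
    {"GameObject": 12, "Texture2D": 3},
    {"GameObject": 9, "Texture2D": 7},
    {"GameObject": 11, "Texture2D": 15},
]
print(rising_destroyed_types(samples))  # ['Texture2D']
```

A type that fluctuates is being reclaimed by the Lua-side GC; a type that only rises points to references in Lua that are never set to nil.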
13.2 Plugins and Third-Party Libraries
Plug-ins and third-party libraries such as Wwise are quite common, but their memory usage generally cannot be measured quantitatively and intuitively at runtime. In general, however, they do not occupy much memory. Only if a large gap between PSS memory and Reserved Total remains after the memory optimization points above have been checked should you turn to the documentation of the plug-in or third-party library, or to the methods provided by its developer, to optimize it in a targeted manner, or even consider an alternative with better performance.
That’s all for today’s sharing. Of course, life is finite but knowledge is boundless. Over a long development cycle, the problems you see here may be just the tip of the iceberg. We have prepared more technical topics on the UWA Q&A website, waiting for you to explore and share. Everyone who loves to make progress is welcome to join us. Maybe your method can solve someone else’s urgent problem, and stones from other hills may serve to polish your own jade.
YOU MAY ALSO LIKE!!!
UWA Website: https://en.uwa4d.com
UWA Blogs: https://blog.en.uwa4d.com
UWA Product: https://en.uwa4d.com/feature/got