Today we continue our series on Unity performance optimization. This article focuses on optimizing the particle system.
The impact of the particle system on both the CPU and the GPU should not be underestimated. As more heavyweight, AAA-style game projects emerge, players have become more demanding, gameplay has become more complex, and visual effects have become more elaborate. So we need to treat the particle system more carefully.
What impact does the particle system have on the CPU?
In the UWA particle module report, the following parameters need our attention:
“ParticleSystem.Update”: The average CPU time for particle system update;
“ParticleSystem.Draw”: The average CPU time per frame that the particle system spends submitting Draw Calls;
“ParticleSystem.ScheduleGeometryJobs”: This function schedules the multi-threaded update tasks of the particle system. Generally speaking, the larger the value, the more particle systems are playing in the project.
In UWA’s experience, the main influencing factors are as follows:
- Number of Particle Systems
In the UWA performance briefing, search (Ctrl+F) for “Number of particle systems” and two test results will appear:
(1) The number of particle systems. UWA recommends no more than 600 (on devices with 1 GB of memory). This is the total number of ParticleSystems in memory, including those being played and those held in the buffer pool.
(2) The number of particle systems Playing. This is the number of ParticleSystem components currently being played, both on-screen and off-screen. We recommend that the peak in any single frame does not exceed 50 (on devices with 1 GB of memory).
So how do you check which particle systems are cached while the project is running, and whether the Playing particle systems are reasonable? One trick is to look at the report’s Specific Resource Information – Particle System page.
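The two recommendations above can be turned into a simple automated check. The following is a minimal, engine-agnostic sketch in Python rather than a Unity script; `FrameSample` and `find_budget_violations` are hypothetical names, and the per-frame counts are assumed to come from whatever profiling export your project uses.

```python
from dataclasses import dataclass

# Budgets quoted in the text for the 1G device tier.
MAX_TOTAL_SYSTEMS = 600    # all ParticleSystems in memory
MAX_PLAYING_SYSTEMS = 50   # ParticleSystems playing in one frame


@dataclass
class FrameSample:
    """Hypothetical per-frame sample of particle system counts."""
    frame: int
    total: int     # playing plus those held in the buffer pool
    playing: int   # playing, on-screen or off-screen


def find_budget_violations(samples):
    """Return (frame, metric, value) for every count that exceeds its budget."""
    violations = []
    for s in samples:
        if s.total > MAX_TOTAL_SYSTEMS:
            violations.append((s.frame, "total", s.total))
        if s.playing > MAX_PLAYING_SYSTEMS:
            violations.append((s.frame, "playing", s.playing))
    return violations
```

Frames flagged this way are the natural candidates for a closer look at which systems are cached or playing.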
As shown in the figure above, the blue line represents all particle systems loaded into memory, the purple line represents those suspected to be redundant, and the yellow line represents the particle systems that are actually playing. Following the timeline of the game session, you can pick the screenshot of a particular frame, especially one where the numbers are high, and then turn on the [Selected Frame] mode to view all ParticleSystems and all Playing ParticleSystems in that frame.
Given these two metrics, optimization and analysis can focus on two points:
(1) Check whether the peak number of particle systems (the blue curve) is too high. Select a peak frame to see which particle systems are cached, whether they are all reasonable, and whether caching is excessive;
(2) Check whether the peak number of particle systems Playing (the yellow curve) is too high. Select a peak frame to see which particle systems are playing, whether they are all reasonable, and whether some production optimizations can be made (see below for details).
- About ParticleSystem.Prewarm
The important performance parameters in the UWA report include a function called ParticleSystem.Prewarm. Its presence means that a particle system in the current frame has the “Prewarm” option turned on. When such a particle system is instantiated in the scene, or switched from inactive to active, it immediately executes a complete simulation. Take a flame as an example: with Prewarm turned on, you see the “big fire” in the first frame after loading, instead of a “small spark” that grows gradually.
However, Prewarm usually takes a noticeable amount of time. We recommend turning it off when it is not needed.
What impact does the particle system have on the GPU?
Serious problems in the particle system also affect GPU performance. We can use the Overdraw data in UWA’s Real Device Testing report to locate the issue. Generally, we recommend that Overdraw not exceed 5 on low-end and mid-range devices.
By combining the Overdraw trend chart with the corresponding game screenshots, we can check whether particle effects are too large or overlap too much.
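To make the Overdraw figure concrete, here is a minimal sketch in Python. It assumes Overdraw is measured as the average number of times each rendered pixel is drawn in a frame; the grid of per-pixel draw counts and the function names are hypothetical, and this is not how UWA necessarily computes its value.

```python
# Recommendation from the text for low-end and mid-range devices.
OVERDRAW_LIMIT = 5.0


def average_overdraw(draw_counts):
    """Average draws per pixel, counting only pixels drawn at least once.

    draw_counts is a hypothetical 2D list where draw_counts[y][x] is the
    number of times pixel (x, y) was drawn during the frame.
    """
    touched = [c for row in draw_counts for c in row if c > 0]
    if not touched:
        return 0.0
    return sum(touched) / len(touched)


def frames_over_limit(frames, limit=OVERDRAW_LIMIT):
    """Return the indices of frames whose average overdraw exceeds the limit."""
    return [i for i, f in enumerate(frames) if average_overdraw(f) > limit]
```

Large or heavily stacked effects raise the per-pixel counts, which is why the flagged frames are worth comparing against their screenshots.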
Here are some common optimization ideas:
For low-end devices, reduce the complexity and screen coverage of the particle system as much as possible, which lowers rendering overhead and improves smoothness. The specific approaches are as follows:
(1) On low-end and mid-range devices, reduce the number of particle systems and the number of particles on screen at the same time, for example by displaying only “critical” particle effects or those cast by the player’s own character, so as to reduce the CPU overhead of Update;
(2) Disable particle systems that are far from the current field of view or the current camera, and enable them again when they come close, so as to avoid unnecessary Update overhead;
(3) Minimize the screen area covered by particle effects. The larger the covered area and the more layers are stacked, the greater the rendering overhead.
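The first two approaches amount to a per-effect decision that can be sketched as follows. This is an illustrative Python sketch, not Unity code: `Effect`, `should_play`, and the 30-unit `max_distance` default are all hypothetical, and a real project would apply the result by stopping the effect or deactivating its GameObject.

```python
import math
from dataclasses import dataclass


@dataclass
class Effect:
    """Hypothetical description of one particle effect instance."""
    name: str
    position: tuple              # world position (x, y, z)
    critical: bool = False       # must be shown even on low-end devices
    own_character: bool = False  # cast by the player's own character


def should_play(effect, camera_pos, low_end, max_distance=30.0):
    """Decide whether an effect should be simulated this frame."""
    # Approach (2): skip effects that are far from the camera.
    if math.dist(effect.position, camera_pos) > max_distance:
        return False
    # Approach (1): on low-end devices keep only essential effects.
    if low_end:
        return effect.critical or effect.own_character
    return True
```

Putting the distance test first means a far-away effect is skipped on every device tier, including critical ones; whether that is acceptable is a design choice for each project.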
How to Standardize your Particle System?
Most teams apply the optimizations described above in the middle and late stages of a project. However, without comprehensive, systematic detection and prediction of the performance pressure caused by particle effects, this is a risk. Teams therefore need sound art standards and checking methods during daily development to ensure smooth final performance on real devices.
- Static resource checking is a monitoring service for resource specifications launched by UWA. During daily development, art resources such as textures, meshes, animation resources, and particle effects can be checked one by one, as shown in the figure below.
The detection principle for particle effects is essentially similar to that of ParticleEffectProfiler. The scan is rendered in the Game View, so the detection results are displayed intuitively, which is very friendly to developers.
The detection rules in the report are as follows; they will be refined and extended over time.
(1) The average Overdraw rate during special effects playback is too high
Count the average Overdraw of the pixels participating in the rendering of each frame, and take the highest value in the process. The larger the value, the higher the possibility that special effects will cause GPU pressure, and it is recommended to check it.
(2) DrawCall peak value is too high during special effects playback
Count the number of DrawCalls in each frame and take the highest value in the process. It needs to be checked when the value is high.
(3) The total texture memory of special effects is too large
Count the total memory of the textures included in the special effect. A high value may mean that textures are overused and should be checked.
(4) There are too many ParticleSystem components when the special effect is running
Count the number of ParticleSystem components included in the special effect. A high value tends to raise rendering-related indicators and lengthen serialization time, so it should be checked.
(5) The maximum number of particles during special effects playback is too large
Count the total number of particles in each frame and take the highest value in the process. The larger the value, the higher the update overhead of special effects may be.
(6) The total number of texture maps in the special effects is too large
Count the total number of textures included in the special effect. A high value tends to raise rendering-related indicators and should be checked.
(7) ParticleSystem with Collision or Trigger enabled
We recommend disabling the Collision and Trigger functions of the particle system; otherwise the physics overhead will be higher.
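Several of the rules above share the same pattern: sample a metric every frame, take the peak over the playback, and compare it against a limit. The sketch below illustrates that pattern in Python. The threshold values are placeholders chosen for illustration, not UWA’s actual limits, and `peak_metrics` and `check_effect` are hypothetical names.

```python
# Placeholder limits for illustration only, not UWA's actual values.
THRESHOLDS = {
    "overdraw": 4.0,    # rule (1)
    "draw_calls": 10,   # rule (2)
    "particles": 200,   # rule (5)
}


def peak_metrics(frames):
    """Take the highest per-frame value of each metric during playback.

    frames must be a non-empty list of dicts keyed like THRESHOLDS.
    """
    return {key: max(f[key] for f in frames) for key in THRESHOLDS}


def check_effect(frames, uses_collision_or_trigger=False):
    """Return the names of the rules that this effect fails."""
    peaks = peak_metrics(frames)
    failed = [key for key, limit in THRESHOLDS.items() if peaks[key] > limit]
    if uses_collision_or_trigger:   # rule (7)
        failed.append("collision_or_trigger")
    return failed
```

An empty result means the effect passed every rule covered by the sketch; texture memory and component counts could be added as further entries in the same way.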
- Monitor skill special effects and optimize the issues one by one using the CPU Time function of GOT Online
The development team can fix the size and position of skill effects through the camera and play the skill effects in sequence on a real device, which makes it possible to run the test automatically. From the GPU time reported on the device, you can immediately see which skills and special effects put heavy pressure on the GPU across different tiers of real devices.
The same applies to Draw Calls and Triangles, so bottlenecks can be found quickly.
Some teams go a step further and also measure the instantiation, activation, and deactivation of special effects, so that they know which skill effects will cause performance hazards at runtime.
Multiple teams now run this process regularly, and the feedback has been very positive.
The above are some of the issues and corresponding methods to keep in mind when optimizing the particle system. How to apply them depends on the actual situation of your project, and the UWA service can help you quickly locate performance bottlenecks.
YOU MAY ALSO LIKE!!!
UWA Website: https://en.uwa4d.com
UWA Blogs: https://blog.en.uwa4d.com