Basic Knowledge
Blurring an image with a convolution kernel computed from the two-dimensional Gaussian function is expensive for real-time rendering in games, so some optimization is needed to reduce the number of texture reads and arithmetic operations. One effective optimization exploits the separable nature of the two-dimensional Gaussian distribution function.
Separability means that a two-dimensional Gaussian convolution can be decomposed into two passes of the same one-dimensional Gaussian convolution. The two are equivalent (the resulting image is identical), but the processing time is greatly shortened. The mathematical proof is as follows:
Let:

$$G(x, y) = \frac{1}{2\pi\sigma^2} e^{-\frac{x^2 + y^2}{2\sigma^2}}$$

Then:

$$G(x, y) = \left( \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{x^2}{2\sigma^2}} \right) \left( \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{y^2}{2\sigma^2}} \right)$$

Rewrite the discrete Gaussian kernel matrix as a normalization constant multiplied by a column vector and a row vector:

$$G = \frac{1}{S}\, v\, v^T, \qquad v_i = e^{-\frac{i^2}{2\sigma^2}}$$

The normalization constant S can be split into the product of the reciprocals of the normalization coefficients of two one-dimensional Gaussian kernels:

$$\frac{1}{S} = \frac{1}{S_1} \cdot \frac{1}{S_1}, \qquad S_1 = \sum_i e^{-\frac{i^2}{2\sigma^2}}$$

So it is also possible to split G into a product of two vectors:

$$G = g\, g^T$$

where:

$$g_i = \frac{1}{S_1} e^{-\frac{i^2}{2\sigma^2}}$$

Therefore, a Gaussian-blurred image Y can also be expressed as:

$$Y = G * X = g * (g^T * X)$$
The formula above shows that the one-dimensional Gaussian kernel is first convolved horizontally with the input image X to obtain an intermediate result Z, and then the same one-dimensional kernel is convolved vertically with Z to obtain the final result Y.
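The equivalence can be checked numerically. Below is a minimal pure-Python sketch (illustrative only, not Unity code; all helper names are our own): blurring with the full 2D kernel and blurring twice with the 1D kernel produce the same image up to floating-point error.

```python
import math

def gauss_1d(radius, sigma):
    # Normalized 1D Gaussian weights for offsets -radius..radius.
    w = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-radius, radius + 1)]
    s = sum(w)
    return [v / s for v in w]

def convolve_rows(img, k):
    # Convolve every row with the 1D kernel k, clamping at the borders.
    r = len(k) // 2
    h, w = len(img), len(img[0])
    return [[sum(img[y][min(max(x + d, 0), w - 1)] * k[d + r]
                 for d in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def transpose(img):
    return [list(col) for col in zip(*img)]

def separable_blur(img, k):
    # Horizontal pass, then vertical pass (implemented via transpose).
    z = convolve_rows(img, k)
    return transpose(convolve_rows(transpose(z), k))

def blur_2d(img, k):
    # Direct 2D convolution with the outer-product kernel k[i]*k[j].
    r = len(k) // 2
    h, w = len(img), len(img[0])
    return [[sum(img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                 * k[dy + r] * k[dx + r]
                 for dy in range(-r, r + 1) for dx in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

img = [[float((3 * x + 5 * y) % 7) for x in range(8)] for y in range(8)]
k = gauss_1d(2, 1.0)
a, b = separable_blur(img, k), blur_2d(img, k)
diff = max(abs(a[y][x] - b[y][x]) for y in range(8) for x in range(8))
print(diff < 1e-12)  # prints True
```

The two results agree exactly (up to rounding), including at the borders, because clamped sampling along x and along y are independent of each other.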
Calculating the blur for one pixel in this way requires (4ω + 2) multiplications and 4ω additions, where ω is the kernel radius, giving a time complexity of O(ω) per pixel rather than the O(ω²) of direct two-dimensional convolution.
For a 1024×1024 image, blurring the whole image with a 33×33 kernel requires 1024×1024×33×2 ≈ 69 million texture reads. This is a significant drop compared to direct processing with the 2D Gaussian convolution kernel, which would need 1024×1024×33×33 ≈ 1.14 billion reads.
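The read counts above are easy to verify with plain arithmetic (a quick Python check, no Unity involved):

```python
# Texture reads for blurring a 1024x1024 image with a 33-tap kernel.
width = height = 1024
kernel = 33

separable_reads = width * height * kernel * 2     # two 1D passes
direct_reads = width * height * kernel * kernel   # one 2D pass

print(separable_reads)                 # 69206016   (~69 million)
print(direct_reads)                    # 1141899264 (~1.14 billion)
print(direct_reads / separable_reads)  # 16.5, i.e. 16.5x fewer reads
```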
Unity Implementation
Following the algorithm above, let's implement it in Unity.
First, implement the one-dimensional Gaussian function:
float gauss(float x, float sigma)
{
return 1.0f / (sqrt(2.0f * PI) * sigma) * exp(-(x * x) / (2.0f * sigma * sigma));
}
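As a sanity check, the same function can be evaluated outside the shader. A Python sketch mirroring gauss() (illustrative only); note that the discrete, truncated weights of a real kernel do not sum exactly to 1, which is why the blur loop later renormalizes by the accumulated weight sum:

```python
import math

def gauss(x, sigma):
    # Mirrors the shader's gauss(); the exponent must be negative.
    return math.exp(-(x * x) / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)

sigma = 5.0
weights = [gauss(x, sigma) for x in range(-16, 17)]  # a 33-tap kernel

print(weights[16] == max(weights))  # True: the center tap has the largest weight
print(sum(weights))                 # close to, but slightly below, 1.0
```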
Next, implement the one-dimensional blur:
float4 GaussianBlur(pixel_info pinfo, float sigma, float2 dir)
{
float4 o = 0;
float sum = 0;
float2 uvOffset;
float weight;
for (int kernelStep = -KERNEL_SIZE / 2; kernelStep <= KERNEL_SIZE / 2; ++kernelStep)
{
uvOffset = pinfo.uv;
uvOffset.x += kernelStep * pinfo.texelSize.x * dir.x;
uvOffset.y += kernelStep * pinfo.texelSize.y * dir.y;
weight = gauss(kernelStep, sigma);
o += tex2D(pinfo.tex, uvOffset) * weight;
sum += weight;
}
o *= (1.0f / sum);
return o;
}
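The structure of GaussianBlur() can be mirrored on the CPU for testing. Below is a pure-Python sketch over one row of values, standing in for texels along the blur direction (names are illustrative, not part of the shader):

```python
import math

KERNEL_SIZE = 5  # small for demonstration; the article uses 33

def gauss(x, sigma):
    return math.exp(-(x * x) / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)

def gaussian_blur_sample(row, i, sigma):
    # Accumulate weighted neighbours, then divide by the actual weight
    # sum, exactly as the shader does with o *= (1.0f / sum).
    o = s = 0.0
    for step in range(-(KERNEL_SIZE // 2), KERNEL_SIZE // 2 + 1):
        j = min(max(i + step, 0), len(row) - 1)  # clamp, like clamped UV sampling
        w = gauss(step, sigma)
        o += row[j] * w
        s += w
    return o / s

row = [0.0, 0.0, 1.0, 0.0, 0.0]  # a single bright texel
blurred = [gaussian_blur_sample(row, i, 2.0) for i in range(len(row))]
print(blurred[2] == max(blurred))      # the blur stays centred on the texel
print(abs(sum(blurred) - 1.0) < 1e-9)  # total brightness is preserved here
```

The divide-by-sum renormalization keeps the output correctly scaled even though the truncated kernel weights do not sum exactly to 1.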
Horizontal 1D blurring on the first pass:
struct pixel_info
{
sampler2D tex;
float2 uv;
float4 texelSize;
};
float4 frag_horizontal(v2f_img i) : COLOR
{
pixel_info pinfo;
pinfo.tex = _MainTex;
pinfo.uv = i.uv;
pinfo.texelSize = _MainTex_TexelSize;
return GaussianBlur(pinfo, _Sigma, float2(1,0));
}
Pass
{
CGPROGRAM
#pragma target 3.0
#pragma vertex vert_img
#pragma fragment frag_horizontal
ENDCG
}
Unity's built-in GrabPass{} is used to capture the screen image after the first pass runs. By default, GrabPass saves the captured image to a texture named _GrabTexture, which is then sampled to perform the vertical blur:
uniform sampler2D _GrabTexture;
uniform float4 _GrabTexture_TexelSize;
float4 frag_vertical(v2f_img i) : COLOR
{
pixel_info pinfo;
pinfo.tex = _GrabTexture;
pinfo.uv = i.uv;
pinfo.texelSize = _GrabTexture_TexelSize;
return GaussianBlur(pinfo, _Sigma, float2(0,1));
}
Because the UV coordinates of the texture captured by GrabPass{} may not match those of an ordinary texture, the UV coordinates need to be adjusted so that the blurred result is not flipped vertically. The vertex shader is therefore implemented as:
v2f_img vert_img_grab(appdata_img v)
{
v2f_img o;
UNITY_INITIALIZE_OUTPUT(v2f_img, o);
UNITY_SETUP_INSTANCE_ID(v);
UNITY_INITIALIZE_VERTEX_OUTPUT_STEREO(o);
o.pos = UnityObjectToClipPos(v.vertex);
o.uv = half2(v.vertex.x, 1 - v.vertex.y);
return o;
}
Implement a vertical blur pass:
GrabPass{}
Pass
{
CGPROGRAM
#pragma target 3.0
#pragma vertex vert_img_grab
#pragma fragment frag_vertical
ENDCG
}
As can be seen, blurring this way requires two passes and therefore generates two draw calls, which is also an important factor affecting performance.
The implementation here uses GrabPass{}, which is relatively time-consuming. Consider instead calling Graphics.Blit twice in the OnRenderImage() callback to perform the horizontal and vertical passes respectively.
Screen Post Processing Effects Series
Screen Post Processing Effects Chapter 1 – Basic Algorithm of Gaussian Blur and Its Implementation
That’s all for today’s sharing. Of course, life is finite while knowledge is infinite; over a long development cycle, the problems covered here may be just the tip of the iceberg. We have prepared more technical topics on the UWA Q&A website, waiting for you to explore and share together. You are welcome to join us: your method may solve someone else’s urgent problem, and the stones from other hills may serve to polish your own jade.