0/ 759/ /1

Basic Knowledge

The image is blurred using the convolution kernel calculated by the two-dimensional Gaussian function, which is expensive in real-time computing games. Some methods need to be considered to optimize acceleration and reduce the number of texture reading operations and arithmetic operations. Optimization can be done by exploiting the separable nature of the two-dimensional Gaussian distribution function.


The separable feature of the two-dimensional Gaussian distribution function means that a two-dimensional Gaussian function can be decomposed into the same one-dimensional Gaussian function and processed twice. The two are equivalent, which means that the obtained effect is the same, but the processing time but greatly shortened. The mathematical proof formula is as follows:





Rewrite the Gaussian kernel matrix as a normalization constant multiplied by a row vector and a column vector:


The normalization constant S at this time can be divided into the reciprocal multiplication of the normalization coefficients of two one-dimensional Gaussian kernels:


So, it is also possible to split G into a product of two vectors:

Among which: 


Therefore, a Gaussian blurred image Y can also be expressed as:


The above formula can show that the one-dimensional Gaussian kernel is used to perform a horizontal convolution operation with the input image X to obtain the intermediate result Z, and then the vertical convolution operation is performed with the one-dimensional Gaussian kernel and the intermediate result Z to obtain the final result Y.


It can be seen that calculating the blurring effect of a pixel in this way requires (4ω+2) multiplication operations and 4ω addition operations, and the time complexity is O(ω).


For a 1024*1024 size image, if you want to perform a blurring operation on the entire image, using a 33*33 size convolution kernel, you need to perform 1024*1024*33*2≈6.9kw texture reading times. This is a significant drop compared to direct processing using a 2D Gaussian convolution kernel.


Unity Implementation

According to the above algorithm, we implement it in Unity:


First, implement the one-dimensional Gaussian kernel calculation formula:


float gauss(float x, float sigma)


return  1.0f / (sqrt(2.0f * PI) * sigma) * exp(-(x * x) / (2.0f * sigma * sigma));



Implement a one-dimensional fuzzy algorithm:

float4 GaussianBlur(pixel_info pinfo, float sigma, float2 dir)


float4 o = 0;

float sum = 0;

float2 uvOffset;

float weight;


for (int kernelStep = -KERNEL_SIZE / 2; kernelStep <= KERNEL_SIZE / 2; ++kernelStep)


uvOffset = pinfo.uv;

uvOffset.x += ((kernelStep)* pinfo.texelSize.x) * dir.x;

uvOffset.y += ((kernelStep)* pinfo.texelSize.y) * dir.y;

weight = gauss(kernelStep, sigma);

o += tex2D(pinfo.tex, uvOffset) * weight;

sum += weight;


o *= (1.0f / sum);

return o;



Horizontal 1D blurring on the first pass:

struct pixel_info


sampler2D tex;

float2 uv;

float4 texelSize;



float4 frag_horizontal(v2f_img i) : COLOR


pixel_info pinfo;

pinfo.tex = _MainTex;

pinfo.uv = i.uv;

pinfo.texelSize = _MainTex_TexelSize;

return GaussianBlur(pinfo, _Sigma, float2(1,0));






#pragma target 3.0

#pragma vertex vert_img

#pragma fragment frag_horizontal



The built-in function GrabPass{} of Unity is used to capture the screen image after the first pass runs. The function saves the image to a variable named _GrabTexture by default, and then realizes vertical blur:


float4 frag_vertical(v2f_img i) : COLOR


pixel_info pinfo;

pinfo.tex = _GrabTexture;

pinfo.uv = i.uv;

pinfo.texelSize = _GrabTexture_TexelSize;

return GaussianBlur(pinfo, _Sigma, float2(0,1));


uniform sampler2D _GrabTexture;

uniform float4 _GrabTexture_TexelSize;


Because the UV coordinates of the pictures obtained by the GrabPass{} function will be inconsistent with ordinary pictures, the UV coordinates need to be processed to ensure that the results after the blurring algorithm will not be reversed, so the vertex shader is implemented as:


v2f_img vert_img_grab(appdata_img v)


v2f_img o;




o.pos = UnityObjectToClipPos(v.vertex);

o.uv = half2(v.vertex.x, 1-v.vertex.y);

return o;



Implement a vertical blur pass:





#pragma target 3.0

#pragma vertex vert_img_grab

#pragma fragment frag_vertical



It can be seen that to blur in this way, two passes are required to generate two DrawCalls, which is also an important factor affecting performance.


The implementation here uses the GrabPass{} function, which is relatively time-consuming. Consider using two Blit in the OnRenderImage() function to perform horizontal and vertical operations respectively.

Screen Post Processing Effects Series

Screen Post Processing Effects Chapter 1 – Basic Algorithm of Gaussian Blur and Its Implementation


That’s all for today’s sharing. Of course, life is boundless but knowing is boundless. In the long development cycle, these problems you see maybe just the tip of the iceberg. We have already prepared more technical topics on the UWA Q&A website, waiting for you to explore and share them together. You are welcome to join us, who love progress. Maybe your method can solve the urgent needs of others, and the “stone” of other mountains can also attack your “jade”.


UWA Website:

UWA Blogs:

UWA Product: 

Related Topics

Post a Reply

Your email address will not be published.