1/ 1577/ /1

(Continue from Introduction of the Raytracing Technology Part 1)

5. Path Tracing

When Kajiya proposed the rendering equation in 1986, he also gave a solution to this equation. Kajiya converts the rendering equation into a path-integral-based form.


Path tracing is a variant of distributed ray tracing where each point only emits one reflected and refracted ray. This avoids an explosion in the number of rays, but a simple implementation can result in very noisy images. To compensate for this, multiple rays are traced for each pixel.


One advantage of path tracing is that camera effects such as depth of field and motion blur can be incorporated at a little extra cost due to the high number of visible rays emitted per pixel. On the other hand, it is more difficult to ensure a good distribution of reflected rays (e.g. through layering) than distributed ray tracing. In short, distributed ray tracing emits the most rays in the ray tree, while path tracing emits the most rays initially.


5.1 Path Tracing Equations

There are two forms of rendering equations. The first one is the hemispherical form that we often use, but it also has an important form, that is, the area form. Kajiya converts the rendering equation into an explicit equation based on path integrals so that the lighting calculation can effectively perform Monte Carlo calculations.


According to the relationship between solid angle elements and area elements:


This is the final area integral form, and after continuous substitutions between the left and right sides, such a sum formula is obtained:


Where  refers to a path  of length n,  is the contribution of the light source to the final result after n stages of propagation:


This is the path and form of LTE. This is an n-1-dimensional high-dimensional integral, and each of our subsequent sampling samples is an n-1-dimensional vector. We call the space in which this vector resides the path space.


The advantage of the path integral form is that it transforms the rendering problem of the entire scene into a sampling problem of the path space composed of paths of various lengths. Randomly generate several paths between the viewpoint and the light source, estimate their contribution value, and add the contribution value to the pixels affected by the path to obtain the rendered image of the scene. Both bidirectional path tracing and Metropolis ray tracing are suitable for description and calculation in this form of path integral.


5.2 Direct Lighting

5.2.1 Illumination with only one lamp

Since we need to calculate the light, we need to change the hemispherical integral to the area integral:


Where yi is a point on the light range, V (x, y) is the visibility function of the shadow light. It is 1 when there is an occluder, and 0 otherwise.


In order to compute the direct illumination due to a single light source in the scene (we need to define a probability density function p(y) over the area of the light source, on which shadow rays are produced. We assume that such a PDF can be constructed regardless of the light source’s geometry characteristics. We use Monte Carlo integration for estimation.


directIllumination (x, theta)//x is the point on the surface
  estimatedRadiance = 0;
  for all shadow rays
    generate point y on light source;
    estimatedRadiance +=Le(y, yx) * BRDF * radianceTransfer(x,y)/pdf(y);
  estimatedRadiance = estimatedRadiance / #shadowRays;


The variance of the Monte Carlo integration, as well as the noise in the final image, is mainly determined by the choice of p(y). Ideally, p(y) is equal to the contribution of each point y to the final estimator, but in practice, this is almost impossible, and a more realistic choice must be made, mainly because it is easy to implement. Frequently used choices for p(y) are:

  1. Uniform sampling of light source area
  2. Uniform sampling of solid angle by light source


Uniform Sampling of Light Source Areas

In this case, p(y) is uniformly distributed over the surface of the light, so the probability of each point is . This is the easiest way to sample a light source and can be easily improved using optimization methods such as layered sampling. With this sampling scheme, there will be several sources of noise in the image. If a point x is in the shadow penumbra, some of its shadow rays will produce V(x,yi)=0 and some will not. This results in most of the noise in soft shadow areas. On the other hand, if outside the shadows, most of the noise is caused by changes in the cosine term.


Uniform Sampling of A Solid Angle by Light Source

To remove noise caused by cosine terms and square factors, sampling by a solid angle is a good choice. It requires rewriting the integral over the area of ​​the light source as an integral over the solid angle subtended by the light source. This will remove a cosine term and distance factor from the integrand. However, visibility testing still exists. This sampling technique is often difficult to implement because generating directions over arbitrary solid angles is not trivial.


5.2.1 Illumination with Multiple Lamps

When there are multiple light sources in the scene, it is easiest to calculate direct lighting separately for each light source in turn. We generate some shadow rays for each light and then sum up the total contribution of each light. This way, the direct lighting component of each light source will be calculated independently, and the number of shadow rays per light source can be chosen according to any criteria.


However, it is often best to treat all combined sources as a single one and apply Monte Carlo integration to the combined integral. They can be used by any light source when generating shadow rays. Thus, direct illumination of any number of light sources can be computed with just a single shadow ray and still obtain an unbiased image (although in this case, the final image can be very noisy). This approach works because we completely abstract light sources into separate analytic surfaces, only looking at one collection. However, to get a working algorithm, we still need individual access to any light source, since any individual light source may require a separate sampling procedure to generate points on its surface.


5.3 Indirect Illumination

In contrast to direct illumination calculations, the problem of indirect illumination is generally much more difficult because indirect light hits surface points from all possible directions. Therefore, it is difficult to optimize its calculations along the same lines as direct lighting.


Indirect illumination consists of light that reaches the target point X after at least one reflection on an intermediate surface between the light source and x. Indirect illumination is an important component of global illumination.


indirectIllumination (x, theta)
  estimatedRadiance = 0;
  if (no absorption)
     for all indirect paths
        sample direction psi on hemisphere;
        y = trace(x, psi);
        estimated radiance +=computeRadiance(y, -psi) * BRDF *cos(Nx, psi) / pdf(psi);
     estimatedRadiance = estimatedRadiance / #paths;

computeRadiance(x, dir)
  estimatedRadiance = Le(x, dir);
  estimatedRadiance += directIllumination(x, dir);
  estimatedRadiance += indirectIllumination(x, dir);


5.3.1 Uniform Sampling of Indirect Light



The most general Monte Carlo method for evaluating indirect illumination is to use an arbitrary hemispherical PDF, , and generate multiple  in random orientations. This produces the following estimators:


To compute this estimator, for each generated direction, we need to evaluate the BRDF and its cosine term for tracing a ray from x along the direction  and evaluate the reflected radiance   at the nearest intersection . This last value expresses the recursive nature of indirect lighting since the reflected radiance at  can be decomposed again into direct and indirect contributions. Here the uniformly sampled PDF is directly given as . Noise in the resulting image will mainly be caused by changes in the evaluation of the BRDF and cosine terms, as well as changes in the reflected radiation Lr at the far point.


5.3.2 Importance Sampling of Indirect Light

Sampling uniformly over the hemisphere is a very reckless decision, as it appears as if we don’t have any knowledge of the integrand in the integral of indirect lighting. To reduce noise, we need some form of importance sampling. We can construct a hemispherical PDF that is proportional (or approximately proportional) to any of the following factors:

  • Cosine 
  • BRDF 
  • Indirect Radiation 
  • Any combination of the above


Cosine Term Sampling

Sampling proportional to the cosine wave around the normal Nx prevents oversampling near the horizon of the hemisphere, where  equals 0. We can expect the noise to decrease eventually since we lower the probability of directions that contribute little to the generated directions. so:


Assuming the BRDF has only diffuse reflection at this point, we get:


Here, the only noise comes from Lr.


BRDF Sampling

Some materials only reflect light in very specific directions. This is usually related to the roughness of the material. Mirrors, such as the one on the left, require far fewer samples than the diffuse surface on the right.


This depends on the bidirectional reflectance distribution function (BRDF) of the material. This distribution function determines how much light is reflected in a given direction. By sampling directions directly from this distribution function (importance sampling), we can avoid taking samples that contribute little to the final image.


BRDF sampling is a great noise reduction technique when using glossy or highly specular BRDFs. It reduces the probability of sampling orientation when the BRDF value is low or zero. However, only a small number of specific BRDF models are possible to accurately sample according to the BRDF scale. A better way is to try sampling proportionally to the product of the BRDF and the Cosine term. Analytically, this is harder to do, except in rare cases. Typically, sampling in conjunction with rejection is required from such a PDF.


Another time-consuming approach is to build a table of values of the cumulative probability function and use this table to generate directions. The PDF value does not exactly equal the product of the BRDF and the Cosine factor, but significant variance reduction can still be achieved, which means that the modified Phong BRDF should be considered for sampling according to BRDF.


Here  of d is the perfect specular direction.  is the diffuse reflection part, and   is the highlighted part, so we can divide the indirect light part into:


For a point, we are conducting a classification discussion, and the proportion of diffuse reflection is q1.


The ratio q2 at which highlights occur.


The proportion q3 of absorption occurs, and the contribution value is 0.


For the diffuse reflection part, our better choice is based on the cosine distribution, and the highlighted part can be based on .


In addition, the values of q1, q2, and q3 will also affect the final result. In principle, no matter which set of values is selected, it will eventually provide an unbiased estimator, but careful selection of these values will have an impact on the final result. A good option is to choose these values proportional to the (maximum) reflected energy in the different modes. These values can be calculated by integrating the reflection along the direction of incidence along the surface normal Nx:


It is worth noting that for any other direction of incidence than normal, the value of q is actually greater than the actual reflected energy because the cosine wave portion around is below the surface at x. This can be adjusted by not resampling any directions in the portion of the wave that lies below the wave, thus maintaining a balance between diffuse energy, specular energy, and absorption.


5.4 Code

  for each pixel
    radiance = 0;
    H = integral(h(p));
    for each sample // Np viewing rays
      pick sample point p within support of h;
      construct ray at eye, direction p-eye;
      radiance = radiance + rad(ray)*h(p);
  radiance = radiance/(#samples*H);

  find closest intersection point x of ray with scene;
  return(Le(x,eye-x) + computeRadiance(x, eye-x));

computeRadiance(x, dir)
   estimatedRadiance += directIllumination(x, dir);
   estimatedRadiance += indirectIllumination(x, dir);

directIllumination (x, theta)
  estimatedRadiance = 0;
  for all shadow rays // Nd shadow rays
    select light source k;
    sample point y on light source k;
    estimated radiance +=Le * BRDF * radianceTransfer(x,y)/(pdf(k)pdf(y|k));
  estimatedRadiance = estimatedRadiance / #paths;

indirectIllumination (x, theta)
   estimatedRadiance = 0;
   if (no absorption) // Russian roulette
     for all indirect paths // Ni indirect rays
        sample direction psi on hemisphere;
        y = trace(x, psi);
        estimatedRadiance +=compute_radiance(y, -psi) * BRDF *cos(Nx, psi)/pdf(psi);
     estimatedRadiance = estimatedRadiance / #paths;

  transfer = G(x,y)*V(x,y);


Path tracing is suitable for any type of geometry, any type of BRDF, and sampling all types of paths with low memory consumption, but its biggest problems are standard Monte Carlo problems: slow convergence and noise.


The disadvantages are also obvious. First of all, the effect is not as good as distributed ray tracing. However, the convergence speed of pure path tracing is very slow. Sometimes it is difficult to find a path with a large contribution (that is, the probability of finding such a path is very small), resulting in a lot of noise in the image, and each pixel requires thousands of samples to achieve a satisfactory effect.


Regarding Russian roulette, it is impossible to simply terminate the tracking with a fixed step size, because the length of the higher contribution should be longer, and the lower should be shorter. The essence is mainly energy conservation.

6. Bidirectional Path Tracing

The bidirectional path tracing algorithm, proposed by Veach and Guibas and Lafortune and Willems respectively, can generate caustics and specular surfaces well. The basic idea of bidirectional path tracing (Bidirectional Path Tracing) is to shoot rays from the viewpoint and light source at the same time, and after several bounces, connect the vertices on the eye path and light path (when connecting Need to test visibility) to generate many paths quickly.


Generate subpaths of length nE from the eye: x1, x2, x3, x4… and nL subpaths from the light source, y0, y1…. Then, we connect each node on the two sub-paths separately, and only two nodes on the two sub-paths that are not occluded by other objects can be connected. In this way, we can get a set of paths of different lengths, and there are multiple paths of the same length in this set. For example, if we want to get a path of length k, we can select the path of length s from the light source sub-path and then select the path of length t(t=k+1) from the eye sub-path, so for the length of k path, we can have k choices, and each choice of s actually represents a different sampling strategy for the path space. The effects have their own effects, and by combining these strategies, we can get a more adaptable rendering algorithm and get better rendering effects. These strategies can be broadly classified into three categories,

  1. s = 0, such as x0, x1…xk. This situation is the basic Monte Carlo ray tracing, that is, the light source is hit during the process of generating the eye subpath.
  1. t=1, for example, x0(E)↔yk-1y0(L). This case is light source ray tracing, which directly connects the viewpoint and light source subpath. The pixels affected by this path are not the original pixels, they must be calculated, and it is necessary to pay attention to whether the endpoints of the eye and light source subpaths are visible. In addition, here t≠0, because here we do not consider the situation that the viewpoint has a certain area, but regard the viewpoint as an abstract point.


S=1,…k, for example, x0(E)x1…xi↔yk-i-1y0(L), this is the general case, and it is necessary to pay attention to whether the two interconnected points are visible.


[Veach 1995] proposed a multiple importance sampling (Multiple Importance Sampling) methods, which can combine the sampling results of various sampling methods with appropriate weights to reduce image noise very effectively. For example, now it is required to solve the contribution of all light paths with length 4 in the integral, there are several sampling methods: the length of the viewpoint sub-path is 0, the length of the light source sub-path is 3, denoted as (0,3); and (1, 2), (2,1), (3,0). The path contributions sampled by these methods may vary greatly, and using multiple importance sampling to combine these samples can achieve a good noise reduction effect. However, bidirectional path tracing also has obvious disadvantages. When two vertices are on a pure mirror or refraction surface, the bidirectional path tracing method will not connect the two vertices, which makes it difficult to effectively simulate the reflection and refraction of Caustics.


Choose a light ray from the light//
Find ray-surface intersection//
Reflect or transmit//
u = Uniform()
if u < reflectance(x)
Choose new direction d drawn from BRDF(O|l)/ 
Goto 2/
else if u < reflectance(x) + transmittance(x)
Choose new direction d drawn from BTDF(O|l)
Goto 2/

7. Metropolis Ray Tracing

A core issue in Path Tracing is how to sample as many paths as possible with large contributions. To solve this, Eric Veach et al. proposed Metropolis Light Transport method in 1997 (often referred to as MLT).


In 1953, Metropolis and others proposed a new sampling method in order to solve the complex sampling problem in the field of computational physics, which is the Metropolis method. The original intention of using the Metropolis method was to calculate the material properties of the fluid. Now, the Metropolis method has been widely used in many fields such as physics and chemistry. Similarly, we can also apply the Metropolis method in the field of global illumination to solve complex sampling problems. The biggest problem with these previous ray tracing algorithms is that they all have uncomfortable scenes, that is to say, none of the methods works for any scene, because these methods do not sample the path from the light source to the viewpoint in the scene. Considering the path distribution characteristics of the scene, the random sampling method is used to sample the path and calculate the contribution. Therefore, once the path distribution in the path space of the scene is extremely uneven, these methods will cause an extreme reduction in efficiency because a large number of paths do not bring a contribution, for example, in a scene dominated by indirect lighting, the light source, and the viewpoint is severely occluded, only a very small channel can pass through, when random sampling, only a few paths can contribute through this channel, and other paths can only be discarded.


The selection of importance sampling requires samples to be proportional to function f, that is, the sampling samples must be proportional to the probability distribution of function f. In the actual unbiased rendering process, the function f is a known function, but due to its complex and irregular form, the probability distribution formed by the selection of samples in the sampling process is difficult to be proportional to the function f, which will lead to a large sample variance. Metropolis proposes a method to generate samples whose probability distribution is proportional to any estimable function.


The samples are generated by a Markov chain, so each sample is only related to its previous sample and has nothing to do with other samples. The newly generated samples are generated by random mutation of the current sample, and the mutation process satisfies the set mutation strategy. The selection of mutation strategy is relatively open, but it needs to satisfy the ergodicity. Ergodicity means that in the sampling process, no matter from any initial sample, the final sampling process will gradually converge to the distribution of pi.


The sampling distribution is obtained by accepting and rejecting the mutation samples, and the acceptance and rejection of the final samples depend on the setting of the acceptance probability. Acceptance probability means that the probability that the received mutation sample y is the current sample x is expressed as a(y|x). Finally, the Markov process will converge to a stationary distribution. If the process is in a stationary state, the transfer densities of the two samples in the space will also be in a stationary state. This property is called the detail balance condition.


When the receiving probability satisfies the detail balance condition and the mutation strategy satisfies ergodicity, the probability density of the generated sampling sequence will obey the stationary distribution. In order to make the sampling process converge to a smooth distribution as soon as possible, it is a better strategy to set the acceptance probability as large as possible.


7.1 Markov Chain

7.1.1 Concept

X is a random variable whose possible values come from the set . The value of X at discrete time t is Xt, if the transition probability of X over time only depends on its value Xt at the current moment, that is 


Then, the process of random variable X changing with time is a Markov process. The sequence (X0, X1, …, Xt) generated by X changing with time in [0,t] time is a Markov chain.


7.1.2 Transition Probability Matrix

Let the probability that the random variable X takes the value si at any time t+1 be , that is , where t is an arbitrary time. The transition probability of a random variable from state sj to state si is Pji:


t is any time, then we can get the calculation formula of  as follows:


We noticed that in the above formula , so the above formula can be written in the following form:


Then we can write the probability that the random variable X takes the value  at time t+1 as follows:


The matrix on the right side of the above formula is recorded as Ptran, that is:


Ptran is the transition probability matrix of the Markov process.


7.1.3 Stationary Distribution

If a certain state of a Markov process returns to itself after a finite number of state transitions, then we call this Markov process periodic. If there are two states in a Markov process, and they are transferred to each other, then we call this Markov process reducible. If a Markov process is neither periodic nor irreducible, then the Markov process is ergodic. My personal understanding of the concept of traversal of various states is that each state has a certain probability of appearing.


The Markov process of each state traversal has an important property: no matter what the initial state probability  value is, after enough state transitions, the Markov process of each state traversal will tend to a stationary distribution :


At this time, the probability that the random variable X takes a value from no longer changes but obeys the stationary distribution .


The stationary distribution satisfies the property  that is, the probability that the random variable takes a value after the stationary distribution undergoes a state transition does not change.


Author: papalqi,

Translation: UWA Team

Thanks to the author for your contribution. Welcome to repost and share, but please do not reprint without the authorization of the author.
The full content can be viewed here.

UWA Website:

UWA Blogs:

UWA Product: 

Related Topics

1 Visitor Comment

Post a Reply

Your email address will not be published.