# Short answer:

**Importance sampling** is a method to reduce variance in Monte Carlo Integration by drawing samples from a distribution whose shape is close to that of the function being integrated.

$pdf(x)$ gives the probability density of a generated random sample being $x$.

# Long Answer:

To start, let's review what Monte Carlo Integration is, and what it looks like mathematically.

Monte Carlo Integration is a technique to estimate the value of an integral. It's typically used when the integral has no closed-form solution. It looks like this:

$$\int f(x)\,dx\approx \frac{1}{N}\sum_{i=1}^{N}\frac{f(x_{i})}{pdf(x_{i})}$$

In English, this says that you can approximate an integral by averaging the function's value at successive random samples, each divided by the probability density of drawing that sample. As $N$ gets large, the approximation gets closer and closer to the solution. $pdf(x_{i})$ is the value of the probability density function at each random sample.

Let's do an example: Calculate the value of the integral $I$.

$$I=\int_{0}^{2\pi}e^{-x}\sin(x)\,dx$$

Let's use Monte Carlo Integration:

$$I\approx \frac{1}{N}\sum_{i=1}^{N}\frac{e^{-x_{i}}\sin(x_{i})}{pdf(x_{i})}$$

A simple Python program to calculate this is:

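```
import random
import math

N = 200000
TwoPi = 2.0 * math.pi

# Accumulate f(x) / pdf(x) over N uniform random samples
total = 0.0
for i in range(N):
    x = random.uniform(0, TwoPi)
    fx = math.exp(-x) * math.sin(x)
    # Uniform sampling on [0, 2*pi]: constant density
    pdf = 1 / (TwoPi - 0.0)
    total += fx / pdf

I = (1 / N) * total
print(I)
```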

If we run the program we get $I=0.4986941$.

Using integration by parts, we can get the exact solution:

$$I=\frac{1}{2}\left(1-e^{-2\pi}\right)=0.4990663$$
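For reference, the closed form can be recovered by integrating by parts twice. The intermediate integral $J$ below is just a name introduced for this sketch:

$$\begin{aligned}
I &= \int_{0}^{2\pi}e^{-x}\sin(x)\,dx = \Big[-e^{-x}\sin(x)\Big]_{0}^{2\pi} + \underbrace{\int_{0}^{2\pi}e^{-x}\cos(x)\,dx}_{J} = J \\
J &= \Big[-e^{-x}\cos(x)\Big]_{0}^{2\pi} - \int_{0}^{2\pi}e^{-x}\sin(x)\,dx = \left(1-e^{-2\pi}\right) - I \\
2I &= 1-e^{-2\pi}
\end{aligned}$$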

You'll notice that the Monte Carlo Solution is not quite correct. This is because it is an estimate. That said, as $N$ goes to infinity, the estimate should get closer and closer to the correct answer. Already at $N=2000$ some runs are almost identical to the correct answer.
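To see the convergence for yourself, here is a small sketch that reruns the same uniform-sampling estimator for a few values of $N$ and prints the absolute error against the exact value. The helper name `estimate`, the seed, and the particular values of $N$ are arbitrary choices for illustration.

```python
import math
import random

def estimate(n, rng):
    """Uniform-sampling Monte Carlo estimate of the integral of e^-x sin(x) on [0, 2*pi]."""
    two_pi = 2.0 * math.pi
    pdf = 1.0 / two_pi  # constant density of a uniform sample on [0, 2*pi]
    total = 0.0
    for _ in range(n):
        x = rng.uniform(0.0, two_pi)
        total += math.exp(-x) * math.sin(x) / pdf
    return total / n

exact = 0.5 * (1.0 - math.exp(-2.0 * math.pi))
rng = random.Random(1234)  # fixed seed, chosen arbitrarily for repeatability
for n in (100, 10_000, 1_000_000):
    approx = estimate(n, rng)
    print(f"N={n:>9}  estimate={approx:.6f}  abs error={abs(approx - exact):.6f}")
```

Any single run can get lucky or unlucky, so the error does not have to shrink monotonically; it is the typical error that falls off as $N$ grows.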

A note about the PDF: In this simple example, we always take a uniform random sample. A uniform random sample means every sample has the exact same probability of being chosen. We sample in the range $[0,2\pi]$, so $pdf(x)=1/(2\pi-0)$.

Importance sampling works by *not* uniformly sampling. Instead we try to choose more samples that contribute a lot to the result (important), and fewer samples that only contribute a little to the result (less important). Hence the name, importance sampling.

If you choose a sampling function whose pdf very closely matches the shape of $f$, you can greatly reduce the variance, which means you can take fewer samples. However, if you choose a sampling function whose shape is very different from $f$, you can *increase* the variance. See the picture below:

*Image from Wojciech Jarosz's Dissertation, Appendix A*
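The effect can also be seen numerically on the example integral from earlier. The sketch below compares three sampling strategies: uniform, a pdf proportional to $e^{-x}$ (which follows the envelope of $f$), and a pdf proportional to $x$ (which puts most samples where $f$ is small). The two non-uniform pdfs and all function names are choices made for this illustration, not part of the original example.

```python
import math
import random
import statistics

TWO_PI = 2.0 * math.pi
C = 1.0 - math.exp(-TWO_PI)  # normalisation constant of e^-x on [0, 2*pi]

def f(x):
    return math.exp(-x) * math.sin(x)

def uniform_sample(rng):
    x = rng.uniform(0.0, TWO_PI)
    return f(x) * TWO_PI  # f(x) / pdf(x) with pdf = 1 / (2*pi)

def exponential_sample(rng):
    # pdf(x) = e^-x / C, sampled by inverting its CDF
    x = -math.log(1.0 - rng.random() * C)
    return f(x) / (math.exp(-x) / C)

def linear_sample(rng):
    # pdf(x) = x / (2*pi^2), a deliberately poor match for f
    x = TWO_PI * math.sqrt(1.0 - rng.random())  # 1 - random() keeps x > 0
    return f(x) / (x / (2.0 * math.pi ** 2))

def summarize(sampler, n=100_000, seed=7):
    rng = random.Random(seed)
    values = [sampler(rng) for _ in range(n)]
    return statistics.fmean(values), statistics.pvariance(values)

for name, sampler in [("uniform", uniform_sample),
                      ("exponential", exponential_sample),
                      ("linear", linear_sample)]:
    mean, var = summarize(sampler)
    print(f"{name:<12} estimate={mean:.4f}  per-sample variance={var:.4f}")
```

All three estimators converge to the same value; what changes is the per-sample variance, and therefore how many samples are needed for a given noise level.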

One example of importance sampling in Path Tracing is how to choose the direction of a ray after it hits a surface. If the surface is not perfectly specular (i.e. a mirror or glass), the outgoing ray can be anywhere in the hemisphere.

We *could* uniformly sample the hemisphere to generate the new ray. However, we can exploit the fact that the rendering equation has a cosine factor in it:

$$L_{o}(p,\omega_{o})=L_{e}(p,\omega_{o})+\int_{\Omega}f(p,\omega_{i},\omega_{o})\,L_{i}(p,\omega_{i})\,|\cos\theta_{i}|\,d\omega_{i}$$

Specifically, we know that any rays near the horizon will be heavily attenuated by the $|\cos\theta_{i}|$ factor. So, rays generated near the horizon will not contribute very much to the final value.

To combat this, we use importance sampling. If we generate rays according to a cosine weighted hemisphere, we ensure that more rays are generated well above the horizon, and fewer near the horizon. This will lower variance and reduce noise.
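As a rough numerical illustration, the sketch below integrates $L_i\cos\theta$ over the hemisphere with both strategies. The radiance function `incoming_radiance` is purely hypothetical (picked so the exact answer, $5\pi/3$, is known), and because it depends only on $\theta$ the code samples $\cos\theta$ directly instead of building full direction vectors.

```python
import math
import random
import statistics

def incoming_radiance(cos_theta):
    # Hypothetical smooth lighting, brighter towards the normal (an assumption for this demo)
    return 1.0 + cos_theta

def uniform_hemisphere(rng):
    cos_theta = rng.random()  # uniform solid angle <=> cos(theta) uniform in [0, 1)
    pdf = 1.0 / (2.0 * math.pi)
    return incoming_radiance(cos_theta) * cos_theta / pdf

def cosine_hemisphere(rng):
    cos_theta = math.sqrt(1.0 - rng.random())  # in (0, 1], so the pdf is never zero
    pdf = cos_theta / math.pi
    return incoming_radiance(cos_theta) * cos_theta / pdf

def run(sampler, n=50_000, seed=5):
    rng = random.Random(seed)
    values = [sampler(rng) for _ in range(n)]
    return statistics.fmean(values), statistics.pvariance(values)

exact = 5.0 * math.pi / 3.0  # analytic value for this particular radiance function
for name, sampler in [("uniform", uniform_hemisphere), ("cosine", cosine_hemisphere)]:
    mean, var = run(sampler)
    print(f"{name:<8} estimate={mean:.4f} (exact {exact:.4f})  per-sample variance={var:.4f}")
```

With cosine-weighted sampling the cosine in the integrand cancels against the pdf, so the only remaining variation comes from the radiance itself.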

In your case, you specified that you will be using a Cook-Torrance, microfacet-based BRDF. The common form is:

$$f(p,\omega_{i},\omega_{o})=\frac{F(\omega_{i},h)\,G(\omega_{i},\omega_{o},h)\,D(h)}{4\cos(\theta_{i})\cos(\theta_{o})}$$

where

$$\begin{aligned}
F(\omega_{i},h)&=\text{Fresnel function}\\
G(\omega_{i},\omega_{o},h)&=\text{Geometry Masking and Shadowing function}\\
D(h)&=\text{Normal Distribution Function}
\end{aligned}$$

The blog "A Graphic's Guy's Note" has an excellent write-up on how to sample Cook-Torrance BRDFs. I will refer you to his blog post. That said, I will try to give a brief overview below:

The NDF is generally the dominant portion of the Cook-Torrance BRDF, so if we are going to importance sample, then we should sample based on the NDF.

Cook-Torrance doesn't specify a specific NDF to use; we are free to choose whichever one suits our fancy. That said, there are a few popular NDFs: GGX, Beckmann, and Blinn.

Each NDF has its own formula, thus each must be sampled differently. I am only going to show the final sampling function for each. If you would like to see how the formula is derived, see the blog post.

**GGX** is defined as:

$$D_{GGX}(m)=\frac{\alpha^{2}}{\pi\left((\alpha^{2}-1)\cos^{2}(\theta)+1\right)^{2}}$$

To sample the spherical coordinates angle $\theta$, we can use the formula:

$$\theta=\arccos\left(\sqrt{\frac{1-\xi_{1}}{\xi_{1}(\alpha^{2}-1)+1}}\right)$$

where $\xi_{1}$ and $\xi_{2}$ are uniform random variables in $[0,1)$.

We assume that the NDF is isotropic, so we can sample $\varphi$ uniformly over the full circle:

$$\varphi=2\pi\xi_{2}$$
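A minimal Python sketch of GGX sampling follows. It assumes the mapping $\cos\theta=\sqrt{(1-\xi_1)/(\xi_1(\alpha^2-1)+1)}$, which is what inverting the CDF of $D_{GGX}(m)\cos\theta$ gives, and $\varphi=2\pi\xi_2$ with both $\xi$ in $[0,1)$. The function names are made up for this example.

```python
import math
import random

def ggx_d(alpha, theta):
    """GGX normal distribution function D(m) for a microfacet normal at angle theta."""
    a2 = alpha * alpha
    c = math.cos(theta)
    return a2 / (math.pi * ((a2 - 1.0) * c * c + 1.0) ** 2)

def sample_ggx(alpha, xi1, xi2):
    """Map two uniform numbers in [0, 1) to (theta, phi) distributed as D(m) cos(theta)."""
    a2 = alpha * alpha
    cos_theta = math.sqrt((1.0 - xi1) / (xi1 * (a2 - 1.0) + 1.0))
    theta = math.acos(min(1.0, cos_theta))  # clamp guards against rounding above 1
    phi = 2.0 * math.pi * xi2               # isotropic: uniform around the normal
    return theta, phi

def pdf_ggx(alpha, theta):
    """Solid-angle pdf of the sampled microfacet normal."""
    return ggx_d(alpha, theta) * math.cos(theta)

rng = random.Random(42)
theta, phi = sample_ggx(0.5, rng.random(), rng.random())
print(f"theta={theta:.4f}  phi={phi:.4f}  pdf={pdf_ggx(0.5, theta):.4f}")
```

Note that this is the density of the microfacet normal $m$; using it as the pdf of the incoming direction $\omega_i$ needs an additional change-of-variables factor for the reflection about $m$.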

**Beckmann** is defined as:

$$D_{Beckmann}(m)=\frac{1}{\pi\alpha^{2}\cos^{4}(\theta)}e^{-\frac{\tan^{2}(\theta)}{\alpha^{2}}}$$

Which can be sampled with:

$$\begin{aligned}
\theta&=\arccos\left(\sqrt{\frac{1}{1-\alpha^{2}\ln(1-\xi_{1})}}\right)\\
\varphi&=2\pi\xi_{2}
\end{aligned}$$
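One way to convince yourself that a sampling formula matches its NDF is to compare the fraction of samples that land inside a cone with the analytic CDF. The sketch below does that for Beckmann, assuming $\cos\theta=\sqrt{1/(1-\alpha^2\ln(1-\xi_1))}$ and $\varphi=2\pi\xi_2$ with $\xi_1,\xi_2\in[0,1)$. The names and the test values of $\alpha$ and $\theta_0$ are arbitrary.

```python
import math
import random

def beckmann_d(alpha, theta):
    """Beckmann normal distribution function D(m)."""
    c = math.cos(theta)
    t = math.tan(theta)
    return math.exp(-(t * t) / (alpha * alpha)) / (math.pi * alpha * alpha * c ** 4)

def sample_beckmann(alpha, xi1, xi2):
    """Map two uniform numbers in [0, 1) to (theta, phi) distributed as D(m) cos(theta)."""
    # 1 - xi1 lies in (0, 1], so the logarithm is finite and <= 0
    cos_theta = math.sqrt(1.0 / (1.0 - alpha * alpha * math.log(1.0 - xi1)))
    return math.acos(cos_theta), 2.0 * math.pi * xi2

# Empirical check: fraction of samples with theta < theta0 vs. the analytic CDF
rng = random.Random(9)
alpha, theta0, n = 0.5, 0.4, 50_000
inside = sum(sample_beckmann(alpha, rng.random(), rng.random())[0] < theta0 for _ in range(n))
analytic = 1.0 - math.exp(-math.tan(theta0) ** 2 / alpha ** 2)
print(f"sampled fraction={inside / n:.4f}  analytic CDF={analytic:.4f}")
```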

Lastly, **Blinn** is defined as:

$$D_{Blinn}(m)=\frac{\alpha+2}{2\pi}\left(\cos(\theta)\right)^{\alpha}$$

Which can be sampled with:

$$\begin{aligned}
\theta&=\arccos\left(\xi_{1}^{\frac{1}{\alpha+2}}\right)\\
\varphi&=2\pi\xi_{2}
\end{aligned}$$
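In a renderer the sampled angles are usually turned into a direction vector straight away. The following sketch does this for Blinn in a local frame where the surface normal is the +z axis, assuming $\cos\theta=\xi_1^{1/(\alpha+2)}$ (the inverse CDF of $D_{Blinn}(m)\cos\theta$) and $\varphi=2\pi\xi_2$. The function name and the local-frame convention are assumptions of this example.

```python
import math
import random

def sample_blinn(alpha, xi1, xi2):
    """Return a unit microfacet normal (x, y, z) in a local frame whose normal is +z."""
    cos_theta = xi1 ** (1.0 / (alpha + 2.0))
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    phi = 2.0 * math.pi * xi2
    # Standard spherical-to-Cartesian conversion
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)

rng = random.Random(21)
h = sample_blinn(20.0, rng.random(), rng.random())
print(h)
```

Larger exponents $\alpha$ push $\cos\theta$ towards 1, so the sampled normals cluster more tightly around the surface normal.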

## Putting it in Practice

Let's look at a basic backwards path tracer:

```
void RenderPixel(uint x, uint y, UniformSampler *sampler) {
    Ray ray = m_scene->Camera.CalculateRayFromPixel(x, y, sampler);

    float3 color(0.0f);
    float3 throughput(1.0f);

    // Bounce the ray around the scene
    for (uint bounces = 0; bounces < 10; ++bounces) {
        m_scene->Intersect(ray);

        // The ray missed. Return the background color
        if (ray.geomID == RTC_INVALID_GEOMETRY_ID) {
            color += throughput * float3(0.846f, 0.933f, 0.949f);
            break;
        }

        // We hit an object

        // Fetch the material
        Material *material = m_scene->GetMaterial(ray.geomID);
        // The object might be emissive. If so, it will have a corresponding light
        // Otherwise, GetLight will return nullptr
        Light *light = m_scene->GetLight(ray.geomID);

        // If we hit a light, add the emissive light
        if (light != nullptr) {
            color += throughput * light->Le();
        }

        float3 normal = normalize(ray.Ng);
        float3 wo = normalize(-ray.dir);
        float3 surfacePos = ray.org + ray.dir * ray.tfar;

        // Get the new ray direction
        // Choose the direction based on the material
        float3 wi = material->Sample(wo, normal, sampler);
        float pdf = material->Pdf(wi, normal);

        // Accumulate the brdf attenuation
        throughput = throughput * material->Eval(wi, wo, normal) / pdf;

        // Shoot a new ray

        // Set the origin at the intersection point
        ray.org = surfacePos;

        // Reset the other ray properties
        ray.dir = wi;
        ray.tnear = 0.001f;
        ray.tfar = embree::inf;
        ray.geomID = RTC_INVALID_GEOMETRY_ID;
        ray.primID = RTC_INVALID_GEOMETRY_ID;
        ray.instID = RTC_INVALID_GEOMETRY_ID;
        ray.mask = 0xFFFFFFFF;
        ray.time = 0.0f;
    }

    m_scene->Camera.FrameBuffer.SplatPixel(x, y, color);
}
```

That is, we bounce around the scene, accumulating color and light attenuation as we go. At each bounce, we have to choose a new direction for the ray. As mentioned above, we *could* uniformly sample the hemisphere to generate the new ray. However, the code is smarter; it importance samples the new direction based on the BRDF. (Note: this is the input direction, because we are a backwards path tracer.)

```
// Get the new ray direction
// Choose the direction based on the material
float3 wi = material->Sample(wo, normal, sampler);
float pdf = material->Pdf(wi, normal);
```

Which could be implemented as:

```
float3 LambertBRDF::Sample(float3 outputDirection, float3 normal, UniformSampler *sampler) {
    // Pick a point uniformly on the unit disk...
    float rand = sampler->NextFloat();
    float r = std::sqrtf(rand);
    float theta = sampler->NextFloat() * 2.0f * M_PI;

    float x = r * std::cosf(theta);
    float y = r * std::sinf(theta);

    // Project z up to the unit hemisphere
    float z = std::sqrtf(1.0f - x * x - y * y);

    return normalize(TransformToWorld(x, y, z, normal));
}

float3a TransformToWorld(float x, float y, float z, float3a &normal) {
    // Find an axis that is not parallel to normal
    float3a majorAxis;
    if (abs(normal.x) < 0.57735026919f /* 1 / sqrt(3) */) {
        majorAxis = float3a(1, 0, 0);
    } else if (abs(normal.y) < 0.57735026919f /* 1 / sqrt(3) */) {
        majorAxis = float3a(0, 1, 0);
    } else {
        majorAxis = float3a(0, 0, 1);
    }

    // Use majorAxis to create a coordinate system relative to world space
    float3a u = normalize(cross(normal, majorAxis));
    float3a v = cross(normal, u);
    float3a w = normal;

    // Transform from local coordinates to world coordinates
    return u * x +
           v * y +
           w * z;
}

float LambertBRDF::Pdf(float3 inputDirection, float3 normal) {
    // Cosine-weighted hemisphere: pdf = cos(theta) / pi
    return dot(inputDirection, normal) * M_1_PI;
}
```

After we sample the inputDirection ('wi' in the code), we use that to calculate the value of the BRDF. And then we divide by the pdf as per the Monte Carlo formula:

```
// Accumulate the brdf attenuation
throughput = throughput * material->Eval(wi, wo, normal) / pdf;
```

Where *Eval()* is just the BRDF function itself (Lambert, Blinn-Phong, Cook-Torrance, etc.):

```
float3 LambertBRDF::Eval(float3 inputDirection, float3 outputDirection, float3 normal) const {
    // Lambert BRDF (albedo / pi) multiplied by the cosine term of the rendering equation
    return m_albedo * M_1_PI * dot(inputDirection, normal);
}
```
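As a quick sanity check of the three Lambert functions, the following Python sketch re-implements them in a local frame where the normal is the +z axis, so no `TransformToWorld` is needed. That simplification, the scalar `ALBEDO`, and the function names are assumptions made for this example. Because the pdf is proportional to the same cosine term that `Eval()` contains, the ratio of the two collapses to the albedo for every sample.

```python
import math
import random

ALBEDO = 0.8  # hypothetical scalar albedo; the C++ code uses an RGB m_albedo

def sample(rng):
    """Cosine-weighted direction in a local frame whose normal is +z."""
    r = math.sqrt(rng.random())
    phi = 2.0 * math.pi * rng.random()
    x = r * math.cos(phi)
    y = r * math.sin(phi)
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))  # project up to the unit hemisphere
    return (x, y, z)

def pdf(wi):
    return wi[2] / math.pi  # cos(theta) / pi

def eval_brdf(wi):
    return ALBEDO / math.pi * wi[2]  # albedo / pi * cos(theta), as in Eval()

rng = random.Random(3)
for _ in range(5):
    wi = sample(rng)
    p = pdf(wi)
    if p > 0.0:  # skip the (vanishingly rare) exactly-grazing sample
        print(f"cos(theta)={wi[2]:.4f}  weight={eval_brdf(wi) / p:.4f}")
```

A constant weight means the throughput update adds no variance of its own for a Lambert surface; the remaining noise comes from what the bounced rays end up hitting.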