Debugging Cascaded Shadow Maps

I want to try something different for this post. Rather than just a boring rundown of the technology I implemented, I am going to write about how I approached solving a bug in my shadow system. I think this post is a lot more down to earth than the other ones, and I hope you will enjoy it.

Cascaded Shadow Map (CSM) bug

Alright, just when I wanted to take a shot from a different angle to show off my new assets, my shadow map broke. You can clearly see the huge gap in the shadow on the right of this screenshot:

From the look of it, the problem seems to be in how my cascade volumes are computed. The first thing to do is step through the code that calculates the cascade volumes. Well, the values look right until they get multiplied with a bunch of other matrices and turn into numbers that are hard to reason about.

Debugging the CSM

In situations like these, a graphical debugger is priceless. I would just pass the 8 view frustum corners into some renderer that draws them in the game for me, and immediately I could see what's going on. Unfortunately, Monter doesn't have that yet, and I'm too lazy to implement one right now. Since I have the shadow map texture stored, I can start by inspecting that first. So I quickly wrote a shadow map texture viewer routine. When my camera hits an angle that produces the artifact above, I switch to texture viewing mode to see what's going on with the shadow map.

Immediately I can see what's causing the artifact. This is the shot taken:

And this is the shadow map texture at that exact frame:

(Each of these four splits is used for one of the four shadow cascades)
The shadow map is offset too far to one side for some strange reason.

After playing around with it a little bit, a spark of brilliance struck me: I could just turn up the PCF rate and make the light direction tangential to the plane surface, so that the area covered by the shadow map has noise on it. That way, I can easily inspect which part of the scene the shadow map covers! That solution worked beautifully:

From the gif, you can clearly see the four shadow map splits. Depending on the camera's orientation, these shadow maps resize, and they are incorrectly sized when the camera is facing toward the negative X axis.

Guided by what appeared on the screen, I arrived at these lines of code that most likely cause this behavior:

for (int CornerIndex = 0; CornerIndex < ARRAY_COUNT(FrustumCorners); ++CornerIndex)
{
    FrustumCorners[CornerIndex] = ApplyMat4(FrustumCorners[CornerIndex], Inverse(View) * LightSpaceView);
}

All this code does is transform the view frustum corners from view space to light space, and that's probably where everything went wrong.

To make it easier to see what's going on in the code, I pulled stuff out to make each operation more explicit:

mat4 InverseView = Inverse(View);
for (int CornerIndex = 0; CornerIndex < ARRAY_COUNT(FrustumCorners); ++CornerIndex)
{
    FrustumCorners[CornerIndex] = ApplyMat4(FrustumCorners[CornerIndex], InverseView);
}
for (int CornerIndex = 0; CornerIndex < ARRAY_COUNT(FrustumCorners); ++CornerIndex)
{
    FrustumCorners[CornerIndex] = ApplyMat4(FrustumCorners[CornerIndex], LightSpaceView);
}
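Splitting the combined transform into two passes should not change the result, because applying A and then B to a row vector is the same as applying A * B once. Here is a minimal sketch of that equivalence. It assumes ApplyMat4 treats a point as the row vector (x, y, z, 1) and multiplies it on the left of the matrix, which is what the multiplication order in the snippet above suggests. The v3 and mat4 types and both function bodies are stand-ins for illustration, not the engine's real ones.

```cpp
#include <cassert>

struct v3 { float X, Y, Z; };
struct mat4 { float Data[4][4]; };

// Plain 4x4 matrix product (stand-in for the engine's operator*).
mat4 operator*(const mat4 &A, const mat4 &B)
{
    mat4 Result = {};
    for (int R = 0; R < 4; ++R)
        for (int C = 0; C < 4; ++C)
            for (int K = 0; K < 4; ++K)
                Result.Data[R][C] += A.Data[R][K] * B.Data[K][C];
    return Result;
}

// Assumed convention: computes (X, Y, Z, 1) * M and drops the W component.
// Only valid for affine matrices, so that precondition is asserted.
v3 ApplyMat4(v3 P, const mat4 &M)
{
    assert(M.Data[0][3] == 0.0f && M.Data[1][3] == 0.0f &&
           M.Data[2][3] == 0.0f && M.Data[3][3] == 1.0f);
    float In[4] = {P.X, P.Y, P.Z, 1.0f};
    float Out[3] = {0.0f, 0.0f, 0.0f};
    for (int C = 0; C < 3; ++C)
        for (int K = 0; K < 4; ++K)
            Out[C] += In[K] * M.Data[K][C];
    return {Out[0], Out[1], Out[2]};
}
```

With this convention, ApplyMat4(ApplyMat4(P, A), B) and ApplyMat4(P, A * B) give the same point, so any difference in behavior has to come from one of the matrices themselves.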

Inverse() is a new function that I introduced not long ago, so it might be the culprit. So I stepped in and inspected InverseView when the artifact appeared. The result was quite strange: InverseView's first row was basically a zero vector, and I expected it to be a normalized vector (by the way, the convention I’m using is row-major matrices and a left-handed coordinate system).

Actual values inside InverseView:

As a view matrix, even when inverted, the first three row vectors should be the three orthogonal axes of the local view coordinate system, but here the first row vector is a zero vector. I compared the results against a trustworthy online matrix inverse calculator, and it agrees with me that the first row vector should be a normalized vector even after inversion. Therefore I concluded that my Inverse() is busted.
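This expectation can also be checked in code instead of by eye. The sketch below tests whether the first three rows of a matrix are unit length and mutually orthogonal. It assumes the view matrix is a rigid transform (rotation plus translation, no scale) and that the axes live in the rows, as described above. HasOrthonormalAxes, AssertLooksLikeInverseView, and the mat4 layout are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cmath>

struct mat4 { float Data[4][4]; };

// Returns true if the upper-left 3x3 rows are unit length and
// pairwise orthogonal, within the given tolerance.
bool HasOrthonormalAxes(const mat4 &M, float Tolerance = 1e-4f)
{
    for (int A = 0; A < 3; ++A)
    {
        for (int B = A; B < 3; ++B)
        {
            float Dot = 0.0f;
            for (int C = 0; C < 3; ++C)
            {
                Dot += M.Data[A][C] * M.Data[B][C];
            }
            // A row dotted with itself should be 1, with another row 0.
            float Expected = (A == B) ? 1.0f : 0.0f;
            if (std::fabs(Dot - Expected) > Tolerance)
            {
                return false;
            }
        }
    }
    return true;
}

// Debug-build tripwire that could sit right after an Inverse(View) call.
void AssertLooksLikeInverseView(const mat4 &M)
{
    assert(HasOrthonormalAxes(M));
}
```

A matrix whose first row is a zero vector, like the broken InverseView here, fails the check immediately because that row's dot product with itself is 0 rather than 1.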

Correctly inverted view matrix values:

Diving into Inverse()

I use the Gauss-Jordan elimination method to invert matrices, so there are quite a few steps to go through to find what went wrong. After some digging, I found a subtle bug in this code snippet:

//scale all pivots to 1
for (int R = 0; R < 4; ++R)
{
    for (int C = 0; C < 4; ++C)
    {
        Result.Data[R][C] /= Augment.Data[R][R];
        Augment.Data[R][C] /= Augment.Data[R][R];
    }
}

At the end of the Gauss-Jordan elimination algorithm, every row is scaled so that its pivot becomes 1 again. When I do this operation in my head, it happens in parallel. However, when the machine executes it, it can only scale one element at a time. In this code, we scale each element by the pivot, but the pivot itself is also getting scaled. Once C reaches R, Augment.Data[R][R] is overwritten with 1, so every element after it in the same row is divided by 1 and left unscaled, which produces incorrect results.

We can fix this problem by caching the pivot value first, then applying it to each row element:

//scale all pivots to 1
for (int R = 0; R < 4; ++R)
{
    f32 Scale = 1.0f / Augment.Data[R][R];
    for (int C = 0; C < 4; ++C)
    {
        Result.Data[R][C] *= Scale;
        Augment.Data[R][C] *= Scale;
    }
}
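To see the difference concretely, here is a small standalone repro that scales a single row both ways. It is a sketch for illustration, not code from the engine: ScaleRowBuggy and ScaleRowFixed are hypothetical names, and one float array stands in for a matching row of Result and Augment.

```cpp
#include <cassert>

// Buggy variant: re-reads the pivot on every iteration, even after
// the pivot element itself has been overwritten with 1.
void ScaleRowBuggy(float Row[4], int Pivot)
{
    assert(Row[Pivot] != 0.0f);
    for (int C = 0; C < 4; ++C)
    {
        Row[C] /= Row[Pivot];
    }
}

// Fixed variant: reads the pivot once, before any element is modified.
void ScaleRowFixed(float Row[4], int Pivot)
{
    assert(Row[Pivot] != 0.0f);
    float Scale = 1.0f / Row[Pivot];
    for (int C = 0; C < 4; ++C)
    {
        Row[C] *= Scale;
    }
}
```

For the row {2, 4, 6, 8} with the pivot at index 1, the buggy version produces {0.5, 1, 6, 8}: everything after the pivot is left unscaled. The fixed version produces {0.5, 1, 1.5, 2}.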

In fact, since this is the last part of the algorithm, the half of the augmented matrix that is being reduced to the identity has no further use. We can stop modifying the augmented matrix and only read its pivots to scale the result matrix.

//scale all pivots to 1
for (int R = 0; R < 4; ++R)
{
    for (int C = 0; C < 4; ++C)
    {
        Result.Data[R][C] /= Augment.Data[R][R]; // divide by the row's pivot, which is no longer modified
    }
}
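A cheap way to guard against this kind of bug coming back is to multiply a matrix by its computed inverse and check that the product is the identity. The sketch below is a self-contained Gauss-Jordan inverse with that check. It is not the engine's actual Inverse(): the function names, the mat4 layout, the partial pivoting step, and the singularity threshold are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

struct mat4 { float Data[4][4]; };

// Inverts Input by Gauss-Jordan elimination with partial pivoting.
// Returns false if the matrix looks singular.
bool InvertGaussJordan(const mat4 &Input, mat4 *Output)
{
    assert(Output != nullptr);
    mat4 Augment = Input; // reduced to a diagonal matrix
    mat4 Result = {};     // starts as identity, ends as the inverse
    for (int I = 0; I < 4; ++I) Result.Data[I][I] = 1.0f;

    for (int Col = 0; Col < 4; ++Col)
    {
        // Pick the row with the largest entry in this column as the pivot row.
        int PivotRow = Col;
        for (int R = Col + 1; R < 4; ++R)
            if (std::fabs(Augment.Data[R][Col]) > std::fabs(Augment.Data[PivotRow][Col]))
                PivotRow = R;
        if (std::fabs(Augment.Data[PivotRow][Col]) < 1e-8f) return false;
        for (int C = 0; C < 4 && PivotRow != Col; ++C)
        {
            std::swap(Augment.Data[Col][C], Augment.Data[PivotRow][C]);
            std::swap(Result.Data[Col][C], Result.Data[PivotRow][C]);
        }
        // Clear this column in every other row. Factor is cached up front
        // for the same reason the pivot has to be cached when scaling.
        for (int R = 0; R < 4; ++R)
        {
            if (R == Col) continue;
            float Factor = Augment.Data[R][Col] / Augment.Data[Col][Col];
            for (int C = 0; C < 4; ++C)
            {
                Augment.Data[R][C] -= Factor * Augment.Data[Col][C];
                Result.Data[R][C] -= Factor * Result.Data[Col][C];
            }
        }
    }
    // Scale every row by its pivot; Augment is only read from here on.
    for (int R = 0; R < 4; ++R)
    {
        float Scale = 1.0f / Augment.Data[R][R];
        for (int C = 0; C < 4; ++C) Result.Data[R][C] *= Scale;
    }
    *Output = Result;
    return true;
}

mat4 Multiply(const mat4 &A, const mat4 &B)
{
    mat4 Result = {};
    for (int R = 0; R < 4; ++R)
        for (int C = 0; C < 4; ++C)
            for (int K = 0; K < 4; ++K)
                Result.Data[R][C] += A.Data[R][K] * B.Data[K][C];
    return Result;
}

bool IsIdentity(const mat4 &M, float Tolerance = 1e-4f)
{
    for (int R = 0; R < 4; ++R)
        for (int C = 0; C < 4; ++C)
            if (std::fabs(M.Data[R][C] - (R == C ? 1.0f : 0.0f)) > Tolerance)
                return false;
    return true;
}
```

Running IsIdentity(Multiply(M, Inverse)) over a handful of view matrices at different camera angles would likely have caught the scaling bug before it showed up as a shadow artifact.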

Now CSM works at all view angles. I am happy again.

Here’s a full shot of the scene, with SSAO turned on: