Game Engine Review: Shader Management

Chen  —  10 months, 3 weeks ago [Edited 1 minute later]
Hi everyone. This post is just me picking up where I left off in the last blog post, so it's gonna be a short one.

Temporary shaders are often written for debugging and need to be added and removed frequently. The same goes for merging shaders to speed up rendering and splitting shaders for better reusability. If we just build shaders separately and store them in their own files, managing those files and editing the code that builds and stores the shaders becomes an unpleasant hassle very quickly. Unable to bear this chore, I set out to resolve the issue.

Runtime Uber Shader

My first approach to this growing problem was a runtime uber shader. A runtime uber shader is a single shader that contains all the shader subroutines. Instead of binding different shaders by calling glUseProgram() for different operations, I bind the uber shader permanently and upload a flag as a uniform to select which routine it executes.

An example uber shader would look something like this:

// some data layout here

uniform int Mode;

void main() 
{
    if (Mode == SHADER_MODE_PHONG_SHADE) 
    {
        //phong shading code
    }
    else if (Mode == SHADER_MODE_BLUR)
    {
        //blur code
    }
    else if (Mode == SHADER_MODE_DEPTH_PASS)
    {
        //depth pass code
    }
    //and a lot of other subroutines
}


And the C++ code that uses this shader looks like this:

UploadShaderUniform("Mode", SHADER_MODE_PHONG_SHADE);
DrawWorld();
PrepareStatesForBlur();
UploadShaderUniform("Mode", SHADER_MODE_BLUR);
DrawFullscreenQuad();
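
The shader above compares Mode against named constants such as SHADER_MODE_PHONG_SHADE, so the GLSL side has to know their values too. One possible way to keep both sides in sync, sketched below, is to list the modes once in C++ and generate a block of #define lines to paste above the shader source. This is an assumption for illustration, not necessarily how Monter does it; the shader_mode enum, the ShaderModeNames array, and MakeModeDefines are hypothetical names.

```cpp
#include <string>

// Hypothetical mode list; the values only need to match between C++ and GLSL.
enum shader_mode
{
    SHADER_MODE_PHONG_SHADE,
    SHADER_MODE_BLUR,
    SHADER_MODE_DEPTH_PASS,
    SHADER_MODE_COUNT
};

// Names in the same order as the enum above (hypothetical, kept in sync by hand).
static const char *ShaderModeNames[SHADER_MODE_COUNT] =
{
    "SHADER_MODE_PHONG_SHADE",
    "SHADER_MODE_BLUR",
    "SHADER_MODE_DEPTH_PASS",
};

// Builds one "#define NAME value" line per mode, ready to be placed
// above the uber shader source before it is compiled.
std::string MakeModeDefines()
{
    std::string Result;
    for (int Mode = 0; Mode < SHADER_MODE_COUNT; ++Mode)
    {
        Result += "#define ";
        Result += ShaderModeNames[Mode];
        Result += " " + std::to_string(Mode) + "\n";
    }
    return Result;
}
```

The same enum values can then be passed to UploadShaderUniform on the C++ side, so a mode only has to be added in one place.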


It pretty much solves the shader management problem. You only have to build a single shader and keep all the shader code in one place, and it runs different code depending on the mode you set it to. It worked pretty well at first, so I adopted it and used it to build out the renderer.

Uber Shader Performance

After I finished the rendering pipeline with the uber shader, it had a dozen branches. It was holding up pretty well until I moved my development environment from a PC to a laptop with a crappy GPU. When I ran my game on the new laptop, rendering was dramatically slower; it was so severe that the game barely ran at 60 FPS with just Phong shading on. That led me to suspect that the whole uber shader approach was slowing everything down. There was no way to prove it without pulling the shader apart into small pieces that run on their own, so I started pulling them out.

My suspicion turned out to be correct. Without the branches, the shaders ran substantially faster. Now, this behavior is probably hardware dependent, but I want Monter to run even on machines with crappy hardware, so the uber shader approach isn’t gonna cut it.

“Compile-time branching” in Shaders

We are back to square one with small pieces of shaders lying around that need to be managed manually. Another simple solution that came to mind is to wrap each part of the shader code in an #if defined(...) guard and store all of them in one file. When we compile the file into a certain type of shader, we insert “#define <SHADER_TYPE>” at the top of the source string so that the shader compiler only compiles the code for that shader.

Here’s what the shader looks like:

//common code

#if defined(SHADER_PHONG_SHADE)
//phong shading code
#endif

#if defined(SHADER_DEPTH_PASS)
//depth pass code
#endif

#if defined(SHADER_BLUR)
//blur code
#endif


Here’s the C++ code that builds these shaders:

shader PhongShader = BuildShader(ShaderCode, "#define SHADER_PHONG_SHADE");
shader DepthPassShader = BuildShader(ShaderCode, "#define SHADER_DEPTH_PASS");
shader BlurShader = BuildShader(ShaderCode, "#define SHADER_BLUR");
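
One detail worth keeping in mind when inserting the #define: GLSL requires a #version directive, if there is one, to come before everything except comments and whitespace, so the define cannot always go at the very top of the string. Below is a minimal sketch of just the string-assembly part of something like BuildShader. ComposeShaderSource is a hypothetical helper name, the actual compile step is left out, and the search for "#version" is deliberately naive (it would also match the word inside a comment).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the shader source with Define inserted so that
// any "#version" line still comes first, as GLSL requires.
std::string ComposeShaderSource(const std::string &Code, const std::string &Define)
{
    std::size_t InsertAt = 0;
    std::size_t Version = Code.find("#version");
    if (Version != std::string::npos)
    {
        // Insert right after the end of the #version line
        std::size_t LineEnd = Code.find('\n', Version);
        InsertAt = (LineEnd == std::string::npos) ? Code.size() : LineEnd + 1;
    }

    std::string Result = Code.substr(0, InsertAt);
    if (!Result.empty() && Result.back() != '\n')
    {
        Result += "\n"; // the #version line was the last line and had no newline
    }
    Result += Define + "\n";
    Result += Code.substr(InsertAt);
    return Result;
}
```

If the shader file has no #version line, the define simply ends up as the first line, which matches the behavior described above.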


Automatic Shader Construction

We now have one single place to store all the shaders, but it’s still a huge hassle to visit the shader building code this frequently. I wanted to do better.

The first thing I noticed is that, when we compile the shader file, we can already deduce which shaders exist by looking at the compile-time branching code. If each shader segment is marked, we can compile all the segments separately and dump them into a shader table. To “mark” the shader segments, I replaced the #if defined(...) guards with my own annotation syntax, which is preprocessed by the shader builder code. Each annotation also carries a name tag, which is used as the shader’s key when the shader table is built. This way, the shader builder can pull the shader segments out of the file, compile them into separate shaders, and insert them into a table under their own unique keys.
So things become very convenient; we only have to edit the shader code, and everything else is automated.

Here’s the previous shader code transformed into annotated form:

//common code

@begin SHADER_PHONG_SHADE => PhongShade
//phong shading code
@end

@begin SHADER_DEPTH_PASS => DepthPass
//depth pass code
@end

@begin SHADER_BLUR => Blur
//blur code
@end
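
The preprocessing step of the shader builder could look roughly like the sketch below. It is not Monter's actual code: ExtractShaderSegments is a hypothetical name, annotations are assumed to start at the beginning of a line, the token before "=>" is ignored because the post does not say how it is used, and the result is a table of source strings rather than compiled shaders.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Splits annotated shader text into one source string per "@begin ... => Name" block.
// Lines outside any block are treated as common code and copied into every shader.
std::map<std::string, std::string> ExtractShaderSegments(const std::string &Source)
{
    // (owner name tag, line text); an empty owner marks common code
    std::vector<std::pair<std::string, std::string>> Lines;
    std::vector<std::string> Names;
    std::string Current; // name tag of the block being read, empty when outside a block

    std::istringstream Stream(Source);
    std::string Line;
    while (std::getline(Stream, Line))
    {
        if (Line.compare(0, 6, "@begin") == 0)
        {
            // The name tag is whatever follows "=>", with surrounding whitespace trimmed
            std::size_t Arrow = Line.find("=>");
            if (Arrow == std::string::npos) continue; // malformed annotation: skip it
            std::size_t First = Line.find_first_not_of(" \t\r", Arrow + 2);
            if (First == std::string::npos) continue; // no name tag given: skip it
            std::size_t Last = Line.find_last_not_of(" \t\r");
            Current = Line.substr(First, Last - First + 1);
            Names.push_back(Current);
        }
        else if (Line.compare(0, 4, "@end") == 0)
        {
            Current.clear();
        }
        else
        {
            Lines.push_back({Current, Line});
        }
    }

    // Build one source string per name tag: common lines plus that block's lines,
    // in the order they appear in the file.
    std::map<std::string, std::string> Table;
    for (const std::string &Name : Names)
    {
        if (Table.count(Name)) continue; // a repeated name tag was already merged
        std::string &Out = Table[Name];
        for (const auto &Entry : Lines)
        {
            if (Entry.first.empty() || Entry.first == Name) Out += Entry.second + "\n";
        }
    }
    return Table;
}
```

Each string in the returned table would then be compiled into its own shader and stored under the same key.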


If the game code needs to use the shader, all it needs to do is:
BindShader(ShaderTable["DepthPass"]);
//draw calls that do something


Victory

At this point, all the hassles have been automated. Adding a new shader to our system is as easy as typing a few dozen characters in the shader file. Not to mention that Monter has a live shader editing system, so new shaders can be inserted while the game is running. How nice is that!
#13635
SedatedSnail  —  10 months, 3 weeks ago
If I recall correctly, GLSL 4.0 has shader subroutines. They're similar to function pointers and can be set via uniforms. I haven't used them myself, but they might be useful.
#13640
ratchetfreak  —  10 months, 3 weeks ago
SedatedSnail
If I recall correctly, GLSL 4.0 has shader subroutines. They're similar to function pointers and can be set via uniforms. I haven't used them myself, but they might be useful.


But hardware that can use them will not have an issue with a uniform-powered switch (it's very likely implemented the exact same way).