In this post I will be discussing the collision algorithm I used to handle interaction between the player and his surroundings. It is based on Fauerby’s paper.

I really struggled to find good resources on collision detection and collision response so hopefully my post can serve as somewhat of a starting point if you want to implement your own collision system.

**Collision object representation**

When you render a game scene, almost all the objects are represented by a triangle meshes. They help provide fine-grained details that you see on the screen. However, they aren’t always the best candidates to represent objects in a collision algorithm due to performance implications. Even when they are used, there are strict requirements on what type of mesh they have to be; maybe they have to be convex or have a low polygon count for the algorithm to work well. It is much more common to use a simple primitive shape, such as a sphere or ellipsoid, to approximate complex geometry instead. However, it requires artists and game world editors to manually fit primitives onto meshes, which is a lot of hassle.

In my implementation, the player is approximated by an ellipsoid and the scene is represented by the same triangle mesh I use to draw it. That’s really convenient because player’s ellipsoid can be automatically generated by bounding the player’s mesh and the scene’s triangle mesh representation is already loaded into the game anyway. I don’t really have to do any manual work here, and that’s great.

**Finding time of impact (TOI)**

So let me present what the problem is first. Every frame, the player will move along a certain direction by a certain amount, let’s call it velocity vector. Our job here is to ensure no matter what, the player won’t be able to move through a solid wall, clip through terrain, or do any other weird stuff. In order to do that, we need to obtain the initial time of impact (TOI) along this velocity vector when our player first collides with the scene geometry. Let’s call this time TOI, and it has a range of [0, 1]. TOI is how much percentage of the velocity vector we are allowed to travel and not penetrate the environment, hence it’s normalized.

For example, say our player is walking up a staircase. In the illustration below, vector V is obviously too big and we can only travel a portion of it. TOI will be the how much of V we can travel, and TOI * V will be the adjusted vector.

How do we obtain TOI? We can iterate over each triangle in the triangle mesh and obtain their own respective TOI, then compare all of them to find the minimum TOI. That minimum TOI is the TOI with respect to the entire mesh. So if we have a way to find the TOI of a moving ellipsoid to a triangle, we can find the TOI of a moving ellipsoid to an entire triangle mesh.

**Simplify the problem by change of basis**

This problem could be a lot easier if we had a moving sphere instead. Well, we can easily scale all points and vectors in our collision space by a certain amount so that our original ellipsoid turns into a unit sphere. What that basically does is transforming all our geometry into a different basis where our ellipsoid is a unit sphere. Let’s call this new basis eSpace. Then, we can find TOI in eSpace, multiply TOI to the motion vector in eSpace, then transform the result vector back to default basis. We lost our TOI, but we have the correct motion vector instead. [Edit: We actually haven’t lost our TOI yet. TOI is the same in both basis since it’s a scaling factor]

**In preparation for collision response**

This is all good if we just have to detect the collision and stop our player from penetrating. But just doing that will result in player stuck in wall or a slope where they should be able to slide along them. This “sliding” process is called collision response. In order to handle it correctly, we need the intersection point of the collision in addition to the TOI. It will make sense when we get to the collision response part.

**Moving (swept) unit sphere vs triangle collision detection**

If a sphere were to intersect with the triangle along its way, then there are three ways they could intersect each other:

Sphere touches triangle’s plane and simultaneously touches the inside that triangle’s area

Sphere touches one of triangle’s vertices

Sphere touches one of triangle’s edges

In his paper, Fauerby called the first case “inside the triangle”, which I think is a very bad way to word it. If we count triangle’s vertices and edges as insides, then inclusively, all three cases are collision “inside the triangle”. Just want to emphasize it here for those who are going to read his paper.

These cases aren’t mutually exclusive. You can imagine a sphere first touches the inside of a triangle within the same plane, then sinks through it and touches its edge. One interesting observation is that if case 1 were to occur at all, it must occur before case 2 and case 3. Therefore, if case 1 happens, then we can assume it’s the minimal TOI and return. Testing case 2 and case 3 would be unnecessary at this point.

So how can we test them?

For case 1, we can construct a plane out of the triangle. Its normal can be computed by the cross product of its edges in its winding order, and project any of its vertices onto the normal to obtain the plane constant, i.e. distance from plane to origin. Having acquired our plane, we can test intersection by finding TOI when the distance between sphere’s center and the plane is exactly 1.

Here’s the derivation for TOI:

If TOI is within [0, 1], that means sphere collides with the plane. Let’s name the collision point CP. We can derive CP as follows:

After acquiring CP, we can test whether or not it’s within the triangle by using barycentric coordinate and cramer’s rule:

The test above does not always succeed. When plane is completely horizontal, the determinant used as denominator will be zero. In that case, different axis must be chosen to solve with cramer’s rule. As an alternative, you can also use the test in Fauerby’s paper, but he never explains how it works.

If CP is within the triangle, then we can early exit and say this is the TOI with respect to this certain triangle. If not, we must proceed to case 2 and case 3.

For case 2, you simply find TOI when sphere collides against a vertex and determine if it’s within [0, 1]. Again, collision occurs when the distance between sphere’s center and the vertex is sphere’s radius, which is 1. Derivation:

CP is just the vertex in this case.

For case 3, it’s really the same thing. You first find the TOI when sphere is 1 unit away from the infinite edge made up of any two vertices. Then, check if the collision point is between these two vertices, if so, then that TOI is valid. CP can be acquired by using that result to interpolate between the two vertices. Do it for all edges and pick the minimum. Case 3’s derivation follows the same procedure as case 2 but is substantially more work than case 2, and I’m pretty drained at this point, so solve it as an exercise :), or you can cheat by looking at Fauerby’s answers.

Repeat this procedure for all triangles inside the mesh, and we will obtain time of impact and the collision point if a collision occurs.

**Collision response**

Fauerby proposes a pretty good way to resolve collisions. His idea is project the remaining velocity onto the tangent plane that collision point makes on the unit sphere’s surface. Now, the procedure will recurse and use the new sliding vector as its new input. Recursion stops when input vector is too small or it reaches the recursion limit. It achieves smooth gliding even in extreme situations, such as climbing stairs.

First we acquire the normal of this tangent plane. It’s just the normalized vector from collision point to sphere’s center. Then we can use this normal to construct a reflecting vector by using the remaining velocity. Adding the reflecting vector to the remaining vector, we achieve the sliding vector.

**Numerical Robustness**

This algorithm is perfectly fine on paper. It ensures sphere never penetrates the triangle mesh and smoothly glide it along the mesh. However, if you implement the algorithm just like that, your character would still frequently get stuck and fall through the ground. Why?

The reason is because we cannot store numbers at infinite precisions on a computer. Floating point numbers only have limited precisions and gives imprecise results. If we apply an algorithm that pushes the sphere to exactly touches the triangle mesh at one single point, then the actual point stored in the computer might be too much into the mesh due to floating-point errors. If any penetration happens, then the algorithm fails.

Code that anticipates and mitigates this type of error is called numerically robust. In order to make my code numerically robust, I had to make some adjustments. I first pushed sphere to when it initially collides with the mesh, then pull it back by a tiny amount. This way, I would ensure that floating point error is handled by this tiny adjustment.

**Results**

I loaded in some extra asset just to test out the collision system. So far the collision system is handling them pretty well. Player nicely glide along edges and it doesn’t awkwardly get stuck on edgy surfaces.

However, the performance is terrible. When I load the entire scene into collision mesh, each frame took about 30-40ms to process, which is unacceptable given that Monter is expected to run at 120 FPS. (but in the video you will notice Monter runs smoothly. How I achieved that will be the topic of my next blog post)

Here’s a recording demonstrating the collision system:

I really struggled to find good resources on collision detection and collision response so hopefully my post can serve as somewhat of a starting point if you want to implement your own collision system.

When you render a game scene, almost all the objects are represented by a triangle meshes. They help provide fine-grained details that you see on the screen. However, they aren’t always the best candidates to represent objects in a collision algorithm due to performance implications. Even when they are used, there are strict requirements on what type of mesh they have to be; maybe they have to be convex or have a low polygon count for the algorithm to work well. It is much more common to use a simple primitive shape, such as a sphere or ellipsoid, to approximate complex geometry instead. However, it requires artists and game world editors to manually fit primitives onto meshes, which is a lot of hassle.

In my implementation, the player is approximated by an ellipsoid and the scene is represented by the same triangle mesh I use to draw it. That’s really convenient because player’s ellipsoid can be automatically generated by bounding the player’s mesh and the scene’s triangle mesh representation is already loaded into the game anyway. I don’t really have to do any manual work here, and that’s great.

So let me present what the problem is first. Every frame, the player will move along a certain direction by a certain amount, let’s call it velocity vector. Our job here is to ensure no matter what, the player won’t be able to move through a solid wall, clip through terrain, or do any other weird stuff. In order to do that, we need to obtain the initial time of impact (TOI) along this velocity vector when our player first collides with the scene geometry. Let’s call this time TOI, and it has a range of [0, 1]. TOI is how much percentage of the velocity vector we are allowed to travel and not penetrate the environment, hence it’s normalized.

For example, say our player is walking up a staircase. In the illustration below, vector V is obviously too big and we can only travel a portion of it. TOI will be the how much of V we can travel, and TOI * V will be the adjusted vector.

How do we obtain TOI? We can iterate over each triangle in the triangle mesh and obtain their own respective TOI, then compare all of them to find the minimum TOI. That minimum TOI is the TOI with respect to the entire mesh. So if we have a way to find the TOI of a moving ellipsoid to a triangle, we can find the TOI of a moving ellipsoid to an entire triangle mesh.

This problem could be a lot easier if we had a moving sphere instead. Well, we can easily scale all points and vectors in our collision space by a certain amount so that our original ellipsoid turns into a unit sphere. What that basically does is transforming all our geometry into a different basis where our ellipsoid is a unit sphere. Let’s call this new basis eSpace. Then, we can find TOI in eSpace, multiply TOI to the motion vector in eSpace, then transform the result vector back to default basis. We lost our TOI, but we have the correct motion vector instead. [Edit: We actually haven’t lost our TOI yet. TOI is the same in both basis since it’s a scaling factor]

This is all good if we just have to detect the collision and stop our player from penetrating. But just doing that will result in player stuck in wall or a slope where they should be able to slide along them. This “sliding” process is called collision response. In order to handle it correctly, we need the intersection point of the collision in addition to the TOI. It will make sense when we get to the collision response part.

If a sphere were to intersect with the triangle along its way, then there are three ways they could intersect each other:

Sphere touches triangle’s plane and simultaneously touches the inside that triangle’s area

Sphere touches one of triangle’s vertices

Sphere touches one of triangle’s edges

In his paper, Fauerby called the first case “inside the triangle”, which I think is a very bad way to word it. If we count triangle’s vertices and edges as insides, then inclusively, all three cases are collision “inside the triangle”. Just want to emphasize it here for those who are going to read his paper.

These cases aren’t mutually exclusive. You can imagine a sphere first touches the inside of a triangle within the same plane, then sinks through it and touches its edge. One interesting observation is that if case 1 were to occur at all, it must occur before case 2 and case 3. Therefore, if case 1 happens, then we can assume it’s the minimal TOI and return. Testing case 2 and case 3 would be unnecessary at this point.

So how can we test them?

For case 1, we can construct a plane out of the triangle. Its normal can be computed by the cross product of its edges in its winding order, and project any of its vertices onto the normal to obtain the plane constant, i.e. distance from plane to origin. Having acquired our plane, we can test intersection by finding TOI when the distance between sphere’s center and the plane is exactly 1.

Here’s the derivation for TOI:

If TOI is within [0, 1], that means sphere collides with the plane. Let’s name the collision point CP. We can derive CP as follows:

After acquiring CP, we can test whether or not it’s within the triangle by using barycentric coordinate and cramer’s rule:

The test above does not always succeed. When plane is completely horizontal, the determinant used as denominator will be zero. In that case, different axis must be chosen to solve with cramer’s rule. As an alternative, you can also use the test in Fauerby’s paper, but he never explains how it works.

If CP is within the triangle, then we can early exit and say this is the TOI with respect to this certain triangle. If not, we must proceed to case 2 and case 3.

For case 2, you simply find TOI when sphere collides against a vertex and determine if it’s within [0, 1]. Again, collision occurs when the distance between sphere’s center and the vertex is sphere’s radius, which is 1. Derivation:

CP is just the vertex in this case.

For case 3, it’s really the same thing. You first find the TOI when sphere is 1 unit away from the infinite edge made up of any two vertices. Then, check if the collision point is between these two vertices, if so, then that TOI is valid. CP can be acquired by using that result to interpolate between the two vertices. Do it for all edges and pick the minimum. Case 3’s derivation follows the same procedure as case 2 but is substantially more work than case 2, and I’m pretty drained at this point, so solve it as an exercise :), or you can cheat by looking at Fauerby’s answers.

Repeat this procedure for all triangles inside the mesh, and we will obtain time of impact and the collision point if a collision occurs.

Fauerby proposes a pretty good way to resolve collisions. His idea is project the remaining velocity onto the tangent plane that collision point makes on the unit sphere’s surface. Now, the procedure will recurse and use the new sliding vector as its new input. Recursion stops when input vector is too small or it reaches the recursion limit. It achieves smooth gliding even in extreme situations, such as climbing stairs.

First we acquire the normal of this tangent plane. It’s just the normalized vector from collision point to sphere’s center. Then we can use this normal to construct a reflecting vector by using the remaining velocity. Adding the reflecting vector to the remaining vector, we achieve the sliding vector.

This algorithm is perfectly fine on paper. It ensures sphere never penetrates the triangle mesh and smoothly glide it along the mesh. However, if you implement the algorithm just like that, your character would still frequently get stuck and fall through the ground. Why?

The reason is because we cannot store numbers at infinite precisions on a computer. Floating point numbers only have limited precisions and gives imprecise results. If we apply an algorithm that pushes the sphere to exactly touches the triangle mesh at one single point, then the actual point stored in the computer might be too much into the mesh due to floating-point errors. If any penetration happens, then the algorithm fails.

Code that anticipates and mitigates this type of error is called numerically robust. In order to make my code numerically robust, I had to make some adjustments. I first pushed sphere to when it initially collides with the mesh, then pull it back by a tiny amount. This way, I would ensure that floating point error is handled by this tiny adjustment.

I loaded in some extra asset just to test out the collision system. So far the collision system is handling them pretty well. Player nicely glide along edges and it doesn’t awkwardly get stuck on edgy surfaces.

However, the performance is terrible. When I load the entire scene into collision mesh, each frame took about 30-40ms to process, which is unacceptable given that Monter is expected to run at 120 FPS. (but in the video you will notice Monter runs smoothly. How I achieved that will be the topic of my next blog post)

Here’s a recording demonstrating the collision system:

That being said, Fauerby's is an O(n) algorithm that doesn't scale well as my scene grows, but I have ways to reduce its time complexity. If it turns out to still be slow, I will have to resort to GJK+EPA, but for now I prefer the convenience Fauerby's algorithm provides.

And that is why GJK+EPA isn't my top preference as far as static scene collision is concerned. However, I do plan to use GJK for dynamic object collision detection/response, but I have no idea how they work. The links you provided are really handy for that. Thanks!

nice job!

So a first step to speed things up would be to add some kind of broadphase filter, first a spacial partition so you don't have to check collision with objects 2 miles away every frame. Then you can use AABB (or GJK on the convex hull) of groups of triangles to get an early out again.

Keep triangles as large as possible, also consider that if you are never meant to go through a hole you can simply not model it in your collision mesh. There is no need for the visual and logical geometry to match up perfectly.

Like 4, 5 or 6 or more at the same time - some of them pushing/squeezing character into all sorts of directions.

If you find a bug, and then you *fix it*, how can you ever be sure that it's been fixed?

Unless the code is super-simple, with floating point numbers - I can never quite tell!

I had always sorta assumed (without giving any extra thought) that floating point numbers are just thing to use games.

Now i think it's a shitty leaky abstraction.

And apart from floats being faster than integer math (by now at least), is there really any reason to use them (except for astrology) in games?

ratchetfreak

A cursory look through the paper lookse like it's based on elipsoid (player) with triangle (static mesh) collisions,

So a first step to speed things up would be to add some kind of broadphase filter, first a spacial partition so you don't have to check collision with objects 2 miles away every frame. Then you can use AABB (or GJK on the convex hull) of groups of triangles to get an early out again.

Keep triangles as large as possible, also consider that if you are never meant to go through a hole you can simply not model it in your collision mesh. There is no need for the visual and logical geometry to match up perfectly.

Yep, totally agreed.

pragmatic_hero

Test your collision system when the character collides with more planes!

Like 4, 5 or 6 or more at the same time - some of them pushing/squeezing character into all sorts of directions.

If you find a bug, and then you *fix it*, how can you ever be sure that it's been fixed?

Unless the code is super-simple, with floating point numbers - I can never quite tell!

I had always sorta assumed (without giving any extra thought) that floating point numbers are just thing to use games.

Now i think it's a shitty leaky abstraction.

And apart from floats being faster than integer math (by now at least), is there really any reason to use them (except for astrology) in games?

Ya I agree with you. This is kind of a nasty situation to handle. One way to go about bugs involving floating point numbers is to analyze the size of error margin. Like, we already know how fast the character can go per frame and how small or big triangles can get in our scene. From that we can estimate how many bits of a floating point number are actually being used in our collision routine. Then we can estimate the maximum possible margin of error and use that to calculate the amount of bias that would cancel out that error, even in the worst case. Despite floating point number's unstable behavior, we can still somehow clamp the error and ensure the result is robust in most cases.

And ya I see what you are saying. Floating numbers are not robust for this type of calculation. Custom number types are definitely a viable way to make this kind of code numerically robust. I'm going with floating point numbers because they seem to require the least amount of work in hindsight, but I might be wrong.

Decent first attemp.

Chen96

Like, we already know how fast the character can go per frame and how small or big triangles can get in our scene.

It's not anywhere that simple I feel.

It's very easy to screw up.

The precision is non-uniform for everything in the game world, that's not helpful. And surely you should do CD&R using relative coordinates, but there's many avenues to screw up regardless.

Then you put variable time-step into the equation and multiple-planes squeezing the object every-which way. And it's very easy to end up on the wrong side of the plane.

The small values can bite you as hard as the big ones. I find it very hard to have any sort of confidence for the floating point code I write.

I have to keep so much stuff in my brain to write any sort of *reliable* floating point code.

And the more I know about FP, the more I don't want to use any FP in any code of consequence.

If I wrote my CD&R in fixed-point, I would have saved so much time and mental bandwidth. And just flat out eliminated most kinds of rare-hard-to-replicate bugs.

pragmatic_heroChen96

Like, we already know how fast the character can go per frame and how small or big triangles can get in our scene.

It's not anywhere that simple I feel.

It's very easy to screw up.

The precision is non-uniform for everything in the game world, that's not helpful. And surely you should do CD&R using relative coordinates, but there's many avenues to screw up regardless.

Then you put variable time-step into the equation and multiple-planes squeezing the object every-which way. And it's very easy to end up on the wrong side of the plane.

The small values can bite you as hard as the big ones. I find it very hard to have any sort of confidence for the floating point code I write.

I have to keep so much stuff in my brain to write any sort of *reliable* floating point code.

And the more I know about FP, the more I don't want to use any FP in any code of consequence.

If I wrote my CD&R in fixed-point, I would have saved so much time and mental bandwidth. And just flat out eliminated most kinds of rare-hard-to-replicate bugs.

I was totally wrong about it being that simple. You have convinced me fixed-point is the way to go for CD&R. A rewrite of this code that uses fixed point arithmetic is in order now. Thanks for your input.

Chen96

I was totally wrong about it being that simple. You have convinced me fixed-point is the way to go for CD&R. A rewrite of this code that uses fixed point arithmetic is in order now. Thanks for your input.

I certainly don't want you to waste your effort.

Perhaps your code is a-okay. And your algorithm is robust and handles all the edge cases.

My CD&R is in floating point, and while I haven't rewritten it in fixed-point yet. I did spend alot of time with it, fixing rare getting stuck bugs (more so than I would have if I wrote it in fixed point straight up). I'm very tempted to re-do it in fixed-point, as that would give me a fair bit of confidence and reduce the complexity significantly. Complexity in terms of number-of-things which can go wrong.