Sunday, February 19, 2017

OpenGL Matrices (done by hand)

Last time I mentioned that since OpenGL version 3, someone thought it was funny to deprecate the OpenGL matrix math functions. Professionals were probably already doing the math by themselves, and if you reason that the OpenGL library should just do the job of talking to the graphics hardware, then indeed there is no place for math functions in a graphics library. And so it came to be that functions like glTranslate() and glRotate() were removed from the OpenGL core profile, leaving hobbyists stumped and in the dark. Luckily, there is now an excellent C++ library called glm, specifically for handling OpenGL matrices. I like rolling my own however, and I’d like to show you how.

OpenGL 3D matrices are four by four elements, typically of type GLfloat. GLSL has a mat4 type for this, so in the vertex shader code you can write:

#version 330 core

layout(location = 0) in vec4 position;  // "input" is a reserved word in GLSL

uniform mat4 projection;
uniform mat4 view;
uniform mat4 model;

void main() {
    gl_Position = projection * view * model * position;
}

This mat4 type we would like to have in C++ as well, so let’s go and write it.

What is the Matrix?

In OpenGL, a 4x4 matrix is a mathematical representation of a position, scale and orientation in 3D space. All in one. Position and scale are clear (I hope). Think of orientation as an airplane that has pitch, roll, and yaw. In 3D graphics we call that rotation about an axis.

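On paper and in C arrays, matrices are typically laid out row by row: row major format. OpenGL however uses column major format, where the elements of a column sit next to each other in memory. [Direct3D code traditionally uses row major format]. It’s confusing as hell if you switch between formats. When programming with OpenGL, it seems wise to stick to column major format. The tables below show which array index ends up at which position in the matrix: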

Row major                    Column major
m0  m1  m2  m3               m0  m4  m8  m12
m4  m5  m6  m7               m1  m5  m9  m13
m8  m9  m10 m11              m2  m6  m10 m14
m12 m13 m14 m15              m3  m7  m11 m15
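
To make the two layouts concrete, here is a small sketch that computes the array index of the element at a given row and column. The function names are my own invention for illustration; they are not part of OpenGL.

```cpp
#include <cstddef>

// Index of element (row, col) in a flat array of 16 floats.
// Column major: the four elements of one column are adjacent in memory.
constexpr std::size_t index_column_major(std::size_t row, std::size_t col) {
    return col * 4 + row;
}

// Row major: the four elements of one row are adjacent in memory.
constexpr std::size_t index_row_major(std::size_t row, std::size_t col) {
    return row * 4 + col;
}
```

The top right element (row 0, column 3) is m3 in row major format, but m12 in column major format, just like in the tables above.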

Note the memory layout; in both formats the matrix is just a straight C array of 16 floats. In row major format every four consecutive floats form a row; in column major format they form a column, so we really have an array of four columns. From this we can build our C++ classes. Constructing a matrix class becomes easier if you first define a decent vector class:

struct vec4 {
    union {
        struct {
            GLfloat x, y, z, w;
        };

        GLfloat v[4];
    };

    GLfloat& operator[](int idx) {
        return v[idx];
    }
};

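Wait, what is this thing with an anonymous union / struct? It is a programming trick so that we can write both v.x and v[0]. These notations refer to the same thing; the first element of the vector. Strictly speaking, anonymous structs are a compiler extension in C++ rather than part of the standard, but GCC, Clang, and Visual C++ all accept them.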

The 4x4 matrix class is an array of four columns:

struct mat4 {
    union {
        vec4 c[4];

        GLfloat m[16];
    };

    GLfloat& operator[](int idx) {
        return m[idx];
    }
};
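
As a quick illustration of how the two views of the matrix line up, here is a sketch of a helper that builds the identity matrix through the flat array. The name identity() is hypothetical, GLfloat is replaced by a stand-in typedef so the snippet compiles without the OpenGL headers, and the union trick relies on the compiler extension mentioned earlier.

```cpp
typedef float GLfloat;  // stand-in so the sketch compiles without OpenGL headers

struct vec4 {
    union {
        struct { GLfloat x, y, z, w; };
        GLfloat v[4];
    };
};

struct mat4 {
    union {
        vec4 c[4];      // four columns
        GLfloat m[16];  // the same memory as a flat array
    };
};

// Hypothetical helper: returns the 4x4 identity matrix.
inline mat4 identity() {
    mat4 r;
    for (int i = 0; i < 16; ++i) {
        // Indices 0, 5, 10 and 15 are the diagonal in either layout.
        r.m[i] = (i % 5 == 0) ? 1.0f : 0.0f;
    }
    return r;
}
```

With this layout, m[12] and c[3].x name the same float: the first element of the fourth column.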

You can make things more interesting by templating struct vec4 and using typedefs, like so:

typedef tvec4<GLfloat>  vec4;
typedef tvec4<GLdouble> dvec4;
typedef tvec4<GLint>    ivec4;
typedef tvec4<GLuint>   uvec4;
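
These typedefs assume a class template named tvec4, which isn't shown here. A minimal sketch of what it might look like, following the same union trick as the plain vec4 (the const overload of operator[] is my own addition):

```cpp
// Sketch of a templated 4-component vector; T is the element type.
template <typename T>
struct tvec4 {
    union {
        struct { T x, y, z, w; };  // anonymous struct: a compiler extension
        T v[4];
    };

    T& operator[](int idx) {
        return v[idx];
    }

    // Read-only access, so that const vectors can be indexed too.
    const T& operator[](int idx) const {
        return v[idx];
    }
};
```

The typedefs then simply pick the element type, for example tvec4<GLfloat> for the regular vec4.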

Matrix multiplication

The most important operation for matrices in 3D graphics is multiplication. Moving, rotating, and scaling objects are all done by means of matrix multiplication. Matrix multiply works the same in C++ as on paper, but do mind that we are working with column major matrices. Also note that matrix multiplication is not commutative, meaning that A x B != B x A. In other words, the order of operations is important. Implementing matrix multiply gets easier if you first write vector multiplication. A vector can be scaled by multiplying by a scalar:

vec4& operator*=(GLfloat f) {
    v[0] *= f;
    v[1] *= f;
    v[2] *= f;
    v[3] *= f;
    return *this;
}

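For matrix multiply we can now work with entire columns: each column of the result is a sum of our own columns, weighted by the elements of the matching column of the other matrix. This assumes vec4 also has the binary operators * (by a scalar) and + (vector addition). We use a temporary variable so that the original values aren’t overwritten while the operation is still in progress: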

mat4& operator*=(const mat4& o) {
    vec4 res[4];

    res[0] = c[0] * o.c[0].x + c[1] * o.c[0].y + c[2] * o.c[0].z + c[3] * o.c[0].w;
    res[1] = c[0] * o.c[1].x + c[1] * o.c[1].y + c[2] * o.c[1].z + c[3] * o.c[1].w;
    res[2] = c[0] * o.c[2].x + c[1] * o.c[2].y + c[2] * o.c[2].z + c[3] * o.c[2].w;
    res[3] = c[0] * o.c[3].x + c[1] * o.c[3].y + c[2] * o.c[3].z + c[3] * o.c[3].w;

    c[0] = res[0];
    c[1] = res[1];
    c[2] = res[2];
    c[3] = res[3];
    return *this;
}
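
The code above leans on vec4 * GLfloat and vec4 + vec4. Here is a minimal, self-contained sketch of the same multiplication, written as a free operator* with simplified structs (no unions) and a stand-in typedef for GLfloat. Treat the details as assumptions, not as the one true way.

```cpp
typedef float GLfloat;  // stand-in so the sketch compiles without OpenGL headers

struct vec4 { GLfloat x, y, z, w; };

// Scale a vector by a scalar, returning a new vector.
inline vec4 operator*(const vec4& v, GLfloat f) {
    return {v.x * f, v.y * f, v.z * f, v.w * f};
}

// Component-wise vector addition.
inline vec4 operator+(const vec4& a, const vec4& b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}

struct mat4 { vec4 c[4]; };  // four columns

// Column j of (a * b) is a sum of the columns of a,
// weighted by the elements of column j of b.
inline mat4 operator*(const mat4& a, const mat4& b) {
    mat4 r;
    for (int j = 0; j < 4; ++j) {
        r.c[j] = a.c[0] * b.c[j].x + a.c[1] * b.c[j].y
               + a.c[2] * b.c[j].z + a.c[3] * b.c[j].w;
    }
    return r;
}
```

Multiplying a uniform scale of 2 by a move of one unit along X gives a different result depending on the order: in one case the offset is scaled along with everything else, in the other it isn't.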

Manipulating objects in 3D space now becomes a game of matrix multiplication. For completeness, I should mention that I use a right-handed coordinate system, which is the OpenGL default. Again, the given matrices are in column major format.

Translation

By translation we mean displacing an object by an offset, or in plain English, moving an object. When moving an object in a certain direction, it moves along a vector. This vector is itself represented by a matrix that only has a position, and no special orientation. Multiplying these matrices together results in a matrix that represents the moved object. The translation matrix for an offset (x, y, z) is defined as:

1  0  0  x
0  1  0  y
0  0  1  z
0  0  0  1
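
As a sketch in code: a translation matrix is the identity with the offset placed in the fourth column (elements m12, m13, m14 in column major format). The helper names translation() and transform() are hypothetical, and the structs are simplified stand-ins without unions.

```cpp
typedef float GLfloat;  // stand-in so the sketch compiles without OpenGL headers

struct vec4 { GLfloat x, y, z, w; };
struct mat4 { vec4 c[4]; };  // four columns

// Hypothetical helper: matrix that moves things by (tx, ty, tz).
inline mat4 translation(GLfloat tx, GLfloat ty, GLfloat tz) {
    mat4 r = {{
        {1.0f, 0.0f, 0.0f, 0.0f},  // column 0
        {0.0f, 1.0f, 0.0f, 0.0f},  // column 1
        {0.0f, 0.0f, 1.0f, 0.0f},  // column 2
        {tx,   ty,   tz,   1.0f}   // column 3 holds the offset
    }};
    return r;
}

// Apply a matrix to a column vector: m * v.
inline vec4 transform(const mat4& m, const vec4& v) {
    return {
        m.c[0].x * v.x + m.c[1].x * v.y + m.c[2].x * v.z + m.c[3].x * v.w,
        m.c[0].y * v.x + m.c[1].y * v.y + m.c[2].y * v.z + m.c[3].y * v.w,
        m.c[0].z * v.x + m.c[1].z * v.y + m.c[2].z * v.z + m.c[3].z * v.w,
        m.c[0].w * v.x + m.c[1].w * v.y + m.c[2].w * v.z + m.c[3].w * v.w
    };
}
```

A point (w = 1) gets moved by the offset, while a direction (w = 0) is left alone, which is exactly what you want for things like normals.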

Scaling

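Scaling changes neither the position nor the orientation of an object centered at the origin, but it does change the length of the vector representing the object. Multiply by the scaling matrix, with scale factors (x, y, z) along the three axes:

x  0  0  0
0  y  0  0
0  0  z  0
0  0  0  1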
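
In code, a scaling matrix has the scale factors on the diagonal and zeroes everywhere else. A minimal sketch, with a hypothetical helper name scaling() and simplified stand-in structs:

```cpp
typedef float GLfloat;  // stand-in so the sketch compiles without OpenGL headers

struct vec4 { GLfloat x, y, z, w; };
struct mat4 { vec4 c[4]; };  // four columns

// Hypothetical helper: matrix that scales by (sx, sy, sz) along the axes.
inline mat4 scaling(GLfloat sx, GLfloat sy, GLfloat sz) {
    mat4 r = {{
        {sx,   0.0f, 0.0f, 0.0f},  // column 0
        {0.0f, sy,   0.0f, 0.0f},  // column 1
        {0.0f, 0.0f, sz,   0.0f},  // column 2
        {0.0f, 0.0f, 0.0f, 1.0f}   // column 3: no offset
    }};
    return r;
}
```

Because the matrix is diagonal, row major and column major look identical here.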

Rotation

Rotation changes the orientation of the object. We will multiply by a matrix that is centered at the origin, but has a different orientation. There are optimized cases for rotating around a single X, Y, or Z-axis, or you can rotate around an arbitrary axis by using the uber rotation matrix. Beware that the old OpenGL glRotate() function took its angle in degrees, while the C math functions use radians.

radians = degrees * M_PI / 180.0f;

Rotation matrix for rotating about the X-axis:

1  0       0       0
0  cos(a) -sin(a)  0
0  sin(a)  cos(a)  0
0  0       0       1

Rotation matrix for rotating about the Y-axis:

 cos(a)  0  sin(a)  0
 0       1  0       0
-sin(a)  0  cos(a)  0
 0       0  0       1

Rotation matrix for rotating about the Z-axis:

cos(a) -sin(a)  0  0
sin(a)  cos(a)  0  0
0       0       1  0
0       0       0  1

Rotate around an arbitrary, normalized vector (x, y, z). Remember that a normalized vector has length one; divide the vector by its length.

c = cos(a)
c_1 = 1 - cos(a)
s = sin(a)

x * x * c_1 + c       y * x * c_1 - z * s   z * x * c_1 + y * s   0
x * y * c_1 + z * s   y * y * c_1 + c       z * y * c_1 - x * s   0
x * z * c_1 - y * s   y * z * c_1 + x * s   z * z * c_1 + c       0
0                     0                     0                     1
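
The uber rotation matrix could be sketched in code as follows. The helper name rotation() is hypothetical, the structs are simplified stand-ins, the angle is in radians, and the axis is normalized inside the function; passing a zero-length axis is not handled.

```cpp
#include <cmath>

typedef float GLfloat;  // stand-in so the sketch compiles without OpenGL headers

struct vec4 { GLfloat x, y, z, w; };
struct mat4 { vec4 c[4]; };  // four columns

// Hypothetical helper: rotation by angle a (radians) about the axis (x, y, z).
// The axis must not have length zero.
inline mat4 rotation(GLfloat a, GLfloat x, GLfloat y, GLfloat z) {
    // Normalize the axis: divide the vector by its length.
    const GLfloat len = std::sqrt(x * x + y * y + z * z);
    x /= len;
    y /= len;
    z /= len;

    const GLfloat c = std::cos(a);
    const GLfloat c_1 = 1.0f - c;
    const GLfloat s = std::sin(a);

    // Each vec4 below is one column of the matrix as written on paper.
    mat4 r = {{
        {x * x * c_1 + c,     x * y * c_1 + z * s, x * z * c_1 - y * s, 0.0f},
        {y * x * c_1 - z * s, y * y * c_1 + c,     y * z * c_1 + x * s, 0.0f},
        {z * x * c_1 + y * s, z * y * c_1 - x * s, z * z * c_1 + c,     0.0f},
        {0.0f, 0.0f, 0.0f, 1.0f}
    }};
    return r;
}
```

A handy sanity check is to rotate about the axis (0, 0, 1): the result should be the same as the Z-axis rotation matrix given earlier.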

Float Precision

An important remark to make at this point is that GLfloat is typically a single precision floating point number. Multiplying the same matrix in-place time and again accumulates rounding errors: floating point drift. The matrix slowly stops being a clean rotation, and objects start to skew and scale: rotations gone mad. This is not the same thing as gimbal lock, which is the loss of one degree of freedom when a rotation is built from three separate angles and two of the axes line up. The best solution to drift is to avoid it; simply do not reuse the result of a multiplication as a new input for the next iteration of the game loop. You may have heard of quaternions as a solution to gimbal lock. Quaternions are a different mathematical way of representing 3D rotations. While quaternions do help prevent gimbal lock, they still work with the same single precision floats, and will therefore still drift when used inappropriately. Since OpenGL natively works with matrices, it makes sense to use matrices in favour of quaternions, at least for the basic stuff.
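
One way to avoid feeding a matrix back into itself is to keep only the angle as state, and rebuild the matrix from scratch every frame. The sketch below assumes a made-up Spinner object that rotates about the Z-axis; all names are hypothetical and the structs are simplified stand-ins.

```cpp
#include <cmath>

typedef float GLfloat;  // stand-in so the sketch compiles without OpenGL headers

struct vec4 { GLfloat x, y, z, w; };
struct mat4 { vec4 c[4]; };  // four columns

// Build a fresh Z-axis rotation (column major) from an angle in radians.
inline mat4 rotation_z(GLfloat a) {
    const GLfloat c = std::cos(a);
    const GLfloat s = std::sin(a);
    mat4 r = {{
        { c,  s, 0.0f, 0.0f},
        {-s,  c, 0.0f, 0.0f},
        {0.0f, 0.0f, 1.0f, 0.0f},
        {0.0f, 0.0f, 0.0f, 1.0f}
    }};
    return r;
}

// Hypothetical game object: the angle is the state, not the matrix.
struct Spinner {
    GLfloat angle = 0.0f;  // radians
    GLfloat speed = 1.0f;  // radians per second

    // Advance the angle by dt seconds and wrap it to [0, 2*pi).
    void update(GLfloat dt) {
        const GLfloat two_pi = 6.28318530717958647692f;
        angle = std::fmod(angle + speed * dt, two_pi);
    }

    // The model matrix is rebuilt every time, never multiplied in-place.
    mat4 model() const {
        return rotation_z(angle);
    }
};
```

The angle itself still picks up small rounding errors over time, but that only makes the object a hair ahead or behind; the matrix stays a clean rotation because it is never the product of thousands of earlier matrices.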

Closing words

Phew! That was a fair bit of matrix math. This stuff is highly confusing so let me apologize for any bugs beforehand, even though I do use it exactly like this in actual running code. We still haven’t touched upon constructing the projection matrix, be it orthographic or perspective projection. Unless you really like math, it can get quite tedious. So let me stop here by saying that if you can do the math, good for you! Otherwise, just grab the glm library and be on your way.