OpenGL Matrices (done by hand)
Last time I mentioned that since OpenGL version 3, someone thought it was
funny to deprecate the OpenGL matrix math functions. Professionals were
probably already doing the math by themselves, and if you reason that the
OpenGL library should just do the job of talking to the graphics hardware,
then indeed there is no place for math functions in a graphics library.
And so it came to be that functions like glTranslate() and glRotate()
were removed from the OpenGL core profile, leaving hobbyists stumped and in
the dark.
Luckily, there is an excellent C++ library now, glm, specifically for
handling OpenGL matrices. I like rolling my own, however, and I’d like to show
you how.
OpenGL 3D matrices are four by four elements, typically of type GLfloat.
GLSL has a mat4 type for this, so in the vertex shader code you can write:
layout(location = 0) in vec4 position;  // "input" is a reserved word in GLSL
uniform mat4 projection;
uniform mat4 view;
uniform mat4 model;
void main() {
    gl_Position = projection * view * model * position;
}
We would like to have this mat4 type in C++ as well, so let’s go and
write it.
What is the Matrix?
In OpenGL, a 4x4 matrix is a mathematical representation of a position, scale and orientation in 3D space. All in one. Position and scale are clear (I hope). Think of orientation as an airplane that has pitch, roll, and yaw. In 3D graphics we call that rotation about an axis.
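On paper a matrix is just a grid of numbers; the difference shows up when you store it in memory. Row major format stores the elements row by row, which is how a two-dimensional C array is laid out. OpenGL, however, conventionally uses column major format, which stores them column by column. [Direct3D conventionally uses row major format.] It’s confusing as hell if you switch between formats. When programming with OpenGL, it seems wise to stick to column major format.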
Row major              Column major
m0   m1   m2   m3      m0   m4   m8   m12
m4   m5   m6   m7      m1   m5   m9   m13
m8   m9   m10  m11     m2   m6   m10  m14
m12  m13  m14  m15     m3   m7   m11  m15
Note the memory layout: the number after the m is the position in a flat C array of 16 floats. In row major format that array is read row by row. In column major format, however, we have an array of four columns, each holding four floats. From this we can build our C++ classes. Constructing a matrix class becomes easier if you first define a decent vector class:
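To make the two layouts concrete, here is a minimal sketch of how a (row, column) pair maps to an index in the flat array of 16 floats. The helper names index_row_major and index_column_major are made up for this illustration; they are not part of OpenGL or glm.

```cpp
#include <cstddef>

// Hypothetical helpers, for illustration only.

// Row major: the four elements of one row sit next to each other in memory.
constexpr std::size_t index_row_major(std::size_t row, std::size_t col) {
    return row * 4 + col;
}

// Column major (the OpenGL convention): the four elements of one column
// sit next to each other in memory.
constexpr std::size_t index_column_major(std::size_t row, std::size_t col) {
    return col * 4 + row;
}
```

The element in the top right corner (row 0, column 3) is m3 in row major format and m12 in column major format, which matches the table above.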
struct vec4 {
    union {
        struct {
            GLfloat x, y, z, w;
        };
        GLfloat v[4];
    };
    GLfloat& operator[](int idx) {
        return v[idx];
    }
};
Wait, what is this thing with an anonymous union / struct? It is a
programming trick so that we can write both v.x and v[0]. These notations
refer to the same thing: the first element of the vector.
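One caveat: anonymous structs are not part of standard C++, although the mainstream compilers accept them as an extension. If that bothers you, a portable variant could look like the sketch below. It swaps the union for a single array with named accessor functions, so you write v.x() instead of v.x. The typedef for GLfloat is an assumption that stands in for the OpenGL headers.

```cpp
typedef float GLfloat;  // assumption: stand-in for the typedef from the OpenGL headers

// Portable variant: one array, with named accessors instead of an
// anonymous struct inside a union.
struct vec4 {
    GLfloat v[4];

    GLfloat& x() { return v[0]; }
    GLfloat& y() { return v[1]; }
    GLfloat& z() { return v[2]; }
    GLfloat& w() { return v[3]; }

    GLfloat& operator[](int idx) { return v[idx]; }
    const GLfloat& operator[](int idx) const { return v[idx]; }
};
```

The rest of this article sticks with the union version, since v.x reads more like GLSL.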
The 4x4 matrix class is an array of four columns:
struct mat4 {
    union {
        vec4 c[4];
        GLfloat m[16];
    };
    GLfloat& operator[](int idx) {
        return m[idx];
    }
};
You can make things more interesting by turning struct vec4 into a template
(here called tvec4) and using typedefs, like so:
typedef struct tvec4<GLfloat> vec4;
typedef struct tvec4<GLdouble> dvec4;
typedef struct tvec4<GLint> ivec4;
typedef struct tvec4<GLuint> uvec4;
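The template itself is not shown above, so here is a minimal sketch of what tvec4 might look like. It uses a plain array instead of the union trick to stay short, and plain C++ types stand in for GLfloat, GLdouble, GLint, and GLuint so that it compiles without the OpenGL headers. Both are assumptions made for this example.

```cpp
// Sketch of a templated four-element vector; T is the element type.
template <typename T>
struct tvec4 {
    T v[4];

    T& operator[](int idx) { return v[idx]; }
    const T& operator[](int idx) const { return v[idx]; }
};

// Assumption: float, double, int and unsigned int stand in for
// GLfloat, GLdouble, GLint and GLuint.
typedef tvec4<float> vec4;
typedef tvec4<double> dvec4;
typedef tvec4<int> ivec4;
typedef tvec4<unsigned int> uvec4;
```

These names mirror the GLSL types vec4, dvec4, ivec4, and uvec4.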
Matrix multiplication
The most important operation for matrices in 3D graphics is multiplication. Moving, rotating, and scaling objects are all done by means of matrix multiplication. Matrix multiply works the same in C++ as on paper, but do mind that we are working with column major matrices. Also note that matrix multiplication is not commutative, meaning that A x B != B x A. In other words, the order of operations is important.
Implementing matrix multiply gets easier if you first write vector multiplication. A vector can be scaled by multiplying by a scalar:
vec4& operator*=(GLfloat f) {
    v[0] *= f;
    v[1] *= f;
    v[2] *= f;
    v[3] *= f;
    return *this;
}
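For matrix multiply we can now work with entire columns. The code assumes that vec4 also has a binary operator* (vector times scalar) and an operator+ (vector plus vector), which follow the same pattern as operator*= above. We use a temporary variable so that the original values aren’t overwritten while the operation is still in progress: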
mat4& operator*=(const mat4& o) {
    vec4 res[4];
    res[0] = c[0] * o.c[0].x + c[1] * o.c[0].y + c[2] * o.c[0].z + c[3] * o.c[0].w;
    res[1] = c[0] * o.c[1].x + c[1] * o.c[1].y + c[2] * o.c[1].z + c[3] * o.c[1].w;
    res[2] = c[0] * o.c[2].x + c[1] * o.c[2].y + c[2] * o.c[2].z + c[3] * o.c[2].w;
    res[3] = c[0] * o.c[3].x + c[1] * o.c[3].y + c[2] * o.c[3].z + c[3] * o.c[3].w;
    c[0] = res[0];
    c[1] = res[1];
    c[2] = res[2];
    c[3] = res[3];
    return *this;
}
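The fragments above leave out the two binary vector operators that the multiply relies on. The sketch below puts the pieces together in one self-contained unit. To keep it short it uses a plain array in vec4 rather than the union, and a loop instead of four written-out lines. The identity() helper and the GLfloat typedef are assumptions added for this example.

```cpp
typedef float GLfloat;  // assumption: stand-in for the OpenGL typedef

struct vec4 {
    GLfloat v[4];
};

// Vector times scalar, returning a new vector.
inline vec4 operator*(const vec4& a, GLfloat f) {
    return {{a.v[0] * f, a.v[1] * f, a.v[2] * f, a.v[3] * f}};
}

// Component-wise vector addition.
inline vec4 operator+(const vec4& a, const vec4& b) {
    return {{a.v[0] + b.v[0], a.v[1] + b.v[1], a.v[2] + b.v[2], a.v[3] + b.v[3]}};
}

struct mat4 {
    vec4 c[4];  // four columns

    mat4& operator*=(const mat4& o) {
        vec4 res[4];
        for (int j = 0; j < 4; ++j) {
            // Column j of the product is a combination of our own columns,
            // weighted by the four entries of column j of o.
            res[j] = c[0] * o.c[j].v[0] + c[1] * o.c[j].v[1]
                   + c[2] * o.c[j].v[2] + c[3] * o.c[j].v[3];
        }
        for (int j = 0; j < 4; ++j) {
            c[j] = res[j];
        }
        return *this;
    }
};

// Hypothetical helper: the identity matrix, column by column.
inline mat4 identity() {
    return {{
        {{1.0f, 0.0f, 0.0f, 0.0f}},
        {{0.0f, 1.0f, 0.0f, 0.0f}},
        {{0.0f, 0.0f, 1.0f, 0.0f}},
        {{0.0f, 0.0f, 0.0f, 1.0f}}
    }};
}
```

As a quick sanity check, multiplying two matrices that differ from the identity only in their last column adds up the first three entries of those columns.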
Manipulating objects in 3D space now becomes a game of matrix multiplication. For completeness, I should mention that I use a right-handed coordinate system, which is the OpenGL default. Again, the given matrices are in column major format.
Translation
By translation we mean displacing an object by an offset, or in plain English, moving an object. When moving an object in a certain direction, it moves along a vector. This vector is itself represented by a matrix that only has a position, and no special orientation. Multiplying these matrices together results in a matrix that represents the moved object. The translation matrix is defined as:
1 0 0 x
0 1 0 y
0 0 1 z
0 0 0 1
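In column major storage the offset ends up in the last four floats of the array, which is easy to get wrong when typing the matrix into code. A minimal sketch, assuming a bare-bones mat4 that holds just the flat array; the function name translation is made up for this example:

```cpp
typedef float GLfloat;  // assumption: stand-in for the OpenGL typedef

struct mat4 {
    GLfloat m[16];  // column major: element (row, col) lives at m[col * 4 + row]
};

// Hypothetical helper: builds the translation matrix shown above.
// Each line of the initializer is one column, not one row.
inline mat4 translation(GLfloat x, GLfloat y, GLfloat z) {
    return {{
        1.0f, 0.0f, 0.0f, 0.0f,  // column 0
        0.0f, 1.0f, 0.0f, 0.0f,  // column 1
        0.0f, 0.0f, 1.0f, 0.0f,  // column 2
        x,    y,    z,    1.0f   // column 3 holds the offset
    }};
}
```

Written out in code, the matrix looks transposed compared to the notation on paper. That is the column major layout at work.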
Scaling
Scaling changes neither the position nor the orientation of the object, but it does change the length of the vectors representing the object. Multiply by the scaling matrix:
x 0 0 0
0 y 0 0
0 0 z 0
0 0 0 1
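To see the scaling matrix do its work, the sketch below builds it and applies it to a point. The names scaling and transform are made up for this example, and the structs are stripped down to bare arrays.

```cpp
typedef float GLfloat;  // assumption: stand-in for the OpenGL typedef

struct vec4 { GLfloat v[4]; };
struct mat4 { GLfloat m[16]; };  // column major

// Hypothetical helper: builds the scaling matrix shown above.
inline mat4 scaling(GLfloat x, GLfloat y, GLfloat z) {
    return {{
        x,    0.0f, 0.0f, 0.0f,
        0.0f, y,    0.0f, 0.0f,
        0.0f, 0.0f, z,    0.0f,
        0.0f, 0.0f, 0.0f, 1.0f
    }};
}

// Matrix times column vector:
// result[row] is the sum over col of m[col * 4 + row] * p[col].
inline vec4 transform(const mat4& a, const vec4& p) {
    vec4 r = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            r.v[row] += a.m[col * 4 + row] * p.v[col];
        }
    }
    return r;
}
```

Transforming the point (1, 1, 1, 1) with scaling(2, 3, 4) gives (2, 3, 4, 1); the w component is left alone.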
Rotation
Rotation changes the orientation of the object. We will multiply by a matrix that is centered at the origin, but has a different orientation. There are optimized cases for rotating around a single X, Y, or Z-axis, or you can rotate around an arbitrary axis by using the uber rotation matrix. Beware that the old glRotate() took its angle in degrees, while the C math functions use radians.
radians = degrees * M_PI / 180.0f;
Rotation matrix for rotating about the X-axis:
1   0        0        0
0   cos(a)   -sin(a)  0
0   sin(a)   cos(a)   0
0   0        0        1
Rotation matrix for rotating about the Y-axis:
cos(a)   0   sin(a)   0
0        1   0        0
-sin(a)  0   cos(a)   0
0        0   0        1
Rotation matrix for rotating about the Z-axis:
cos(a)   -sin(a)  0   0
sin(a)   cos(a)   0   0
0        0        1   0
0        0        0   1
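As an example, here is a sketch of the Z-axis case in column major storage, including the degrees to radians conversion. The names to_radians and rotation_z are made up for this example. The value of pi is spelled out because M_PI is not guaranteed by standard C++.

```cpp
#include <cmath>

typedef float GLfloat;  // assumption: stand-in for the OpenGL typedef

struct mat4 { GLfloat m[16]; };  // column major

// Hypothetical helper: degrees in, radians out.
inline GLfloat to_radians(GLfloat degrees) {
    return degrees * 3.14159265358979323846f / 180.0f;
}

// Hypothetical helper: rotation about the Z-axis, angle given in degrees
// like the old glRotate().
inline mat4 rotation_z(GLfloat degrees) {
    const GLfloat a = to_radians(degrees);
    const GLfloat c = std::cos(a);
    const GLfloat s = std::sin(a);
    return {{
        c,    s,    0.0f, 0.0f,  // column 0: where the X-axis ends up
        -s,   c,    0.0f, 0.0f,  // column 1: where the Y-axis ends up
        0.0f, 0.0f, 1.0f, 0.0f,  // column 2: the Z-axis stays put
        0.0f, 0.0f, 0.0f, 1.0f   // column 3: no translation
    }};
}
```

A handy way to check a rotation matrix is to look at its columns: after rotating 90 degrees about Z, the first column should be very close to (0, 1, 0), because the X-axis has turned into the Y-axis.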
Rotate around an arbitrary, normalized vector (x, y, z). Remember that a normalized vector has length one; divide the vector by its length.
c = cos(a)
c_1 = 1 - cos(a)
s = sin(a)
x * x * c_1 + c       y * x * c_1 - z * s   z * x * c_1 + y * s   0
x * y * c_1 + z * s   y * y * c_1 + c       z * y * c_1 - x * s   0
x * z * c_1 - y * s   y * z * c_1 + x * s   z * z * c_1 + c       0
0                     0                     0                     1
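Typing this matrix into code is where the column major layout bites hardest, so here is a sketch that fills the array column by column. The function name rotation is made up for this example; it takes the angle in radians and normalizes the axis itself, assuming the axis does not have length zero.

```cpp
#include <cmath>

typedef float GLfloat;  // assumption: stand-in for the OpenGL typedef

struct mat4 { GLfloat m[16]; };  // column major

// Hypothetical helper: rotation by angle a (radians) about the axis (x, y, z).
// Assumption: the axis has non-zero length.
inline mat4 rotation(GLfloat a, GLfloat x, GLfloat y, GLfloat z) {
    // Normalize the axis: divide the vector by its length.
    const GLfloat len = std::sqrt(x * x + y * y + z * z);
    x /= len;
    y /= len;
    z /= len;

    const GLfloat c = std::cos(a);
    const GLfloat s = std::sin(a);
    const GLfloat c_1 = 1.0f - c;

    return {{
        // column 0 (the first column of the matrix above, read top to bottom)
        x * x * c_1 + c,     x * y * c_1 + z * s, x * z * c_1 - y * s, 0.0f,
        // column 1
        y * x * c_1 - z * s, y * y * c_1 + c,     y * z * c_1 + x * s, 0.0f,
        // column 2
        z * x * c_1 + y * s, z * y * c_1 - x * s, z * z * c_1 + c,     0.0f,
        // column 3
        0.0f,                0.0f,                0.0f,                1.0f
    }};
}
```

With the axis set to (0, 0, 1) the result reduces to the Z-axis rotation matrix given earlier, which makes for a simple test.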
Float Precision
An important remark to make at this point is that GLfloat is typically
a single precision floating point number. Multiplying the same matrix in-place
time and again gives floating point drift due to accumulated rounding errors.
The rotation part of the matrix slowly stops being a pure rotation, and objects
start to skew or change size: rotations gone mad. The best solution
is to avoid it; simply do not reuse the result of a multiplication as
a new input for the next iteration of the game loop. Keep the position and
angles as state instead, and rebuild the matrix from them every frame.
This drift is often confused with gimbal lock, which is a different problem:
when a rotation is composed from three separate axis angles, two of the axes
can line up, and a degree of freedom is lost.
You may have heard of quaternions as a solution to gimbal lock. Quaternions
are a different mathematical way of representing 3D rotations. While
quaternions do help prevent gimbal lock, they still work with the same single
precision floats, and will therefore still drift when used inappropriately.
Since OpenGL natively works with matrices, it makes sense to use matrices
in favour of quaternions, at least for the basic stuff.
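One way to follow the advice about not feeding a matrix back into itself is sketched below: the object stores an angle, and the matrix is built fresh from that angle whenever it is needed. The spinner struct and its member names are hypothetical, and the rotation is limited to the Z-axis to keep the example short.

```cpp
#include <cmath>

typedef float GLfloat;  // assumption: stand-in for the OpenGL typedef

struct mat4 { GLfloat m[16]; };  // column major

const GLfloat kTwoPi = 6.28318530717958647692f;

// Rotation about the Z-axis, angle in radians.
inline mat4 rotation_z(GLfloat a) {
    const GLfloat c = std::cos(a);
    const GLfloat s = std::sin(a);
    return {{
        c,    s,    0.0f, 0.0f,
        -s,   c,    0.0f, 0.0f,
        0.0f, 0.0f, 1.0f, 0.0f,
        0.0f, 0.0f, 0.0f, 1.0f
    }};
}

// Hypothetical game object: the state is the angle, never the matrix.
struct spinner {
    GLfloat angle = 0.0f;  // radians

    // Called once per iteration of the game loop.
    void update(GLfloat delta) {
        // Wrap the angle so it stays small and keeps its precision.
        angle = std::fmod(angle + delta, kTwoPi);
    }

    // The model matrix is rebuilt from scratch, so rounding errors
    // from earlier frames cannot pile up inside it.
    mat4 model() const {
        return rotation_z(angle);
    }
};
```

The angle itself still picks up small rounding errors, but those only make the object turn slightly too far or not far enough. The matrix always stays a clean rotation.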
Closing words
Phew! That was a fair bit of matrix math. This stuff is highly confusing so
let me apologize for any bugs beforehand, even though I do use it exactly
like this in actual running code. We still haven’t touched upon constructing
the projection matrix, be it orthographic or perspective projection. Unless you
really like math, it can get quite tedious. So let me stop here by saying
that if you can do the math, good for you! Otherwise, just grab the glm
library and be on your way.