Quaternion Rotations
Back in February, I wrote about how you use matrices to do rotations in 3D space for use with OpenGL computer graphics. I showed how a 4x4 matrix represents the position, orientation, and scale of an object in 3D space. By multiplying matrices together, you can move, rotate, and scale objects. Near the end of the post I mentioned quaternions, an alternative way of doing rotations.
Like matrices, quaternions are a mathematical means to do rotations. The benefits of quaternions are that they are much smaller than matrices, require less computation, in turn causing less floating point drift. They prevent gimbal lock. They are great for combining rotations (e.g, skeletal animations) and they can do smooth spherical linear interpolation (SLERP). For example, you can have a camera smoothly follow the main character in a 3rd person view.
If that doesn’t sound a bit magical, wait till you get to the next part.
What exactly is a quaternion? A quaternion is a vector plus an angle in complex space. It represents an orientation, and therefore we can use it to perform rotations. It is in complex space, which is non-intuitive to human beings, so it can be difficult to grasp. An analogy to help explain it: suppose we have a 2D object (e.g, a drawing of a puppet) in a 2D coordinate system. Now let’s rotate it anti-clockwise by 45 degrees. When we rotate the 2D object in its plane, what we really are doing is rotating it about its local Z-axis, that is pointing out of the plane. Note however, this “Z-axis” was never a part of the coordinate system to begin with (!). We need an extra imaginary dimension to make it happen.
Crazy as it may seem, we can extend this example to 3D objects, and use an
imaginary 4th dimension to rotate them. What does it mean, to use the 4th
dimension—does the object travel back in time? No, you’ve been reading too
much sci-fi. We need an extra dimension to describe the performed rotation.
By introducing an imaginary dimension we’ve just wandered into the realm of
complex numbers, where i^2 = -1
.
Before running away screaming, I’m just going to say “shut up and calculate”. The math works.
The imaginary axis for the vector are given by sqrt(-1)
, and as you know,
computers have huge problems with negative square roots. Lucky for us,
we can safely ignore the imaginary parts and only deal with the real parts
of the complex numbers. A quaternion is thus given by only four values:
(x, y, z, w)
. [Some prefer the notation (w, x, y, z)
]. We will see how
to find these values in a moment. First we define the data structure:
struct quat {
union {
struct {
float x, y, z, w;
};
float q[4];
};
float& operator[](int idx) {
return q[idx];
}
const float& operator[](int idx) const {
return q[idx];
}
};
Just as with matrices, we use a programming trick with an anonymous union
so that we may write both a.x
and a[0]
. You can make things more
interesting by templating struct quat
and using typedefs, like so:
typedef struct tquat<float> quat;
typedef struct tquat<double> dquat;
We can initialize a quaternion from a 3D vector and an angle. The vector must be in the unit sphere, so it must be normalized. The quaternion’s values are a projection of the supplied vector. So, another way to think of a quaternion is to imagine a 3D shadow of 4D axis. Right … anyway, this adds to the fact that quaternions are difficult to interpret; just don’t bother trying.
v.normalize()
half_angle = angle * 0.5
radians = half_angle * PI / 180.0
s = sin(radians)
x = v.x * s
y = v.y * s
z = v.z * s
w = cos(radians)
The quaternion works with a half angle. The reason for this is higher math, so I’m just going to say that’s the way it is. If you wish to know more and are not afraid of math, search math.stackexchange for “quaternion half angle”. The angle is given in degrees (like it usually is in OpenGL), but in programming the trigonometry functions use radians. Therefore, convert to radians.
Now, how to rotate? The quaternion already represents a rotation, so chances are, you can stop here. But suppose you have modeled a rotated turret of a tank, and now the tank itself rotates as it drives around, then you need a combined rotation to find the correct orientation of the turret with respect to the world. Combining rotations is done by multiplying them together. As with matrices, quaternion multiplication is non-commutative: A × B != B × A.
cx = ax * bw + ay * bz - az * by + aw * bx
cy = -ax * bz + ay * bw + az * bx + aw * by
cz = ax * by - ay * bx + az * bw + aw * bz
cw = -ax * bx - ay * by - az * bz + aw * bw
Now that we know how to calculate rotations using quaternions, there is only one problem left. How to tell OpenGL what to render? The OpenGL shaders work with matrices, they know nothing of quaternions. So we must convert the quaternion back to a 4x4 matrix. That matrix is given by (in column major format):
xx = x * x
xy = x * y
xz = x * z, etc.
1 - 2 * (yy + zz) 2 * (xy - zw) 2 * (xz + yw) 0
2 * (xy + zw) 1 - 2 * (xx + zz) 2 * (yz - xw) 0
2 * (xz - yw) 2 * (yz + xw) 1 - 2 * (xx + yy) 0
0 0 0 1
Do mind that the numbers are floating point values, not integers.
Converting the quaternion back to a 4x4 matrix takes some computation, so there is a certain cost involved. I never said that you should do all rotations with quaternions, just that they are especially suited for certain situations.
It is also possible to extract the individual Euler angles (pitch, yaw, roll) but I won’t show it here. One problem with Euler angles is that the inverse sin/cos/tan functions only return angles smaller than 90 degrees. This follows from trigonometry, but it’s rather inconvenient that when you initially rotate by 120 degrees, you don’t get out the value 120 again.
As you may suspect, it’s also possible to construct a quaternion starting from a matrix. Doing so is a bit of an involved process. In general it is easier to keep Euler angles and create quaternions from that.
I have to say I struggled for a bit to get it to work correctly. Then
all of a sudden the penny dropped, you must use a unit vector to
initialize, and then it “just worked”. I actually found some other code on
the net that was more optimized, but it wouldn’t work for me, so … maybe
he had some other weird optimal case. The things presented here have been
tried and tested. On top of that I have double checked with the glm
implementation, which is comparable. Finally, I would like to state that
glm
is rather excellent, and there is no real reason to not use it.
Still, it’s a good exercise to write your own, and know where all those
bits are coming from.