Saturday, May 31, 2008

Optimizing an OpenGL star map

It’s time for some fun stuff, and write about programming. A while back, I was working on a way cool 3D OpenGL space thing. It had lots of stars, the orbits of all major planets (and their moons!) and you could fly around through this 3D space. The only drawback was that I’m still relatively new to 3D programming and it was *hard* to get things right. So, after a break of a couple of months, I took some of the data and turned it into a 2D star map.

Graphics programming is fun, but even in 2D it is not always easy. As always when programming, there are many ways to accomplish what you want, but you should try and find the most optimal way. Even with great NVIDIA hardware in your machine, your program will be slow if you don’t optimize.

The reason I have to optimize the star map code, is because it features over 300,000 background stars. The stars are plotted as GL_POINTs. The magnitude or brightness of the star is also taken into account.
Another reason is that it features some planets. I like drawing planets with many polygons so they don’t look blocky. Having many polygons makes it go slow. Any modern 3D game has a high polygon count, so how do they do that?

Optimizing the stars

Today’s hardware can output a staggering amount of pixels per second, but dealing with 300K+ stars was just too much for my poor old 6800 GT. (One of the bad things of speedy hardware is, no one tries to write good code any more). Of course, it is amazingly stupid to draw 300,000 stars when you can only see a couple of hundred in the viewport at once. One of the most annoying things of OpenGL is that it has this cool viewport that doesn’t seem to do anything for you; it doesn’t cull, it doesn’t scissor, this is all left to the programmer as an exercise. The reason for this is that OpenGL is not a magical tool and will not do anything right unless you get it right. There are many ways to determine what is visible and what is not, and it greatly depends on what kind of program you are making and how your data is laid out. In fact, the layout of the data greatly determines whether the program will be fast as lightning, or slow as a donkey.

For the star map code, I decided to chop the entire space up into sectors. A sector is like a square on a map with longitude and latitude lines. Only a few sectors will be visible at any given time, so only a limited number of stars have to be drawn. The chopping-up of the star data file was done by writing a small python script. Running the script on this large input took a while, but remember that when you are preprocessing, you have all the time in the world. When you are rendering frames in realtime, that’s when you don’t have that luxury. Data partitioning is a very fundamental way of making programs act fast on huge amounts of data.

In full screen, it still draws a considerable amount of stars. This results in a large amount of calls to OpenGL functions, and the sheer number of function calls is what is making the program slow down at this point. Luckily, there is a way to make it better. You can point OpenGL to an array of vertex data and tell it to render the entire array in one go.
Something like this:

glEnableClientState(GL_VERTEX_ARRAY);
glEnableClientState(GL_COLOR_ARRAY);

/* point to 2D coordinates */
glVertexPointer(2, GL_FLOAT, 0, sector->star_coordinates);

/* each star has RGB colors */
glVertexPointer(3, GL_FLOAT, 0, sector->colors);

glDrawArrays(GL_POINTS, 0, count);

Do this for every visible sector, and it’s blazing fast.

Optimizing the planets

For drawing planets I take the easy way and use a quadrics object, a glusphere. A quadric is really a 3D object, so it’s odd to use it in a 2D program, but on the other hand it looks kind of cool too. I don’t know what OpenGL does, but what I learned is, it is common practice to compile the quadric into a display list.
From the top of my head:

display_sphere = glGenLists(1);
glNewList(display_sphere, GL_COMPILE);
gluSphere(sphere_quadric, radius, slices, slices);
glEndList();

Now draw the sphere:

glCallList(display_sphere);

The display list now has a compiled-in radius. Unfortunately, not all planets are the same size. Therefore, the sphere has to be scaled to match the size of the planet. This is not without consequences. When you scale objects in OpenGL, so will their normals. The normal vectors are used by OpenGL to compute the correct lighting on the object. Planets without shade don’t look nice, so after scaling, the normals of every vertex (of the many sided sphere) have to be recomputed.

Now, because this planet is not going to change shape, it makes no sense to compute this object over and again for every frame. You can compute it once at program startup and keep a separate copy for every planet. This takes up more memory, but it will be fast.

In this blog entry I’ve given a couple of examples of how to optimize OpenGL programs. One problem that you will encounter is that optimizing sometimes means turning the code upside-down and inside-out. Optimizing performance can involve making major design changes. It is wise to draw things out on paper, and keep a couple of good copies of your code when trying new approaches.