Home       About Me       Contact       Downloads       Part 12    Part 14   

Part 13: The Cake is a Lie!

February 18, 2011

To recap, I was coding this game using DirectX 9, and my best timing was this:

Opaque: 1.87 ms;   Transparent: 16.49 ms;   Total: 18.36 ms;   Frame rate: 54 fps
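For reference, the frame rate in these timing lines follows directly from the total time per frame. A minimal sketch, assuming the figure is simply 1000 divided by the total milliseconds and truncated to a whole number (the function name is made up for illustration):

```cpp
#include <cassert>

// Convert a total frame time in milliseconds to whole frames per second.
// Truncating rather than rounding is an assumption; it happens to
// reproduce the frame rates quoted in this post.
int framesPerSecond(double totalMs)
{
    assert(totalMs > 0.0);
    return static_cast<int>(1000.0 / totalMs);
}
```

Calling `framesPerSecond(1.87 + 16.49)` gives the 54 fps shown above.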

I needed a feature called texture arrays, which weren't available in DX9, so I decided to try OpenGL. My first version, using the same logic as the DirectX version (no texture arrays), got this timing:

Opaque: 1.95 ms;   Transparent: 27.19 ms;   Total: 29.14 ms;   Frame rate: 34 fps

This was much slower for the transparent data. This is the same code generating the same triangle list, which takes up the same amount of space, so there's exactly the same amount of work to do. Although I haven't dug into it, I assume there's some extra overhead in Windows to get to OpenGL calls, and added up over thousands of individual triangles, this slows things down.

I wasn't worried though, since I was now going to use texture arrays, which would reduce all the separate transparent triangles down to a single API call. This benefits the opaque data too, since they are also a single API call now, instead of the 20+ separate buffers I was using. The texture array version has this timing:
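A rough way to see the saving is to count API calls. The sketch below assumes the transparent triangles have to be submitted in a fixed order, so a new draw call is needed every time the texture changes. The `Triangle` record and both function names are hypothetical, not taken from the game's source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-triangle record: which texture the triangle uses.
struct Triangle
{
    int textureId;
};

// Without texture arrays, a new draw call is needed whenever the bound
// texture changes, so the call count equals the number of runs of
// consecutive triangles that share a texture.
std::size_t drawCallsWithSeparateTextures(const std::vector<Triangle>& ordered)
{
    std::size_t calls = 0;
    for (std::size_t i = 0; i < ordered.size(); ++i)
    {
        assert(ordered[i].textureId >= 0); // ids assumed non-negative
        if (i == 0 || ordered[i].textureId != ordered[i - 1].textureId)
            ++calls;
    }
    return calls;
}

// With a texture array, the texture id travels with each vertex as a
// layer index, so the whole list can go out in one call (none if empty).
std::size_t drawCallsWithTextureArray(const std::vector<Triangle>& ordered)
{
    return ordered.empty() ? 0 : 1;
}
```

In the worst case, where neighbouring triangles never share a texture, the first function returns one call per triangle, which is the situation the texture array removes.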

Opaque: 1.56 ms;   Transparent: 9.88 ms;   Total: 11.44 ms;   Frame rate: 87 fps

Almost all of this time is in my code, to generate the transparent triangle list. Also, understand that this is not the typical drawing case for the final game. The code is drawing a 128 by 128 by 128 chunk of scenery, all in one shot. The real game will render multiple smaller chunks and page them in and out as you move around. This is just my speed test case.

This program ran fine on my Windows 7 desktop and my Windows XP machine, which was good. But it wouldn't run on my Windows 7 laptop, due to differences between the shader compilers for NVidia and ATI displays.

I had written my shaders based on the "stock shaders" supplied with the OpenGL SuperBible. In my ignorance, I didn't realize they are old version 1.2 shaders, not the newer 3.3 version. I hadn't even declared the version in the shader, so I was getting the defaults under the two different compilers. When I added the line "#version 330", the behavior of the compilers changed. They both hated my code...

I got the shaders working on my main desktop machine, which has an NVidia display. The changes were trivial, since I'm not doing anything interesting in my shaders yet. I just needed to change the names of some keywords, etc. This again ran fine on my Windows XP machine, which also has an NVidia display. On the laptop with the ATI display, it compiled the shaders cleanly, but they didn't run. At all. I got a big black window with a pure white rectangle in the top-right quarter of the screen.


Turns out that for reasons I don't understand, the declaration of texture coordinates as "in vec2 vTex0" does nothing on the ATI display. It has to be "smooth in vec2 vTex0" or "centroid in vec2 vTex0". Under NVidia, all three work. I thought that "smooth" and "centroid" were options to change the interpolation of texture coordinates ("smooth" gets you better perspective transform of the texture), but I didn't think they were required.

ATI was also giving me a warning about re-declaring the builtin variable "gl_FragColor", whereas NVidia would give me an error (and fail) if I didn't. So I wasn't exactly getting warm fuzzy feelings about debugging a real game that people would be able to run on the vast variety of hardware out there. But since this was all running nicely on my three Windows machines, I moved on to the next platform.
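To make the shader changes concrete, here is a minimal sketch of a "#version 330" shader pair, embedded as C++ strings the way they might be handed to the compiler. The uniform and attribute names (`mvpMatrix`, `vVertex`, `vTexCoord`, `colorMap`, `fragColor`) are illustrative, not the ones from my code. Declaring a user-named output instead of touching "gl_FragColor" is one common pattern for this version; whether it silences the warning and error described above on both drivers is an assumption I haven't verified.

```cpp
#include <cassert>
#include <string>

// Vertex shader: passes the texture coordinate through with an explicit
// "smooth" qualifier, matching the fragment shader's declaration.
const std::string kVertexShader = R"(#version 330
uniform mat4 mvpMatrix;
in vec4 vVertex;
in vec2 vTexCoord;
smooth out vec2 vTex0;
void main()
{
    vTex0 = vTexCoord;
    gl_Position = mvpMatrix * vVertex;
}
)";

// Fragment shader: "smooth in" rather than a bare "in", and a
// user-declared output variable for the final color.
const std::string kFragmentShader = R"(#version 330
uniform sampler2D colorMap;
smooth in vec2 vTex0;
out vec4 fragColor;
void main()
{
    fragColor = texture(colorMap, vTex0);
}
)";

// True when the source text begins with the given #version directive.
// A cheap sanity check before handing the text to the driver.
bool declaresVersion(const std::string& source, int version)
{
    assert(version > 0);
    const std::string expected = "#version " + std::to_string(version);
    return source.compare(0, expected.size(), expected) == 0;
}
```

Checking `declaresVersion(kFragmentShader, 330)` at load time would have caught my original mistake of leaving the version out entirely.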

Mac OS X

I fired up my seldom-used Hackintosh, which is a Shuttle desktop machine with an NVidia display. I had compiled an OpenGL sample for the Mac two weeks ago, before even considering OpenGL, so I didn't expect any problems. I read through enough of the documentation and tutorial on XCode, the Mac development environment, to get started.

The part of my code which is intended to be platform independent compiled more or less cleanly. I had a few Windows system calls to get rid of, and some minor variations in the standard C library to deal with. Unfortunately, the OpenGL code was throwing off all kinds of errors. I didn't take that seriously at first, since I thought I just needed to find the right include file and it would compile.

Under Windows, you fire up the ancient version 1.1 interface for OpenGL that Windows supports. Then using a library called "glew", you get the real interface to a more recent version of OpenGL. Version 3.3 was the best my display supports, so that's what I coded against. I expected the same kind of deal under MacOS. The OpenGL SuperBible had chapters on getting their code to work under MacOS, and all their examples were version 3.3.

But I just couldn't find the right include file, no matter what I tried from the book. So I went to their website, which included this little note:

So here's the thing. When we first started this book we thought it inconceivable that Apple would not have an OpenGL 3.x implementation available for OS X by the time the book shipped. The latest version of OpenGL is 4.1, and as of this date, OS X 10.6.5 (Snow Leopard), Apple's most advanced and sophisticated operating system still only supports OpenGL 2.1.

Yes, Apple, you should be embarrassed. I love my Mac, and will never go back to Windows... but honestly, really?

The cake is a lie.

I had actually visited the web site to download their Windows source code, but hadn't bothered to read the rest of the page where it talks about Mac OS X. If I had, I wouldn't have even tried to use OpenGL.

I'm not sure what to do now. I can back the shaders off to version 1.2, which is what the Mac supports. And I can change the C++ code to use version 2.1 of OpenGL everywhere, even on Windows and Linux. After all, the whole point of using OpenGL was to have a common interface on all platforms. Still, I hate the idea of learning a new API which is already out of date, and using it just for compatibility.

Of course, since I won't finish this project for months yet, I could also just wait until the next MacOS release (Lion) comes out later this year, and hope it has 3.3 support. Even using OpenGL 3.3 on Windows only has its advantages -- I can write 3.3 shaders that work on both Windows 7 and XP. I can't do anything comparable with DirectX.

I suppose the thing to do would be to try Linux next and see what I get. My Linux box runs some version of Ubuntu and has no display card. Based on my experience so far, what are the odds that there will be good OpenGL support for Linux on an integrated ATI chipset?

I could of course dual-boot Linux on one of my more recent machines, but there's no good choice. The Hackintosh is incredibly fragile and I'm not messing with that again after the nightmare it was to get working in the first place. The Windows XP machine is six-year-old hardware. The Windows 7 machine is fine, but my experience with upgrades is like something out of this xkcd cartoon. Putting Linux on the laptop would leave me with the same problem as my server: integrated ATI graphics.

I suppose I might buy an NVidia display card and stick it in the Linux server. Or I have an NVidia 7900 GS display card that might run in that machine. Do any of you know if OpenGL 3.3 will be supported on that card under Linux?

I haven't given up on OpenGL yet, but I am disgusted with the whole situation. I will take another whack at MacOS or Linux after I cool down a bit.

DirectX again

Commenter "kpt" mentioned that I could use a "texture atlas" instead of a texture array to improve my performance under DX 9. This isn't an API feature, just a different use of the standard textures. You combine all your little textures into one great big one, and then only call out subsections of it when you paint triangles. It would work, but it has two problems:

First, you can't repeat your textures. If you have a big area of water or grass, a single texture can normally be repeated over the entire area. With an atlas, trying to repeat would extend the texture coordinates into the next texture.

In Minecraft, Notch apparently makes his textures into a big column, so it gets wrapped in one of the directions (width). Then if he limits himself to runs of blocks in only one coordinate (X), he doesn't hit this problem. The texture coordinates never stray into the next texture vertically, and he gets repetition horizontally. Unfortunately in my code, I texture 2 by 2 or 4 by 4, etc. faces of larger blocks in the octree, so I would have to code around this problem by breaking large faces down into individual cubes.
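The repeat problem is easy to show with a little arithmetic. This is a minimal sketch assuming a square atlas of equally sized tiles laid out row by row; `UV`, `atlasCoord` and `tileAt` are names invented for the example.

```cpp
#include <cassert>

struct UV
{
    float u;
    float v;
};

// Map a tile-local coordinate (0..1 across one tile) to atlas coordinates
// for a square atlas holding tilesPerRow x tilesPerRow equal tiles.
UV atlasCoord(int tileIndex, int tilesPerRow, float localU, float localV)
{
    assert(tilesPerRow > 0);
    assert(tileIndex >= 0 && tileIndex < tilesPerRow * tilesPerRow);
    const float tileSize = 1.0f / static_cast<float>(tilesPerRow);
    const int column = tileIndex % tilesPerRow;
    const int row = tileIndex / tilesPerRow;
    return { (column + localU) * tileSize, (row + localV) * tileSize };
}

// Report which tile an atlas coordinate actually samples from.
int tileAt(UV uv, int tilesPerRow)
{
    assert(tilesPerRow > 0);
    const int column = static_cast<int>(uv.u * tilesPerRow);
    const int row = static_cast<int>(uv.v * tilesPerRow);
    return row * tilesPerRow + column;
}
```

With a 4 by 4 atlas, `atlasCoord(5, 4, 0.5f, 0.5f)` lands in tile 5 as expected, but a local coordinate of 1.5 (an attempt to repeat the texture across a 2-wide face) lands in tile 6, the neighbour to the right.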

Second, there are round-off error and scaling issues. If the texture coordinates go even slightly outside the 0-1 range of an individual texture, you will stray into the next texture in the atlas, which will look very bad (e.g. a black edge of obsidian flickering in and out around your grass as you move). The way to avoid that is to enlarge all the small textures with a border, so that there's room for a little slop in the coordinates.
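As a sketch of the border idea, the function below computes the coordinate range of the real image inside a padded atlas cell along one axis. The layout (every cell is the image plus the same border on each side, cells packed side by side) and all the names are assumptions made for the example.

```cpp
#include <cassert>

struct Range
{
    float lo;
    float hi;
};

// Atlas-space range along one axis that covers only the inner image of
// cell `cellIndex`, when every cell is imageTexels wide plus
// borderTexels of padding on each side.
Range paddedTileRange(int cellIndex, int cellsPerRow, int imageTexels, int borderTexels)
{
    assert(cellsPerRow > 0 && imageTexels > 0 && borderTexels >= 0);
    assert(cellIndex >= 0 && cellIndex < cellsPerRow);
    const int cellTexels = imageTexels + 2 * borderTexels;
    const float atlasTexels = static_cast<float>(cellTexels * cellsPerRow);
    const int start = cellIndex * cellTexels + borderTexels;
    return { start / atlasTexels, (start + imageTexels) / atlasTexels };
}
```

For example, a 96 texel image with a 16 texel border in a 4-cell row gives cell 1 the range 0.28125 to 0.46875, leaving 16 texels of the same texture on either side before the neighbouring cell begins.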

The problem with this is mipmapping. The API creates reduced scale versions of the texture for use at a distance. I can put a border on the top 128 by 128 image, but this border will disappear in the reduced versions. So I have to put a huge border on the textures to still have one at the smaller scales. It's a nuisance.
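The size of the border needed can be worked out, assuming each reduced version halves the resolution of the one before it. Both helper names here are hypothetical.

```cpp
#include <cassert>

// Number of halvings needed to reduce a tile from fromTexels down to
// toTexels (both assumed to be powers of two).
int halvingsBetween(int fromTexels, int toTexels)
{
    assert(toTexels > 0 && fromTexels >= toTexels);
    int levels = 0;
    while (fromTexels > toTexels)
    {
        fromTexels /= 2;
        ++levels;
    }
    return levels;
}

// Border width required at full resolution so that texelsAtCoarsest
// border texels still remain after mipLevels halvings.
int baseBorderTexels(int mipLevels, int texelsAtCoarsest)
{
    assert(mipLevels >= 0 && mipLevels < 16);
    assert(texelsAtCoarsest >= 0);
    return texelsAtCoarsest << mipLevels;
}
```

Keeping a single texel of border when a 128 by 128 image is reduced to 8 by 8 takes four halvings and therefore a 16 texel border at full size. Keeping it all the way down to 1 by 1 takes a 128 texel border, as wide as the image itself.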

Since I went to a lot of trouble to isolate the DirectX vs OpenGL interfaces from the rest of my code, and since I wanted some kind of demo to release this week, I went ahead and did a new DX version that uses the texture atlas approach. Here is the timing on that:

Opaque: 1.62 ms;   Transparent: 11.73 ms;   Total: 13.35 ms;   Frame rate: 74 fps

The transparent data drawn with the atlas technique is a bit slower than OpenGL because I have to expand all the 2x2x2 and larger cubes in the Octree into single cubes for texturing. The code to test for that is in the inner loop of creating the transparent index list.
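The expansion itself is simple; the cost comes from how many cubes it produces. A minimal sketch, with a made-up `Cube` record standing in for an octree node (the real code works on the octree and index list directly):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical octree block: minimum corner plus edge length in cells.
struct Cube
{
    int x;
    int y;
    int z;
    int size;
};

// Break a size x size x size block into unit cubes so that each face can
// be textured from a single atlas tile without repeating.
std::vector<Cube> expandCube(const Cube& block)
{
    assert(block.size > 0);
    const std::size_t edge = static_cast<std::size_t>(block.size);
    std::vector<Cube> units;
    units.reserve(edge * edge * edge);
    for (int dz = 0; dz < block.size; ++dz)
        for (int dy = 0; dy < block.size; ++dy)
            for (int dx = 0; dx < block.size; ++dx)
                units.push_back(Cube{ block.x + dx, block.y + dy, block.z + dz, 1 });
    return units;
}
```

A 2x2x2 block becomes 8 cubes and a 4x4x4 block becomes 64, so the triangle count grows quickly wherever the octree had merged large regions.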

The Demo

The new demo is at The Part 13 Demo. I've been doing so many experiments with it that many features were broken. To simplify things, I deleted all the code that isn't working from the source below. That means there are no text overlays for help, and no interference checking.

The UI is back to where it was in Part 2. Use WASD keys for movement, or use the cursor arrows. Hit ESC to exit the program.

The program defaults to the OpenGL implementation. To select the DirectX version, edit the "options.xml" file with a text editor, and set the "platform" attribute to "DirectX9", or back to "OpenGL3.3".
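For the curious, picking the implementation from that attribute can be as simple as the sketch below. It is not the demo's actual parsing code: the naive string search, the `Platform` enum and both function names are stand-ins, and the `<options>` element name used in the usage note is a guess at the file's layout.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class Platform
{
    OpenGL33,
    DirectX9,
    Unknown
};

// Naive attribute lookup: returns the text between the quotes of the
// first name="..." found, or an empty string if there is none.
// A real XML parser would be more robust than this string search.
std::string attributeValue(const std::string& xml, const std::string& name)
{
    assert(!name.empty());
    const std::string key = name + "=\"";
    const std::size_t start = xml.find(key);
    if (start == std::string::npos)
        return "";
    const std::size_t valueStart = start + key.size();
    const std::size_t end = xml.find('"', valueStart);
    if (end == std::string::npos)
        return "";
    return xml.substr(valueStart, end - valueStart);
}

// Map the two documented attribute values to a backend choice.
Platform parsePlatform(const std::string& value)
{
    if (value == "OpenGL3.3")
        return Platform::OpenGL33;
    if (value == "DirectX9")
        return Platform::DirectX9;
    return Platform::Unknown;
}
```

Given the text `<options platform="DirectX9"/>`, `parsePlatform(attributeValue(text, "platform"))` yields `Platform::DirectX9`, and anything unrecognized comes back as `Platform::Unknown`.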

If the program fails under OpenGL, you'll find a file called "errors.txt" in the demo directory. Please email that to me at

The Source Code

Download The Part 13 Source for the C++ code, a roadmap to the source, and a build directory. This includes the executable demo and the files it needs. If you download this, you don't need to also download the demo zip above.

If you want to compile the code, the project is built with Microsoft Visual C++ 2010 Express. You will also need the DirectX Software Development Kit. It's possible you'll need the Windows 7 Software Development Kit too. These are all free from Microsoft.

Unfortunately, all three of these downloads are huge. Hopefully, if you are interested in game development, you already have them or an equivalent development environment. If not, I hope you have a fast internet connection! Download and install them in that order - Visual C++, then the DirectX SDK, then the Windows SDK.


The code has been substantially rearranged in the last two months, and there have been many experiments like the ray tracer which are not part of the demo source code. I want to start publishing the size and time statistics again though, so here's a new start point.

Full Project
  TheGame lines       5,822
  Framework lines     6,336
  Utilities lines     6,821
  Total              18,979

  Coding hours        299.0
  Writeup hours        66.9

I have company coming this weekend, so the next part may be delayed.


When I reintegrated the DirectX9 code into the demo, I left out a bit of initialization and a bit of the resize logic. For some reason, it still worked under Win7, and I didn't test it again under XP. (My bad!)

I've fixed this and there is now a new version of the demo and source. If DirectX was failing for you, try it again.
