Home       About Me       Contact       Downloads       Part 65    Part 67   

Part 66: Rendering Again

September 27, 2012

I'm working on low-level rendering issues and still don't have a demo. This is the same work I was doing over a year ago, which makes me feel like I'm making absolutely no progress. I find graphics programming really difficult, and the performance issues (especially with shaders) very hard to predict. On top of that, debugging is slow, since it's hard to isolate test cases for subtle glitches in the graphics.

I don't think I have a choice though. If I restrict myself to the 200 meter landscapes that I can render with blocks, the project loses a lot of interest for me. You could be on a planet, an asteroid, inside a cylindrical generation ship, and none of it will show. It will be the same 200 meter view, just with different skies. To build the world I want to build, I have to be able to show scenery out to the horizon, and reflect changes that people have made. Doing that well means working on these performance issues.

Fewer Vertexes

Fig 1: Greedy combination of faces
Last week, I discovered that most of my time was being spent in the vertex shaders and not in the fragment shaders as I had thought. So to improve performance, I need to cut the number of vertexes by combining cube faces. Mikola Lysenko's 0 FPS blog has a post on that topic.

He describes a "greedy algorithm" that is simple to implement. You have a grid of cube faces (Figure 1). Start with the first block, extend to the right as far as you can, then extend this row down as far as you can. That's your first quad to draw. Eliminate that rectangle of faces from the grid, and repeat.

If you do this from all six directions in the chunk, stepping through each layer of bricks, you'll have the quads needed to draw the complete chunk.

There's one obvious problem with this though. When you draw a quad, you want to fill it with a single face texture (stone, wood, etc.). This means you'd actually only extend a quad as long as bricks were the same type as the brick which started the quad.

However, it's not just the cube face that makes it unique -- there's also lighting. Since I'm doing the cheap "ambient occlusion" lighting at the vertexes, and torch/sky lighting on the entire face, that would also have to be taken into account. Only a row of cube faces where all of these things are identical would be combined.

The ambient occlusion at the corners is particularly tricky. Since I'm counting on the averaging done in the shader between vertexes, two adjacent faces with the same lighting at the corners can't in general be combined. They could only be combined if the larger quad with lighting set at its four corners matched the faces I was replacing. If you note all the lighting variations in Figure 2, you'll see that the number of faces I can really combine this way could be quite limited.
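To make that merge condition concrete, here's a rough sketch of the test. It assumes the lighting is interpolated bilinearly across the big quad, which is only an approximation of what two triangles really do, and it assumes each face stores its corner values in the order (top-left, top-right, bottom-left, bottom-right). The name `can_merge` is hypothetical.

```python
def can_merge(faces, tol=1e-6):
    """Check whether a rectangle of faces could be drawn as one lit quad.

    faces[j][i] holds the four corner light values of the face in column i,
    row j, ordered (top-left, top-right, bottom-left, bottom-right).
    The merged quad keeps only the lighting at its four outer corners, so
    every original corner must equal the interpolated value at its position.
    """
    h = len(faces)
    w = len(faces[0])
    a = faces[0][0][0]          # outer top-left
    b = faces[0][w - 1][1]      # outer top-right
    c = faces[h - 1][0][2]      # outer bottom-left
    d = faces[h - 1][w - 1][3]  # outer bottom-right

    def merged_value(x, y):
        # Bilinear interpolation of the four outer corners (an assumption).
        u, v = x / w, y / h
        return ((1 - u) * (1 - v) * a + u * (1 - v) * b
                + (1 - u) * v * c + u * v * d)

    for j in range(h):
        for i in range(w):
            expected = (
                merged_value(i, j), merged_value(i + 1, j),
                merged_value(i, j + 1), merged_value(i + 1, j + 1),
            )
            if any(abs(e - got) > tol for e, got in zip(expected, faces[j][i])):
                return False
    return True


flat = [[(1.0, 1.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)]]
ramp = [[(0.0, 0.5, 0.0, 0.5), (0.5, 1.0, 0.5, 1.0)]]
dip = [[(1.0, 0.5, 1.0, 0.5), (0.5, 1.0, 0.5, 1.0)]]
print(can_merge(flat), can_merge(ramp), can_merge(dip))  # True True False
```

The `dip` case is the troublesome one: the outer corners are all fully lit, so a merged quad would lose the darkening in the middle.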

Fig 2: Lighting on cubes

Anti-Aliasing

What I wanted was a shader that could draw a quad with multiple cube faces. That way, I could combine faces based on the shape alone, getting much better compression. To implement a shader like that, I needed to add a layer of indirection.

Instead of having a texture with actual pixel values in it, I could have a texture with face type indexes in it. Then in the fragment shader, I could first look up the index to use, then the actual pixels with a second lookup into my face texture array. Florian tells me this is called "tile mapping." It was easy to implement, but produced Figure 3. As you can see, the aliasing effects are very bad, even at relatively close distances. I wasn't getting any benefit from multisampling or mipmapping.
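The two-step lookup can be sketched on the CPU like this. It is a plain-Python stand-in for what the fragment shader does, using nearest-texel sampling with no filtering at all; the function name, the list-of-lists textures, and the single-letter "pixels" are made up for illustration.

```python
def sample_tile_map(index_map, tile_array, u, v):
    """Look up one pixel of a quad that covers a grid of cube faces.

    index_map[row][col] is the face type index for each cell of the quad.
    tile_array[index] is a square texture (a list of rows) for that type.
    u and v run from 0 to 1 across the whole quad.
    """
    rows = len(index_map)
    cols = len(index_map[0])
    # Position measured in cells rather than across the whole quad.
    cx = u * cols
    cy = v * rows
    cell_x = min(int(cx), cols - 1)
    cell_y = min(int(cy), rows - 1)
    # First lookup: which face type is drawn in this cell?
    tile = tile_array[index_map[cell_y][cell_x]]
    # Second lookup: the pixel inside that face texture, addressed by the
    # fractional position within the cell.
    size = len(tile)
    tx = min(int((cx - cell_x) * size), size - 1)
    ty = min(int((cy - cell_y) * size), size - 1)
    return tile[ty][tx]


stone = [["S", "s"], ["s", "S"]]
wood = [["W", "w"], ["w", "W"]]
index_map = [[0, 1], [1, 0]]  # stone, wood / wood, stone
print(sample_tile_map(index_map, [stone, wood], 0.1, 0.1))  # S
print(sample_tile_map(index_map, [stone, wood], 0.9, 0.1))  # w
print(sample_tile_map(index_map, [stone, wood], 0.3, 0.8))  # W
```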

Fig 3: Tile Mapping
Fig 4: Mipmapped 2x2 texture

To see what was going on, I switched to a black and white tile pattern. Figure 4 shows a 2 by 2 texture repeated over a single large quad. That allows mipmapping to work on the interior of the tiles. I'm using the "Linear Mipmap Linear" style on the textures. This is about as good as it gets.

I can't actually get anti-aliasing this good without creating a separate texture combining multiple cube faces for each quad. The storage for that would be prohibitive. Instead, I'm currently drawing a separate quad for each face (two triangles, actually). That also gets antialiasing from multisampling (Figure 5). In the enlarged portion, you can see the multisampling is blending the edges of the quads.

One theory of why Figure 3 looks so bad is that multisampling only applies to the edges of primitives. Since all my tile changes happen within the shading of a single quad, I don't get any multisampling. In Figure 6, though, I'm using my tile mapping shader again but drawing only a single cell per quad. I would have thought that multisampling would again blend the edges, but it doesn't appear to.

Fig 5: Multisampling quads
Fig 6: Tile mapping single quads

Florian sent me a long email about multisampling and mipmapping that I'm still digesting. I need to do an experiment with mipmapping. If I create my own mipmap levels with a different color for each level, I should be able to see easily if I'm getting any mipmapping on my tile map shader. I'm wondering if the odd way I address the texture array is causing mipmapping to be disabled along with multisampling.
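A sketch of how the colored mipmap levels for that experiment might be generated. Only the CPU side is shown, and the color table and function names are assumptions. Uploading is left out; in OpenGL it would be one image upload call per level, with the level number passed explicitly.

```python
# Hypothetical debug palette: adjacent levels always get different colors.
DEBUG_COLORS = [
    (255, 0, 0), (0, 255, 0), (0, 0, 255),
    (255, 255, 0), (255, 0, 255), (0, 255, 255),
]


def debug_mip_chain(base_size):
    """Return [(level, size, color)] for a square power-of-two texture."""
    if base_size < 1 or base_size & (base_size - 1):
        raise ValueError("base_size must be a power of two")
    chain = []
    level, size = 0, base_size
    while True:
        chain.append((level, size, DEBUG_COLORS[level % len(DEBUG_COLORS)]))
        if size == 1:
            break
        size //= 2
        level += 1
    return chain


def solid_level(size, color):
    """Build the pixel rows for one level: a size x size block of one color."""
    return [[color] * size for _ in range(size)]


for level, size, color in debug_mip_chain(16):
    pixels = solid_level(size, color)
    print(level, size, color, len(pixels) * len(pixels[0]))
```

If the rendered scene only ever shows the level 0 color, no mipmapping is happening on that shader path. Bands of different colors in the distance would mean the levels are being selected.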

Even if I could get multisampling working again, if it only applies to the edges of the quads, I'm going to get bad aliasing as in Figure 3. To fix that, I think I have to do my own AA calculations in the shader. I have no idea what that would look like or how expensive it would be.

Lighting

On top of these aliasing issues, there's also the problem of lighting. How am I going to set individual lighting values for the corners of cube faces when I draw several at a time? I am sending the texture array index in the red channel of the tile map. There's no reason I can't send lighting information in the green, blue and alpha channels.

Right now, I evaluate the cheap ambient occlusion lighting (see Part 35) on each corner of a face. There are only 3 possible states at each corner (the number of blocks that can be connected to the corner). So for all four corners, there are 3*3*3*3 = 81 possible cases. I could do the ambient lighting by sending a second index into another texture array, with a texture for each possible case. This would actually allow me to do slightly nicer shading, since I could precalculate better gradients.
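One way to number those 81 cases is to treat the four corner states as digits of a base-3 number. This is just a sketch of the arithmetic; the corner ordering and the function names are assumptions.

```python
def ao_case_index(corners):
    """Map four corner states (each 0, 1 or 2) to a case index in 0..80."""
    if len(corners) != 4 or any(c not in (0, 1, 2) for c in corners):
        raise ValueError("expected four corner states in the range 0..2")
    index = 0
    for state in corners:
        index = index * 3 + state  # shift left one base-3 digit, add the next
    return index


def ao_case_corners(index):
    """Inverse of ao_case_index: recover the four corner states."""
    if not 0 <= index < 81:
        raise ValueError("index must be in the range 0..80")
    corners = []
    for _ in range(4):
        corners.append(index % 3)
        index //= 3
    return tuple(reversed(corners))


print(ao_case_index((0, 0, 0, 0)))  # 0
print(ao_case_index((1, 0, 2, 1)))  # 34
print(ao_case_index((2, 2, 2, 2)))  # 80
print(ao_case_corners(34))          # (1, 0, 2, 1)
```

An index in 0..80 fits comfortably in a single 8-bit channel.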

The overall block lighting is currently the average of the neighbors, so it would have far too many combinations to have another texture array index. I'm not sure what to do to handle that. Even if I did have a texture-based approach, then we'd be talking four texture lookups per pixel. I have no idea what that would do to my performance.

The bottom line is that a shader which draws multiple faces might be a dead end. Of course, if lighting means the cube faces are all different from one another, then combining faces might be pretty much a dead end as well. I don't know yet.

Test Cases

Testing various approaches to showing distant scenery is a problem. My full Crafty app has a lot going on -- it's loading the chunks from the object store, creating vertex and index buffers for them, and moving them into the display. It's also culling chunks outside the view frustum, and handling the split and join of different resolution areas as you move. It all makes for a very complicated test case.

On top of that, I can theorize about what various techniques would do to buildings in the distance, but I haven't had any demos that actually had distant buildings. I've been doing all my testing on procedurally generated landscape.

Then I realized that my draw code can now handle a fair amount of landscape at interactive speeds. So I converted the old TwentyMine server save files to my format and extracted a 128 by 2048 brick chunk. That looks like Figure 7.

Fig 7: A strip of landscape, 2048 by 128 by 128

This is 5.9 million solid face vertexes, 1 million transparent face vertexes, and 9.3 million non-cube vertexes (grass, rails, etc.), for a total of about 16 million vertexes. On my machine, it renders in about 30ms, making it a reasonable test case. This is also 2km of distance in the world, which is a decent amount.
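For a sense of scale, the arithmetic behind those figures works out as below. The inputs are the rounded numbers quoted above, so the results are rough, and the helper name is made up.

```python
def frame_stats(vertex_counts, frame_ms):
    """Return (total vertexes, frames per second, vertexes per second)."""
    total = sum(vertex_counts)
    fps = 1000 / frame_ms
    vertexes_per_second = total * 1000 / frame_ms
    return total, fps, vertexes_per_second


# solid faces, transparent faces, non-cube shapes (grass, rails, etc.)
total, fps, rate = frame_stats([5_900_000, 1_000_000, 9_300_000], 30)
print(f"{total / 1e6:.1f} million vertexes")     # 16.2 million vertexes
print(f"{fps:.1f} frames per second")            # 33.3 frames per second
print(f"{rate / 1e6:.0f} million vertexes/sec")  # 540 million vertexes/sec
```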

So far, I've only done some quick tests on this. I sampled it down by powers of two in the distance, which I already know looks bad for landscape. It's not quite as horrible as I expected on this Minecraft data. If you want to look at a full-resolution version without any compression, go here (1.3 meg)

Looking at this, I think I need more test cases. I'd really like to have mockups of the various worlds I've been thinking of. The problem is adding buildings. Back in Part 22, I just hand-selected a few buildings and placed them on the landscape. I want something denser here.

Since the Minecraft landscapes are made of just a few block types, I should be able to skim off the buildings by filtering these out. Then the remaining bricks could be added to a new procedural landscape. I'll have to see what works.
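A first pass at that filter might look something like this. The block type names and the dictionary layout are placeholders for illustration, not the real TwentyMine or Crafty data format.

```python
# Hypothetical set of block types treated as natural landscape.
LANDSCAPE_TYPES = frozenset({"stone", "dirt", "grass", "sand", "gravel", "water"})


def skim_buildings(bricks, landscape_types=LANDSCAPE_TYPES):
    """Keep only the bricks whose type is not a landscape type.

    bricks maps an (x, y, z) position to a block type name.
    """
    return {
        position: kind
        for position, kind in bricks.items()
        if kind not in landscape_types
    }


world = {
    (0, 0, 0): "stone",
    (0, 1, 0): "dirt",
    (0, 2, 0): "planks",
    (1, 2, 0): "glass",
    (2, 0, 0): "water",
}
print(skim_buildings(world))
# {(0, 2, 0): 'planks', (1, 2, 0): 'glass'}
```

A filter this simple is lossy in both directions: a stone wall would be dropped as landscape, and anything natural that isn't in the set would come through as a building.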

More next week.
