
Part 64: Distant Blocks

September 13, 2012

Today we see the price of doing weekly blog entries -- I don't have anything even remotely finished, I'm annoyed at several issues, and I don't feel well. So this should be a really upbeat entry!

Fig 1: Blocks and polygons

I've talked for ages about mixing blocks and polygon-based scenery in SeaOfMemes. I want to use blocks there for building, but have the landscape be smooth. The only changes you'd make in the landscape would be digging holes or leveling a building area.

In Crafty, on the other hand, the whole world is blocks, and it's more a matter of how to represent distant blocks. Figure 1 shows what I had months ago -- a heightmap-based landscape in the distance, and editable blocks nearby. This has problems:

  • The heightmap scenery doesn't necessarily match the block data. Players might have carved the mountain or built fortresses there, but you won't see that in the heightmap data. Even if I tried to summarize the block data, what would it look like? A reduced-resolution heightmap version of a building would be some lumpy shape.

  • Where the two styles meet up, the edge is a mess. I have to draw vertical walls at the boundary of the heightmap cells, and it doesn't look right at all. For one thing, what color should they be? Exposed earth?

  • The lighting doesn't match. The surfaces of all the blocks are flat or vertical, and the polygons are at arbitrary angles. There's no single lighting formula that makes them look the same. I also have the problem that blocks are textured with grass or stone patterns, and the distant polygons aren't -- each vertex is a pure color.

  • I don't think it's right artistically. True, the blocks would fade into landscape as they got tiny in the distance, but I can't render blocks out that far. I find the transition from blocks to smooth to be ugly.

So Use Blocks!

Fig 2: Larger blocks in the distance
The alternative is to do the distant scenery with blocks too, just at lower resolution. I've shown screenshots like that before, and people have commented that it doesn't look bad. But I've been selective about what I show. Figure 2 shows a version out of the newest code. It also has problems:

  • The transitions between different scales are really abrupt. Notice the curve of bricks at the bottom of the image, and the way it vanishes at the next power of two. That's because the source heights from the landscape generation function are rounded vertically to the coarser resolution.

  • The horizon is generally not flat. Even if the landscape were all y=123, for example, the different resolutions produce steps. At size=1 blocks, we get 123. At size=2, we get 122. Continuing through the powers of two, we get heights at 120, 120, 112, 96, 128, etc. At the far horizon, we get these huge blocks. Since my view distance goes out to 64km, and grids are 32 by 32 by 32, these will be 2km on a side. Really huge, in other words.

  • Even with fog (not shown), I find the scale difficult to work out visually. The distant blocks look too much like nearby ones.

  • I use vertex lighting to do cheap "ambient occlusion" as in Minecraft. What should I do on larger blocks? They shouldn't have these huge shadows. If I cut them back somehow, the landscape gets very monochrome, since everything is lit from the top.
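The uneven horizon above can be made concrete with a minimal sketch, assuming each coarser level simply floors the height down to a multiple of its block size (the real generator may round differently at the largest sizes):

```python
# Sketch: how a flat landscape at y=123 steps down at coarser LODs,
# assuming each level floors the height to a multiple of its block size.

def lod_height(height, block_size):
    # Snap the height down to the coarser vertical grid.
    return (height // block_size) * block_size

sizes = [1, 2, 4, 8, 16, 32]
heights = [lod_height(123, s) for s in sizes]
print(heights)  # [123, 122, 120, 120, 112, 96]
```

Under this assumption, a perfectly flat y=123 plain turns into a staircase as you look toward the horizon, one step per power of two.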

I decided to try a compromise. I render the distant scenery with a heightmap, but draw it as blocks. Patches are still 32 by 32 and get coarse in the distance, but the heights are accurate. This gives a more gradual landscape and the blocks in the distance match the nearby blocks. It all looks like a "block world" and I can use actual block data to build the reduced resolution versions. So a large structure would still show in the distance. That looks like Figure 3.

Fig 3: Accurate heights on blocks

Fig 4: Lighting blocks
This looks promising and I'll probably continue along these lines. I do have a number of issues though:
  • If I light this from above like Minecraft, it completely washes out. You just can't see the distant steps.

  • Water and other transparent blocks are a problem. If I just paint the scenery with opaque water blocks, the transition between real cube data and scenery doesn't work. You can see through the nearby water blocks, and there's nothing below the surface in the scenery.

    I could just hack in water at a constant height above the bottom, but that won't accurately summarize distant scenery. You might have structures in the middle of a lake, or fountains, or buildings with water running down the side. It won't handle other transparent block types. To do that right, I need to keep a second heightmap with transparent blocks, drawn above the one you see.

  • If I summarize real block data, a heightmap is kind of limiting. I still need to import some real Minecraft data and see how this looks. Imagine the distant buildings with all underhangs removed, and then sampled down by powers of two. Could get nasty!

  • The reduced resolution cubes (Fig 2) approach is completely general. A heightmap doesn't let me do floating islands or any other 3D structure. But if I start getting fancy there (use a convex shape around the cube data?), it gets more expensive to draw. The only way I can give players a view out to the horizon is with simple, fast graphics like heightmaps.

  • Lighting in an all-block world is fundamentally strange. In Figure 4, you can see light nearby, because you are above the tops of the blocks. As height increases in the distance, you can't see the tops, and it all goes dim. This would happen in the real world too, but only if your world were made of blocks!

The bottom line is I'm still fussing over this.

Instancing

Back at the beginning of the project, people told me to use instancing to draw all my cubes. Unfortunately, instancing isn't supported in OpenGL 2.1, so I couldn't use it and support the Mac (at that time) or tablets. Also, I didn't really understand how it worked.

I could see from the documentation that the instanceID variable is incremented for each draw of the buffer, but I didn't know what you could do with that. The example I looked at just placed blades of grass on a grid. I knew that I needed to place cubes at arbitrary locations and wasn't sure how to get that data into the shader. I hadn't realized you could pass coordinate data or transforms through textures. (OpenGL 2.1 also doesn't have anything but 8-bit textures.)

What I did instead was pack my vertex data into bit fields of integers, instead of using floats. Part 16 and Part 17 cover my efforts on that. The result is that instead of 36 bytes per vertex (9 floats), I use 8 bytes. That lets me leave a large number of chunks in the display without swapping them as you move.
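As a rough illustration of the idea (the actual field layout from Part 16 and Part 17 isn't shown here, so the widths below are my own guesses), packing a vertex into bit fields might look like:

```python
# Sketch: packing block-vertex data into integer bit fields instead of
# floats. The layout is hypothetical: 6 bits per coordinate (covers
# 0-32, since a chunk edge needs the value 32), 3 bits for one of six
# face normals, and 6 bits per texture coordinate.

def pack_vertex(x, y, z, normal, tex_u, tex_v):
    packed = x                 # bits 0-5
    packed |= y << 6           # bits 6-11
    packed |= z << 12          # bits 12-17
    packed |= normal << 18     # bits 18-20
    packed |= tex_u << 21      # bits 21-26
    packed |= tex_v << 27      # bits 27-32
    return packed

def unpack_vertex(packed):
    return (packed & 0x3F, (packed >> 6) & 0x3F, (packed >> 12) & 0x3F,
            (packed >> 18) & 0x7, (packed >> 21) & 0x3F, (packed >> 27) & 0x3F)

v = pack_vertex(31, 12, 5, 3, 16, 16)
assert unpack_vertex(v) == (31, 12, 5, 3, 16, 16)
```

The point is just that everything a cube vertex needs fits comfortably in one or two machine words, instead of nine floats.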

Unfortunately, this doesn't work at all under OpenGL 2.1, where integer vertex attributes are not supported and there are no bit shift operators in the shaders. So ironically, on the older displays with limited memory, I'm using a lot of memory, and on the newer displays with lots of memory, I compress vertexes and need less.

Things got worse in Part 34 when I introduced non-cube shapes. These have lots of triangles, and there are parts of the scenery where many blocks are non-cubes (grass, for example). This puts a huge amount of data into the vertex buffers. Again, under OpenGL 3.2, I compress the vertexes, and under OpenGL 2.1, I can't.

In fact, if you play with McrView on a complex world, you'll notice that it pages scenery chunks in and out a lot (fades them in again when you turn your head). This is because the entire region around the player just doesn't fit in display memory. Sometimes it also drops frames, because the chunks are now so huge that they can't be transferred to the display in time.

It can be done!

I understand how instancing works now, and have frequently thought I should rewrite the code to use it. Then complex objects like grass or rail tracks would take almost no room in the display. But that would require two completely different versions of the code for OpenGL 2.1 and 3.3. So I didn't do anything about it.

Then I realized there is a way to do instancing under OpenGL 2.1. It requires three things to happen:

  1. We have to draw N copies of the object. Real instancing lets you specify a count, and the buffer is repeated that many times. Without it, we can define a buffer with 1000 copies of the cube in it. Then if we want 10,000 cubes drawn, we just draw that buffer ten times. For fewer than 1000 cubes, DrawArrays takes a range of indexes, allowing us to draw less than the full buffer.

  2. There has to be an instanceID. But, I can add an id to each vertex in my buffer and initialize them in my code to 0 through 999. Then the shader can key off this field of the vertex just as it would a real instanceID. To draw more than 1000 instances, I can add a uniform "baseIndex", which is added to the vertex id. So the first draw is with baseIndex=0, then the second time with baseIndex=1000, and so on.

  3. I need to send data to the shader via a texture. Textures with floating point values would be required for arbitrary transforms, but not for my application. All I want to send is x,y,z for a block offset within a chunk, and these are all in the range 0-31. A texture with 8 bits of precision per color can handle this without problems.
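Putting the three pieces together, the draw loop would look something like this sketch; the names (CUBES_PER_BUFFER, base_index, shade_cube) are mine, not from the engine:

```python
# Sketch of the OpenGL 2.1 instancing work-around. A buffer holds 1000
# copies of the cube, each vertex carrying a baked-in id 0..999; a
# "baseIndex" uniform shifts those ids on each draw call, so the shader
# can reconstruct a global instance id without gl_InstanceID.

CUBES_PER_BUFFER = 1000

def draw_instanced(total_cubes, shade_cube):
    base_index = 0
    while base_index < total_cubes:
        # Draw the whole buffer, or a leading range of it on the last pass.
        count = min(CUBES_PER_BUFFER, total_cubes - base_index)
        for vertex_id in range(count):       # ids baked into the buffer
            shade_cube(base_index + vertex_id)   # the synthetic instance id
        base_index += CUBES_PER_BUFFER

drawn = []
draw_instanced(2500, drawn.append)
assert drawn == list(range(2500))    # every instance shaded exactly once
```

In the real thing, of course, the inner loop is the vertex shader running over the buffer, and base_index is just a uniform updated between DrawArrays calls.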
Fig 5: Instancing
I spent a day playing with this to see if it would work. The results are mixed so far. I first drew a large collection of cubes -- 75 by 75 by 75 (Figure 5) all in one buffer, to get a time. On my machine this was 21 ms.

If I use instancing to render my cube data, I need to have a buffer for each different shape (grass, rails, etc.) and draw each one with a texture giving locations. I called DrawArrays 75 times, each time with a subsection of my larger buffer. To my surprise, this takes the exact same amount of time as drawing it all in one buffer.

To check this again, I split the buffer into roughly 5000 calls (75*75 = 5625, actually) and drew that. This also took the same time as a single buffer! Then I got suspicious and tried drawing the chunks in reverse order, from high to low indexes. This should be the same amount of work, but it increases the time to 35 ms. Apparently, something in OpenGL notices when I draw consecutive pieces out of the same buffer and just enlarges the initial draw...

After that, I split the data into 75 actual buffers and drew those. This takes 25 ms, faster than the 35 ms reversed-order case, but still slower than the 21 ms for drawing it all from a single buffer. That's very odd, since it's the same amount of work, other than starting at different points in memory.

Then I implemented my work-around for instancing. The shader combines the baseIndex uniform variable with the vertex index field, converts that into a row/column in a texture, and reads the pixel there. The x, y, and z values, times 256, give me my coordinates, and I draw the vertex. That takes 33 ms.
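The index-to-texel math the shader performs can be sketched like this, assuming a texture width of 64 (an arbitrary choice for illustration):

```python
# Sketch: turn an instance index into a row/column in a position
# texture, then decode the x, y, z block offsets (0-31) from 8-bit
# color channels. TEX_WIDTH is an assumed size, not the engine's.

TEX_WIDTH = 64

def texel_address(index):
    return index // TEX_WIDTH, index % TEX_WIDTH   # (row, column)

def decode_offset(r, g, b):
    # An 8-bit channel sampled in GLSL comes back as value/255 in 0..1;
    # scaling by 256 and truncating recovers the original small integer.
    return tuple(int((c / 255.0) * 256.0) for c in (r, g, b))

assert texel_address(1000) == (15, 40)
assert decode_offset(31, 0, 12) == (31, 0, 12)
```

Since the offsets only need values 0-31, the 8 bits of precision per channel are more than enough, which is why no floating point texture is required.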

This is considerably slower than doing it all as one buffer, which implies instancing is going to save a lot of space, but cost a lot of time. I assume the extra time goes to accessing the texture in the vertex shader, but I haven't played with it enough yet to be sure.

Unfortunately, I realized this doesn't solve all my problems. If I rendered all cubes as instances of a single cube, I'd be drawing all six faces, which is more than needed for partially obscured cubes. I could handle this with 2**6 = 64 separate cube buffers, one for each possible arrangement of faces. If that's too many calls, I could add rotation info and cut the number of cases.
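Enumerating those face combinations is just a bitmask over the six faces; a sketch (the face order here is my own, not the engine's):

```python
# Sketch: the 2**6 = 64 possible visible-face combinations, one
# instance buffer per combination. Bit i of the mask says whether
# face i of the cube is drawn.

FACES = ["top", "bottom", "north", "south", "east", "west"]

def visible_faces(mask):
    return [face for i, face in enumerate(FACES) if mask & (1 << i)]

buffers = {mask: visible_faces(mask) for mask in range(64)}
assert len(buffers) == 64
assert buffers[0b000011] == ["top", "bottom"]
assert buffers[0b111111] == FACES
```

Rotational symmetry could then fold many of these masks into one buffer plus a per-instance rotation, cutting the number of cases.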

Even this doesn't work with transparency though. Transparent cubes have to be drawn in order from back to front, and I can't draw any extra sides without changing the look. So I can't just draw them all as 6-faced cubes, and I can't draw all the ones with just tops, then the ones with just bottoms, etc.

To do transparency correctly, I'd have to make a cube face my instance, and draw them all in order from back to front. The only savings over what I do now would be that instead of 24 vertexes (6 faces times 4 corners) at 8 bytes each (192 bytes), I'd be drawing 6 faces of 4 bytes each (24 bytes). On OpenGL 2.1, this would be a much bigger savings, since there I use 36 bytes per vertex (864 bytes per cube!) and that would become 24. So it's definitely worth doing, if I can afford the time.
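The memory arithmetic works out like this:

```python
# Sketch: per-cube memory for the three representations discussed.

# Current compressed path (OpenGL 3.x): 6 faces * 4 corners,
# 8 bytes per packed vertex.
cube_compressed = 6 * 4 * 8
assert cube_compressed == 192

# OpenGL 2.1 path: the same 24 vertexes at 9 floats (36 bytes) each.
cube_floats = 6 * 4 * 36
assert cube_floats == 864

# Per-face instancing: one 4-byte instance record per face.
cube_face_instanced = 6 * 4
assert cube_face_instanced == 24
```

That's an 8x reduction against the compressed path, and 36x against the uncompressed OpenGL 2.1 path.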

A smart cube shader is what I was trying to write in Part 17, and it did not go well. I used a constant array in the shader code and it was 30 times slower than the simple case. Commenters had told me that uniforms might be faster than constants, but I was so disgusted with the situation that I didn't try it.

This time I coded it up with array uniforms for corners, normals and texture coordinates. To my surprise, it is actually a bit faster than my instanced cubes -- 30 ms vs. 33 ms. This can only be due to the smaller vertex. In the "face instance" case, the vertex is just the instance ID and a face number (0-5). In the instanced cube case, the vertex is the full 9 floats.

I haven't tried this on any slow hardware or ATI vs. Nvidia, and I haven't turned it into full rendering code for my landscapes to get a final time. I haven't even coded a 2.1 shader for this, although I don't think I'm using any operators that aren't supported there.

If this works, I can at least do all the non-cube shapes with instancing, even under OpenGL 2.1. That will reduce the memory use dramatically without making things too complicated. Whether I can use it for all my cubes without a performance penalty remains to be seen.

That's all for this week.

