Foreword
It so happened that I started working on programming quite recently - the first C ++ book was bought when I went to grade 11, in September 2011, and the first working “hello_world.exe” was compiled only in November. Before that, I successfully spent 6-7 years on shooters, and on the economy, the olympiads on which provided me with an entrance to an economic university (which I already regret) and familiarity with a group of economists who in parallel mastered web-programming and who pushed me to learn programming .
I decided to master C ++ by writing simple toys, while studying in parallel various aspects of programming that could be needed to write them.
The first and second were a peculiar clone of the game Pong. They differed only by the presence of color and timer in the second version.
The third was planned as a mixture of platformer and shooter, but one unforeseen circumstance happened, because of which the writing of the third game has been delayed for almost a year, and is still ongoing.
Introduction to voxel graphics
While searching for collision calculation algorithms on the GameDev website, I came across a small article about the
idTech 6 engine and became interested in voxel graphics, which are contrasted with polygonal graphics, on which almost all computer graphics are now based.
In general, a voxel stands for “volumetric pixel”, but now a voxel basically means a certain primitive, most often a cube or a rectangular parallelepiped, which has a certain size and color. In idTech 6 and in the Ken Silverman
Voxlap engine, they are stored in a sparse octree tree (SVO - sparse voxel octree), which saves memory and makes possible a simple implementation of the “level of detail”.
In other words, voxels are such a LEGO constructor, where from the same parts of different sizes (as well as connecting small parts into large ones), you can implement a wide variety of models, shapes, etc.
One of the biggest advantages of voxels relative to polygons is that they can be destroyed - the program knows for sure that if a voxel is partially destroyed, then it can be divided into smaller voxels, which will be from the same material as their parent.
Destruction in the test game on the engine Voxlap First attempts
I did not dare to storm the three-dimensional space at once - the whole list of formulas that I would have to learn (they will be discussed below) looked too scary, and decided to simply rewrite the code so that the platformer would use the “boxes” - the squares octree and sparse quadtree.
The first code was terrible - all the functions of working with a tree, such as deleting, traversing the tree, creating nodes, were iterative functions separate from the QuadTree class. In other matters, even adding them to the class limits did not play a special role, since it later turned out that recursive functions greatly benefit in this case. The only thing that brought these useful first attempts was a clear statement of what functions I needed, as well as the basics of implementing C ++ trees, which later would help a lot and, I hope, will still help. And, of course, it was precisely those first attempts that pushed to study OpenGL (although Direct2D had been seduced by this, but very quickly became disillusioned with it).
')
Transition to volume
I started to learn OpenGL on
NeHe gamedev and, as it were, very quickly got into three-dimensional space and began to plan the engine for a Quake-like game. The quadtree was rewritten in octree and the first difficulties began. Octree trees consume much more memory, and even though all the basic functions became recursive, they still spent too much time and memory. To solve this problem, the following methods were implemented:
Memory optimization
In the octree tree, it is very often necessary to use the new / delete operators, which allocate a place in the dynamic memory (heap) for the pointer. Dynamic memory is slower than static (stack), and the new / delete functions themselves were too slow for me. This is why the own MemoryPool class and mem_pool_tree template were written.
mem_pool_tree was written under the impression of the BST-tree, which I met from the book by T. Korman “Algorithms. Construction and analysis ", and does not work directly with memory, but only operates on numbers, which are subsequently used to offset the pointer from the beginning of the array in the static memory region. It was not possible to predict deletions, but to allocate “correct” pieces of memory was real, which is why I took the idea and turns from the BST tree and added “blockiness” - mem_pool_tree stores nodes in which two variables store the beginning and end of the block, and two more variables - the beginning and the end of the occupied space. If an attempt is made to delete a piece in the middle of the occupied space, then the node is divided; if the function of selecting a piece is called, the algorithm searches for such a block, where the allocation of space will allow it to merge with the neighboring block. And the balancing function is called periodically.
Multithreading
Because of the structure of the tree, in which the parent node has a pointer to an array of eight child nodes, functions that require a complete traversal of the tree (such as deleting the entire tree, removing redundant elements, calculating average voxels, etc.) were written with the ability to enable multithreading. Multithreading has been implemented using OpenMP. For example, you need to optimize the tree (for example, why store eight child nodes, if you can transfer their color to the parent node, and remove them). We realize:
class Octree { class Node { Node * son; void Optimize(); }; void Optimize() { #pragma omp parallel { #pragma omp for for(int n = 0; n<8; ++n) { son[n].Optimize(); } } } };
Since the child nodes are not connected to each other in any way, such an operation does not require mutexes, which is very good in conditions when minimal memory costs are required.
Loading and saving voxels
It took a long time to search for the optimal method of storing voxels in a file - after all, in conditions where RAM is valuable, storing extra voxels in RAM is an unaffordable luxury. After a long search, the choice focused on SQLite3, in which there is caching, as well as the ability to load only those voxels that are required based on the “level of detail” values. The fastest work with SQLite3 databases turned out when embedding the source code sqlite3 into the project and self-compiling (I don’t remember the exact numbers, but something like half a million variables for 200-250 ms, and on a netbook with Intel Atom). Naturally, SQLite3 was used to speed up “Begin transaction ;,” “Commit transaction;”, “PRAGMA journal_mode = MEMORY;”, “PRAGMA synchronous = OFF;”, etc.
Screenshots
Actually, here I will show some small screenshots, as the code descriptions go further, which is at the implementation stage. The objects on the screenshots are, of course, very simple, but the only reason for this is that my hands do not reach everyone to draw a normal complex model, or to convert existing ones. Moreover, these are the very first screenshots, and for rasterization a tiny code was written using GDI, rather than OpenGL, and the ray tracing performed the most usual cycle in which calculations of the rotation matrix and other calculations were performed on the CPU.
Current tasks and conclusion
Polymorphism
Oktotreevo is once again being rewritten using polymorphism. The main task is to make the tree not a pure octree tree, but a crossing with a
kd-tree (a tree in which there is not a voxel split into 8 small voxels, but a split into two voxels with a certain proportion and along a certain axis), and other modifications.
Raycasting
Oktreewood allows
Ray Casting , the algorithm of "throwing rays", with which I now write a rasterizer. Also, the implementation of the algorithm uses OpenGL (generating texture from an array and displaying it on the polygon), “group tracing” and
C ++ AMP . In general, this topic is well disclosed on
ray-tracing.ru .
Conclusion
In general, the topic is interesting, and you can find a lot of interesting things on it. For example: an article on Habré about the
Atomontage engine and the
presentation of SVO technology with SIGGRAPH 2012.
UPD
The class of memory allocation written by me with the use of an array in static memory, after measurements, produced the following data:
- deleting 20 million objects of 100 bytes each (in total, all this took more than 220 MB) took 2-3 seconds (of course, the objects were deleted in random order)
- measurements of memory allocation spent a long time, but the latest results were like 50 million objects of 8 bytes in 4-5 seconds
- reading is faster, because stack memory is faster than heap memory, but for now I'm testing on different hardware to figure out the average speed difference
- when I tried to allocate memory with the usual operator new from dynamic memory for 50 million objects, the program threw me exceptions to std :: bad_alloc, and when I wrote try-catch to process them, the program loaded the memory so much that almost all running applications on the computer crashed .
Of course, this class is unlikely to be interesting for any serious projects, but in principle, would it be interesting for someone to see the post with a detailed description and implementation of this class?