
Cascaded voxel cone tracing in The Tomorrow Children


What: cascaded voxel cone tracing


For The Tomorrow Children, we implemented an innovative lighting system based on voxel cone tracing. Instead of a traditional direct or deferred lighting system, we built one in which everything in the world is lit by tracing cones through voxels.

Both direct and bounced lighting are handled this way, which lets us compute three bounces of global illumination in semi-dynamic scenes on the PlayStation 4. We trace cones in 16 fixed directions through six cascaded 3D textures, and apply occlusion using screen-space directional occlusion and spherical occluders for dynamic objects to obtain the final result. The engine also supports a spherical-harmonics-based lighting model, which lets us light particles and implement special effects such as approximated subsurface scattering and refractive materials.

[image]

Who: James McLaren, Director of Engine Technology at Q-Games


I started programming at the age of 10, when I was given a good old ZX Spectrum for my birthday, and I've never regretted it since. As a teenager I moved on to the 8086 PC and the Commodore Amiga, and then went to the University of Manchester to study Computer Science.
After university, I spent several years at Virtek/GSI working on flight simulators for the PC (F16 Aggressor and Wings of Destiny), then moved on to racing games (F1 World Grand Prix 1 and 2 for the Dreamcast) at the Manchester office of the Japanese company Video System. That job gave me the chance to fly to Kyoto several times; I fell in love with the city and eventually moved there, joining Q-Games in early 2002.

At Q-Games I worked on Star Fox Command for the DS, whose base engine was later used in the PixelJunk game series. I was also lucky enough to work directly with Sony on graphics and music visualizers for the PS3's OS. In 2008 I moved to Canada for three years to work at Slant Six Games on Resident Evil: Operation Raccoon City, and in 2012 I returned to Q-Games to join the development of The Tomorrow Children.

Why: a fully dynamic world


Early in the concept phase of The Tomorrow Children, we already knew we wanted a fully dynamic world that players could modify and reshape. Our artists began rendering concept images with the GPU-based Octane renderer. They lit objects with very soft gradient sky lighting and were delighted by the beautiful bounced light. So we started asking ourselves how we could achieve that global illumination look in real time, without any baking.

[image: An early concept render made with Octane, demonstrating the style the artists were after.]

At first we tried many different approaches, from VPLs to some crazy attempts at real-time ray tracing. But we realized almost immediately that the most promising direction was the approach Cyril Crassin proposed in his 2011 work on voxel cone tracing with a Sparse Voxel Octree. I was especially drawn to this technique because I liked that it let us filter the scene geometry. We were also encouraged that other developers, such as Epic, were exploring it too; Epic used it in the Unreal Elemental demo (though they unfortunately abandoned the technique later).

What is "cone tracing"?


[image]
Cone tracing is somewhat similar to ray tracing: in both techniques we want to gather many samples of the incident radiance at a point by casting primitives out and intersecting them with the scene.

If we compute enough samples with a good distribution, we can combine them into an estimate of the incident light at the point of interest. We can then push this through the BRDF, which represents the material properties at that point, and compute the outgoing radiance in the view direction. Obviously I'm omitting details here, especially on the ray tracing side, but they aren't too important for the comparison.
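
In equation form, the gather both techniques perform is the usual discrete estimate of the rendering equation (this is textbook notation, not something specific to the article):

$$L_o(p, \omega_o) \;\approx\; \sum_{i=1}^{N} f_r(p, \omega_i, \omega_o)\, L_i(p, \omega_i)\, (n \cdot \omega_i)\, \Delta\omega_i$$

Here $N$ is the number of rays or cones, $f_r$ is the BRDF, $L_i(p, \omega_i)$ is the incident light gathered along direction $\omega_i$, and $\Delta\omega_i$ is the solid angle each sample accounts for; for a cone, that is the cone's footprint on the sphere of directions.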

Intersecting a ray gives us a point; intersecting a cone gives us an area or a volume, depending on how you look at it. The important thing is that it is no longer an infinitesimally small point, and that changes the properties of the lighting estimate. First, since we now evaluate the scene over an area, the scene has to be filtered. Second, because of that filtering we get an average rather than an exact value, so the accuracy of the estimate drops. On the other hand, since we are estimating an average, the noise you would normally get from ray tracing is practically absent.

This property of cone tracing is what caught my attention when I saw Cyril's presentation at SIGGRAPH. Suddenly we had a technique that could produce an acceptable estimate of the lighting at a point from a small number of samples. And because the scene geometry is filtered there is no noise, so the computation can be done quickly.

[image]

The obvious problem is: how do we take a cone sample? The purple surface region in the figure above, which defines the intersection, is not easy to compute directly. So instead we take several volume samples along the cone. Each sample returns an estimate of the amount of light reflected back towards the cone's apex, together with an estimate of how much light is absorbed along that direction. It turns out we can combine these samples using the same simple rules as ray marching through a volume.
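
To make the accumulation concrete, here is a minimal C++ sketch of marching one cone through a pre-filtered voxel volume using those front-to-back rules. Everything here is illustrative: sampleVoxelVolume() stands in for a mipmapped 3D texture fetch whose filter width matches the cone's diameter, and the constants are arbitrary, not the game's.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };
struct Sample { Vec3 radiance; float opacity; };

// Stand-in stub for a fetch from a mipmapped voxel volume; a real version
// would pick the mip level whose texel size matches 'diameter'.
Sample sampleVoxelVolume(Vec3 /*pos*/, float /*diameter*/) {
    return { {0.2f, 0.2f, 0.2f}, 0.05f };
}

struct ConeResult { Vec3 radiance; float transmittance; };

ConeResult traceCone(Vec3 origin, Vec3 dir, float tanHalfAngle, float maxDist) {
    ConeResult r = { {0.0f, 0.0f, 0.0f}, 1.0f };
    float t = 0.1f;                                  // offset to avoid self-hits
    while (t < maxDist && r.transmittance > 0.01f) {
        float diameter = 2.0f * tanHalfAngle * t;    // cone width at distance t
        Vec3 p = { origin.x + dir.x * t,
                   origin.y + dir.y * t,
                   origin.z + dir.z * t };
        Sample s = sampleVoxelVolume(p, diameter);
        // Front-to-back compositing: light bounced toward the apex, dimmed by
        // everything the cone has already passed through.
        r.radiance.x += r.transmittance * s.opacity * s.radiance.x;
        r.radiance.y += r.transmittance * s.opacity * s.radiance.y;
        r.radiance.z += r.transmittance * s.opacity * s.radiance.z;
        r.transmittance *= 1.0f - s.opacity;
        t += std::max(diameter * 0.5f, 0.05f);       // step grows with the cone
    }
    return r;  // leftover transmittance is reused later to blend in sky light
}
```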

Cyril's original work required voxelizing into a sparse voxel octree, so in early 2012, before returning to Q-Games, I started experimenting with octrees on my own in DX11 on a PC. Here are screenshots of a very early (and very simple) first version. (The first shows the voxelization of a simple object; the second shows lighting injected into the voxels with a simple shading scheme.)

[image]

[image]

The project had its share of ups and downs, so I ended up shelving this research for a while to help implement the first prototype. I then ported the code to PS4 development hardware, but ran into very serious frame-rate problems. My initial tests performed poorly, so I decided I needed a new approach. I had always had some doubts about whether voxel octrees make sense on the GPU, so I tried to simplify everything drastically.

Enter the voxel cascade:

[image]

With voxel cascades, we can store voxels in plain 3D textures instead of an octree, keeping several levels of stored voxels (six in our case), where each level covers a volume double the size of the previous one, with correspondingly coarser effective resolution. This gives us good-resolution volume data nearby while still keeping a rough representation of distant objects. This kind of data layout should be familiar to anyone who has implemented clipmapping or light propagation volumes.
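
As an illustration of how simple the addressing becomes compared to an octree, here is a sketch of picking the cascade level for a world-space point. The level count matches the article's six; the extent is a made-up value, not the game's.

```cpp
#include <algorithm>
#include <cmath>

constexpr int   kLevels     = 6;      // number of cascade levels
constexpr float kFinestSize = 16.0f;  // world extent of level 0 (illustrative)

// Pick the finest cascade whose (camera-centered) volume contains the point.
// Each level covers double the extent of the previous one.
int chooseCascade(float px, float py, float pz,
                  float camX, float camY, float camZ) {
    float d = std::max({ std::fabs(px - camX),
                         std::fabs(py - camY),
                         std::fabs(pz - camZ) });
    for (int level = 0; level < kLevels; ++level) {
        float halfExtent = 0.5f * kFinestSize * float(1 << level);
        if (d < halfExtent) return level;
    }
    return kLevels - 1;  // outside everything: clamp to the coarsest level
}
```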

For each voxel, we need to store basic material properties such as albedo, normal, and emission for each of its six directions (+x, -x, +y, -y, +z, -z). We can then inject lighting into the volume for any voxel that lies on a surface, trace cones repeatedly to gather bounced light, and store that information, again for each of the six directions, in another voxel cascade texture.

The six directions matter because they make the voxels anisotropic. Without this, light reflected off one side of a thin wall, for example, could leak through to the other side, which is not what we want.
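
A sketch of what the per-voxel payload might contain; the exact packing in the game is surely different (and split across several 3D textures), so treat the fields as illustrative:

```cpp
// Indices 0..5 correspond to the +x, -x, +y, -y, +z, -z faces.
struct AnisoVoxel {
    float albedo[6][3];    // RGB albedo as seen from each direction
    float emissive[6][3];  // RGB emission per direction
    float normal[3];       // filtered normal
    float opacity[6];      // per-direction coverage, drives absorption
};
```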

Once the lighting has been computed into the voxel textures, we need to get it onto the screen. We do this by tracing cones from each pixel's world-space position and then combining the result with the material properties, which we render into a more traditional two-dimensional G-buffer.

Another subtle difference from the original Cyril method is in the rendering method: all lighting, even direct, is obtained from tracing the cones. This means that we do not need to do anything difficult to insert lighting, we simply trace the cones to obtain direct lighting, and if they then go beyond the limits of the cascade, we accumulate the lighting of the sky, reducing it with all the partial absorption obtained. At the same time, dynamic point light sources are processed using another voxel cascade, in which we use a geometry shader to fill the values ​​of the radiation. You can then sample the data from this cascade and accumulate it when tracing cones.
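
A sketch of the resolve this implies, reusing traceCone() from the earlier sketch: each of the 16 cones contributes its traced radiance plus sky light scaled by whatever transmittance survived the march. skyRadiance() is a hypothetical sky-model lookup (stubbed as a vertical gradient), and the cone angle and range are arbitrary.

```cpp
// Stand-in sky model: a simple vertical gradient.
Vec3 skyRadiance(Vec3 dir) {
    float up = 0.5f + 0.5f * dir.y;
    return { 0.4f * up, 0.5f * up, 0.8f * up };
}

Vec3 gatherLighting(Vec3 p, const Vec3 (&dirs)[16]) {
    Vec3 total = {0.0f, 0.0f, 0.0f};
    for (int i = 0; i < 16; ++i) {
        ConeResult r = traceCone(p, dirs[i], 0.35f, 100.0f);
        Vec3 sky = skyRadiance(dirs[i]);
        // Sky light contributes only where the cone wasn't absorbed, which
        // is what gives "direct" lighting for free.
        total.x += r.radiance.x + r.transmittance * sky.x;
        total.y += r.radiance.y + r.transmittance * sky.y;
        total.z += r.radiance.z + r.transmittance * sky.z;
    }
    return { total.x / 16.0f, total.y / 16.0f, total.z / 16.0f };
}
```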

Here are a couple of screenshots of the technique running in the early prototype.

First, our scene without bounced lighting from the cascades, lit only by the sky:

[image]

And now with bounced lighting:

[image]

Even with cascades instead of octrees, cone tracing is still quite slow. Our first tests looked great, but we were nowhere near our performance budget: the cone tracing alone took about 30 milliseconds. To run in real time on the hardware, the technique needed further changes.

The first big change we made was to pick a fixed set of cone trace directions. In Cyril Crassin's original technique, the set of cones sampled at a pixel depends on the surface normal, which means every pixel potentially walks the texture cascades differently from its neighbors. The technique also has a hidden cost: to evaluate lighting in a given pixel/voxel view direction, every sample from the lit voxel cascade has to fetch three of the six faces of the anisotropic voxel and blend them, weighted by the traversal direction. As cones march along they touch many voxels, and each step pays this extra sampling cost in bandwidth and ALU.

If we instead fix the trace directions (we settled on 16 directions), we can do a coherent pass through the voxel cascade and very quickly build radiance voxel cascades that record how our anisotropic voxels look from each of the 16 directions. We then trace cones against those, which is much faster: one texture sample per step instead of three.
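
The difference between the two fetch paths, sketched in code (AnisoVoxel and Vec3 as in the earlier sketches; the squared-component weighting is a common choice in voxel cone tracing implementations, the article doesn't state the exact one):

```cpp
// General path: blend three of the six faces, weighted by the direction.
Vec3 sampleAniso(const AnisoVoxel& v, Vec3 d) {  // d assumed normalized
    float wx = d.x * d.x, wy = d.y * d.y, wz = d.z * d.z;  // sum to 1
    const float* fx = v.albedo[d.x >= 0.0f ? 0 : 1];
    const float* fy = v.albedo[d.y >= 0.0f ? 2 : 3];
    const float* fz = v.albedo[d.z >= 0.0f ? 4 : 5];
    return { wx * fx[0] + wy * fy[0] + wz * fz[0],
             wx * fx[1] + wy * fy[1] + wz * fz[1],
             wx * fx[2] + wy * fy[2] + wz * fz[2] };
}

// Fixed-direction path: the blend above is baked out once per direction into
// a dedicated radiance cascade, so the hot loop does a single fetch.
// Texture3D is a stand-in for a hardware 3D texture.
struct Texture3D { Vec3 sample(Vec3 /*uvw*/) const { return {0, 0, 0}; } };

Vec3 sampleFixedDirection(const Texture3D* cascades /* array of 16 */,
                          int dirIndex, Vec3 uvw) {
    return cascades[dirIndex].sample(uvw);  // one sample instead of three
}
```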

[image]

We made another important observation: as cone tracing proceeds through the cascade levels, the voxels accessed for adjacent pixels/voxels become more and more similar the farther we get from the cone's apex. This is the well-known effect of parallax: when you move your head, nearby objects appear to move while distant objects move much more slowly. With this in mind, I decided to compute another set of texture cascades, one for each of the 16 directions, so that we could periodically fill the far half of a cone, starting from the center of each voxel, with previously computed cone tracing results.

The idea is that the lighting from distant objects changes very little with viewpoint, so the far part of a cone can simply be sampled from this texture and combined with a full cone trace of the near part. With the transition distance tuned right (for our purposes, 1-2 meters is enough), we get a significant speedup for only a slight drop in quality.
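
A sketch of how the near/far split might look at trace time, with the far half of the cone replaced by one fetch from the cached per-direction result. kNearDistance reflects the 1-2 meter figure above; sampleFarConeCache() is hypothetical, and ConeResult/traceCone are as in the earlier sketch.

```cpp
// Stand-in for a fetch of the cached far-cone radiance for this direction.
Vec3 sampleFarConeCache(Vec3 /*pos*/, int /*dirIndex*/) { return {0, 0, 0}; }

Vec3 traceSplitCone(Vec3 origin, Vec3 dir, int dirIndex) {
    const float kNearDistance = 1.5f;  // 1-2 m worked for the game, per the text
    // Expensive part: full march over the first meter or two only.
    ConeResult nearPart = traceCone(origin, dir, 0.35f, kNearDistance);
    // Cheap part: cached far result, valid thanks to parallax, since distant
    // lighting barely changes between neighboring pixels/voxels.
    Vec3 farPart = sampleFarConeCache(origin, dirIndex);
    return { nearPart.radiance.x + nearPart.transmittance * farPart.x,
             nearPart.radiance.y + nearPart.transmittance * farPart.y,
             nearPart.radiance.z + nearPart.transmittance * farPart.z };
}
```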

[image]

Even with these optimizations we still weren't hitting our target, so we also had to cut down the amount of data we compute, both temporally and spatially.

It turns out the human visual system is quite forgiving about delays in updating bounced lighting. Even with bounce light lagging by 0.5-1 seconds, the image still reads as plausible, and it's hard to notice that the result is "wrong". Realizing this, I saw that we don't have to update every cascade level every frame. So instead of updating all six levels, we pick just one to update each frame, with the finely detailed levels updated more often than the coarse ones: the first cascade is updated every second frame, the next every fourth frame, the next every eighth, and so on. The bounced lighting near the player updates as quickly as it needs to, while the coarse results farther away update much less often, saving us a great deal of time.
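
The schedule described above fits in a few lines; this sketch relights at most one cascade per frame (relightCascade() is a hypothetical pass):

```cpp
void relightCascade(int /*level*/) { /* hypothetical relighting pass */ }

// Cascade 0 is relit every 2nd frame, cascade 1 every 4th, cascade 2 every
// 8th, and so on; the offsets interleave so no two levels ever collide.
void updateCascades(unsigned frame) {
    for (int level = 0; level < 6; ++level) {
        unsigned period = 2u << level;       // 2, 4, 8, 16, 32, 64
        if (frame % period == period / 2) {
            relightCascade(level);
            return;                          // at most one cascade per frame
        }
    }
    // Frames with frame % 64 == 0 fall through and relight nothing.
}
```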

Obviously, we can also speed up the final screen-space lighting significantly, at a slight cost in quality, by doing the screen-space cone tracing at reduced resolution. We then upsample, taking the geometry into account and correcting shading errors, to get the final screen-space lighting. We found that for The Tomorrow Children, tracing at 1/16 of the screen resolution (1/4 in each of width and height) gives good results.

[image]

This works well because we apply the trick only to the lighting information. Material properties and normals are still rendered into full-resolution 2D G-buffers and combined with the lighting data per pixel. It's easy to see why this is valid: think of the lighting gathered from the 16 directions as a small environment map around the point. Moving a small distance (from one pixel to the next), as long as there's no sharp depth discontinuity, changes that map very little, so it can simply be reconstructed from the low-resolution textures.
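
In code, the reconstruction can be as simple as a small depth-aware filter over the low-resolution lighting buffer. The 2x2 kernel and the weight falloff here are assumptions, and the three fetch functions are hypothetical stand-ins for buffer reads:

```cpp
#include <cmath>

float fullResDepth(int /*x*/, int /*y*/)   { return 1.0f; }     // stand-in
float coarseDepth(int /*x*/, int /*y*/)    { return 1.0f; }     // stand-in
Vec3  coarseLighting(int /*x*/, int /*y*/) { return {0, 0, 0}; } // stand-in

Vec3 upsampleLighting(int x, int y) {
    int cx = x / 4, cy = y / 4;              // 1/4 res in each dimension
    float centerDepth = fullResDepth(x, y);
    Vec3 sum = {0, 0, 0};
    float wsum = 0.0f;
    for (int dy = 0; dy <= 1; ++dy)
        for (int dx = 0; dx <= 1; ++dx) {
            // Down-weight samples across depth discontinuities; elsewhere
            // the 16-direction lighting varies slowly, so the filter is safe.
            float w = 1.0f / (1e-4f + std::fabs(coarseDepth(cx + dx, cy + dy)
                                                - centerDepth));
            Vec3 l = coarseLighting(cx + dx, cy + dy);
            sum.x += w * l.x; sum.y += w * l.y; sum.z += w * l.z;
            wsum += w;
        }
    return { sum.x / wsum, sum.y / wsum, sum.z / wsum };
}
```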

Putting all of this together, we got the voxel cascades updating every frame in about 3 ms, and the per-pixel diffuse lighting down to about 3 ms as well, which makes the technique practical for a 30 Hz game.

We were then able to build several other effects on top of this system. We developed a form of Screen Space Directional Occlusion that computes a visibility cone for each pixel and uses it to modulate the lighting gathered from the 16 cone directions. We also added something like shadows for our characters (they move too quickly and are too finely detailed to voxelize) by using sphere trees built from the characters' collision volumes to compute directional occlusion.
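
As a sketch of how a per-pixel visibility cone can modulate the 16 gathered directions (the hard in/out test and the single occlusion amount are illustrative simplifications, not the game's exact falloff):

```cpp
float dot3(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 applyVisibilityCone(const Vec3 (&light)[16], const Vec3 (&dirs)[16],
                         Vec3 openAxis, float cosOpenAngle, float aoAmount) {
    Vec3 total = {0, 0, 0};
    for (int i = 0; i < 16; ++i) {
        // Directions inside the unoccluded cone pass through untouched;
        // the rest are damped by the occlusion amount.
        float vis = dot3(dirs[i], openAxis) > cosOpenAngle
                        ? 1.0f : 1.0f - aoAmount;
        total.x += vis * light[i].x;
        total.y += vis * light[i].y;
        total.z += vis * light[i].z;
    }
    return { total.x / 16.0f, total.y / 16.0f, total.z / 16.0f };
}
```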

For particles, we created a simplified four-component version of the lighting cascade textures; it's much lower quality, but still looks good enough for lighting particles. We also managed to use these textures to produce sharp reflections via accelerated ray tracing through distance fields, and to write effects resembling simple subsurface scattering. Describing all of this would take too long here, so if you want the details I recommend my presentation from GDC 2015, where the topic is covered in much more depth.

[image]

Result


As you can see, we ended up with a system that is radically different from traditional engines. It took a lot of work, but I think the results speak for themselves. By choosing our own direction, we achieved a completely unique look for the game, something new that we could never have reached with standard techniques or someone else's technology.

Source: https://habr.com/ru/post/320530/

