How are the graphics engines of world-famous games built? What technologies do developers at the largest gaming companies use? Is it really necessary to use the most advanced 3D graphics technologies to make beautiful game graphics? We will try to answer these questions by examining the renderer of Diablo III from Blizzard Entertainment.
I have worked in game development for a long time, and my hobby is reverse engineering the graphics engines of popular games. When the long-awaited sequel to the Diablo series came out, I immediately wanted to know what technologies the developers had used in their creation.
The game's renderer is based on Direct3D 9. This covers a wider range of video card hardware, and the advanced features offered by D3D 10 and 11 are often either not needed at all or can be reproduced by other means in the ninth version.
Shadows
For all static level geometry, pre-calculated lightmaps are used. Yes, the good old way that has been in use since the days when 3D accelerators first began to support multitexturing.

The lightmap is calculated in advance, either in a 3D modeling package (3ds Max, Maya) or by the engine's own raytracer in the level editor. One such texture is shared by several game objects, or by parts of a single object if it is large (terrain, for example).
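
A minimal sketch of how a lightmap is typically combined with the base texture in a pixel shader (the sampler names here are mine, for illustration):

    // Pixel shader sketch: modulate the diffuse texture by the precomputed lightmap.
    sampler2D diffuseMap : register(s0);
    sampler2D lightMap   : register(s1);

    float4 ps_main(float2 uvDiffuse  : TEXCOORD0,
                   float2 uvLightmap : TEXCOORD1) : COLOR0
    {
        float4 albedo = tex2D(diffuseMap, uvDiffuse);
        float3 light  = tex2D(lightMap, uvLightmap).rgb;  // precomputed static lighting
        return float4(albedo.rgb * light, albedo.a);
    }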
For dynamic objects (monsters, character models), dynamic shadows are made with the shadow map technique (stencil shadows are practically never used nowadays). Here the developers decided to depart from the classical canons and did not use hardware shadows (textures that can serve as depth buffers and support hardware Percentage Closer Filtering, PCF), which all popular video card manufacturers offer. Instead, Variance Shadow Maps (VSM) were used. VSM produces soft shadow edges by simply blurring the shadow map (for classic shadow maps this method is not applicable, since averaging a pixel's depth values makes no sense). I will not describe the details of VSM (see the useful links at the end of the article); I will only say that it requires storing two values: the pixel depth and the squared pixel depth. It is the second value that imposes quite stringent precision requirements, so the A32B32G32R32F float format was chosen for the texture. Its size at the maximum shadow quality settings is 2048x2048.
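A minimal HLSL sketch of the VSM shadow pass, assuming the light-space depth is passed in from the vertex shader (names are mine; the game's actual shaders may differ):

    // VSM shadow pass: store depth and squared depth (the two "moments").
    float4 ps_shadow(float depth : TEXCOORD0) : COLOR0
    {
        return float4(depth, depth * depth, 0.0, 0.0);
    }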

The process of creating a shadow map is standard. All shadow-casting objects (occluders) are drawn into the shadow map from the position of the light source. The shadow map is then blurred, first horizontally and then vertically. When rendering objects that should receive a shadow (receivers), the shadow map is sampled, the degree of illumination of the pixel is determined, and the final color is darkened accordingly. The shadow map must be sampled with bilinear filtering. Hardware filtering of the A32B32G32R32F format is not supported across the entire line of shader model 3.0 capable video cards, so it is implemented manually in the shader (my video card does support it, but this was not taken into account).
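A sketch of the receiving side: the bilinear filtering done manually in the shader, plus the standard VSM (Chebyshev) lighting estimate. shadowMapSize and the variance floor are illustrative values of my own:

    sampler2D shadowMap : register(s2);   // A32B32G32R32F, point sampled
    float2 shadowMapSize;                 // e.g. float2(2048, 2048)

    // Manual bilinear fetch of the two VSM moments (depth, depth^2).
    float2 SampleMomentsBilinear(float2 uv)
    {
        float2 pos   = uv * shadowMapSize - 0.5;
        float2 f     = frac(pos);
        float2 texel = 1.0 / shadowMapSize;
        float2 base  = (floor(pos) + 0.5) * texel;
        float2 m00 = tex2D(shadowMap, base).xy;
        float2 m10 = tex2D(shadowMap, base + float2(texel.x, 0)).xy;
        float2 m01 = tex2D(shadowMap, base + float2(0, texel.y)).xy;
        float2 m11 = tex2D(shadowMap, base + texel).xy;
        return lerp(lerp(m00, m10, f.x), lerp(m01, m11, f.x), f.y);
    }

    // Chebyshev upper bound: an estimate of how lit the receiver is.
    float ShadowVSM(float2 uv, float receiverDepth)
    {
        float2 moments = SampleMomentsBilinear(uv);
        if (receiverDepth <= moments.x)
            return 1.0;                                    // fully lit
        float variance = max(moments.y - moments.x * moments.x, 1e-4);
        float d = receiverDepth - moments.x;
        return variance / (variance + d * d);
    }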
Shadows are rendered with an orthographic projection for a directional source (the sun), or with a perspective projection for conical (spot) sources. Perspective shadow map distortion techniques (Perspective Shadow Maps, Trapezoidal Shadow Maps, etc.) are neither needed nor used for the camera setup in the game (looking from top to bottom, at a slight angle to the direction of the main light source). Splitting the shadow map into cascades (Cascaded Shadow Maps, Parallel Split Shadow Maps) is not implemented for the same reasons.
Shadow map at 512x512 resolution, with and without smoothing:

The terrain patch shader in the version with smoothing consists of 12 texture and 59 arithmetic instructions; without smoothing, 10 and 29 respectively. The difference in arithmetic instructions comes from the manual bilinear filtering and VSM.
Dynamic lighting
Surprisingly, all dynamic lighting is per-vertex, like in the good old days. There are no normal maps in the game. A bold decision, but judging by the final result, it clearly paid off: there is no lack of detail in the geometry. The vertex shader handles one point light source with quadratic attenuation (as in the classic FFP formulas), one cylindrical source (used as the character backlight, to illuminate the hero's immediate surroundings), and up to 16 point sources with the simplest linear attenuation by distance.
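
A sketch of one per-vertex point light with quadratic attenuation, in the spirit of the fixed-function formulas (all names are mine):

    // Vertex shader fragment: one point light, FFP-style attenuation.
    float3 lightPos;       // world-space light position
    float3 lightColor;
    float3 attenuation;    // (constant, linear, quadratic) coefficients

    float3 PointLightVertex(float3 worldPos, float3 worldNormal)
    {
        float3 toLight = lightPos - worldPos;
        float  dist    = length(toLight);
        float  ndotl   = max(dot(worldNormal, toLight / dist), 0.0);
        float  att     = 1.0 / (attenuation.x +
                                attenuation.y * dist +
                                attenuation.z * dist * dist);
        return lightColor * ndotl * att;
    }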
The game also implements volumetric light sources. They are done as follows. A sphere, or some other convex shape, is drawn at the position of the light source. In the vertex shader, the vertex alpha is computed from the normal and the camera direction vector: the greater the angle between them, the greater the transparency. The result is a translucent sphere whose transparency increases from the center to the edges. Since this sphere intersects the level geometry, a visual artifact would appear where the level objects and the sphere intersect. This shortcoming is corrected by exactly the same method used for so-called soft particles: a sample is taken from the depth buffer and compared with the depth of the pixel being drawn. If the values are close, the alpha is modified (reduced toward zero), making the intersection with the geometry invisible.
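
A sketch of both tricks: the view-angle alpha in the vertex shader and the soft-particle-style depth fade in the pixel shader (the depth texture and scale factor are my assumptions):

    // Vertex shader: alpha fades as the surface normal turns away from the
    // camera, making the sphere transparent toward its silhouette edges.
    float ViewFadeAlpha(float3 worldNormal, float3 toCamera)
    {
        return saturate(dot(normalize(worldNormal), normalize(toCamera)));
    }

    // Pixel shader: soften the intersection with level geometry,
    // exactly as in "soft particles".
    sampler2D sceneDepth : register(s3);  // linear depth of the opaque scene
    float softScale;                      // how quickly alpha fades near geometry

    float SoftIntersection(float2 screenUV, float pixelDepth)
    {
        float scene = tex2D(sceneDepth, screenUV).r;
        return saturate((scene - pixelDepth) * softScale);
    }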
Special effects
Among the interesting effects, projective texturing stands out. To overlay the textures of various in-game spells onto the ground (for example, the barbarian's shouts, pools of poison, the fiery trails of monsters, etc.), all these effects are first rendered into a separate texture:

Then all the geometry that should receive the projection is rendered again, using the accumulated image with the projected graphic effects. The images are blended using the alpha channel.
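
A sketch of projective texturing as described: the world position is transformed by the projector matrix in the vertex shader, and tex2Dproj samples the accumulated effects texture (matrix and sampler names are mine; the matrix is assumed to include the 0.5 texture-space bias):

    float4x4 projTexMatrix;              // world space -> projector texture space
    sampler2D effectsMap : register(s4); // accumulated spell effects

    // Vertex shader: projective texture coordinates for the pixel shader.
    float4 ProjCoords(float3 worldPos)
    {
        return mul(float4(worldPos, 1.0), projTexMatrix);
    }

    // Pixel shader: sample the projected effects and blend by their alpha.
    float4 ApplyProjected(float4 baseColor, float4 projUV)
    {
        float4 fx = tex2Dproj(effectsMap, projUV);
        return float4(lerp(baseColor.rgb, fx.rgb, fx.a), baseColor.a);
    }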
Some effects (post-processing in particular) need information about the depth of the scene at a given point. Standard Direct3D 9 facilities do not allow reading the depth buffer back as a texture. An obvious option would be to render the whole scene again, writing pixel depth into a texture of R32F format. In most cases this is unacceptable, since doubling the drawn geometry would seriously hurt the overall performance of the game. Graphics card manufacturers have long been aware of this problem and offer special texture formats that can be used both as textures in a shader and as a depth buffer for rendering. One of these is the so-called INTZ format, and it is what Diablo III uses. A texture of this type serves as the depth buffer while the scene is rendered, and its values can then be read in any shader that needs depth information. I do not know how rendering works on hardware that does not support INTZ textures (not all shader model 3.0 video cards support this "hack"), as I do not have a video card without such support. Possibly an additional pass is performed, or the depth-dependent effects are implemented differently or simply turned off.
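A sketch of reading scene depth through an INTZ texture. The CPU-side creation (shown in the comment) follows the widely documented FOURCC hack; the linearization constants are assumptions on my part:

    // CPU side (C++, for reference): create a depth texture readable in shaders:
    //   device->CreateTexture(w, h, 1, D3DUSAGE_DEPTHSTENCIL,
    //       (D3DFORMAT)MAKEFOURCC('I','N','T','Z'),
    //       D3DPOOL_DEFAULT, &depthTex, NULL);

    sampler2D depthTex : register(s5);
    float2 projParams;   // (m33, m43) taken from the projection matrix

    // Convert the stored hardware depth to linear view-space depth.
    float LinearDepth(float2 screenUV)
    {
        float z = tex2D(depthTex, screenUV).r;        // hardware depth, 0..1
        return projParams.y / (z - projParams.x);     // standard unprojection
    }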
Highlighting the object under the cursor is implemented by rendering the selected object into a separate texture. The shader used is the simplest one: it writes 1 into the alpha channel of the render target and the highlight color into the RGB channels. The resulting texture is then blurred horizontally and vertically. For correct blending of the effect into the final image, only the object's aura must remain, not its main silhouette. Having the original (unblurred) image, the final overlay shader checks the alpha channel value in that texture. If it is 1 (the object covers this pixel), the output alpha is set to zero; if it is 0 (no object in this pixel), the alpha channel of the blurred texture is used.
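
A sketch of that final overlay shader, with my own texture names:

    sampler2D highlightSharp : register(s6);  // original, unblurred highlight render
    sampler2D highlightBlur  : register(s7);  // blurred version

    float4 ps_outline(float2 uv : TEXCOORD0) : COLOR0
    {
        float4 sharp = tex2D(highlightSharp, uv);
        float4 blur  = tex2D(highlightBlur, uv);
        // Inside the silhouette (alpha == 1) nothing is blended in;
        // outside it the blurred alpha remains, leaving only the aura.
        float aura = (sharp.a >= 1.0) ? 0.0 : blur.a;
        return float4(blur.rgb, aura);
    }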
Post-process effects
The game cannot boast a large number of post-process effects. The entire arsenal consists of bloom, full-screen distortion, and full-screen anti-aliasing done with FXAA. Distortion is implemented according to the classical scheme. Particles that should distort the final image (hot air, for example) are drawn into a special texture; the recorded data are the u and v offsets of the texture coordinates. In a subsequent full-screen pass, this texture is read and the texture coordinates used to sample the main scene image are shifted accordingly.
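A sketch of the full-screen distortion pass, with offsetScale as an assumed tuning parameter:

    sampler2D sceneColor : register(s8);
    sampler2D distortMap : register(s9);  // rg = (u, v) offsets written by particles
    float offsetScale;                    // distortion strength

    float4 ps_distort(float2 uv : TEXCOORD0) : COLOR0
    {
        // Shift the sampling coordinates by the offsets the particles recorded.
        float2 offset = tex2D(distortMap, uv).rg * offsetScale;
        return tex2D(sceneColor, uv + offset);
    }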

Full-screen anti-aliasing is also performed as a post-process effect, for several reasons: with multisampling, the INTZ depth buffer would become unusable (you cannot create a multisampled INTZ depth buffer and then resolve it into a non-multisampled INTZ texture), and the shadow map already takes up a lot of memory (recall that its format is A32B32G32R32F, i.e. 16 bytes per pixel). So the game's full-screen anti-aliasing is done with the Fast Approximate Anti-Aliasing (FXAA) technique.
Geometry and Materials
All the game model vertices are packed into a cache-friendly 32-byte format. The exception is animated models, whose vertices take 48 bytes; the additional data are bone weights and bone indices. The game uses skeletal animation, performed in the vertex shader. For this reason the number of point light sources for animated models is limited to seven, due to the lack of constant registers to hold both the light source parameters and the bone matrices.
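
A sketch of a skinned vertex and shader-side skinning under the assumption of four bones per vertex (the exact layout of the 48 bytes is not known to me). Note that each float4x3 bone matrix costs three of the 256 vs_3_0 constant registers, which is why bones and light sources compete for space:

    #define NUM_BONES 60               // hypothetical bone count

    float4x3 boneMatrices[NUM_BONES];  // held in vertex shader constants

    struct SkinnedVertex
    {
        float3 pos     : POSITION;
        float3 normal  : NORMAL;
        float2 uv      : TEXCOORD0;
        float4 weights : BLENDWEIGHT;    // packed compactly in the vertex buffer,
        float4 indices : BLENDINDICES;   // expanded to float4 by the declaration
    };

    // Blend the position by up to four bones.
    float3 SkinPosition(SkinnedVertex v)
    {
        float3 p = 0;
        for (int i = 0; i < 4; ++i)
            p += mul(float4(v.pos, 1.0),
                     boneMatrices[(int)v.indices[i]]) * v.weights[i];
        return p;
    }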
The total number of draw calls is small, ranging from 300 to 800 DIPs, which is a good figure.
Shaders are built with the uber-shader technique, i.e. multiple variants of a single effect are compiled by iterating over a set of preprocessor defines. For example, an effect can be with or without fog, with or without shadow, with or without a lightmap. Each of these features is controlled by a define such as #define USE_FOG 1. In the shader body, the code block responsible for applying fog is wrapped in #if USE_FOG ... #endif. Thus, by switching USE_FOG between 1 and 0, we get the shader with and without fog. All effects are done in a similar way: the build system automatically enumerates the entire set of define values and compiles a shader for each combination.
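
A sketch of the uber-shader pattern: the same source compiled several times with different define sets (the fxc command line in the comment illustrates one combination; the input layout is my own):

    // One source, many variants. The build enumerates define combinations, e.g.:
    //   fxc /T ps_3_0 /D USE_FOG=1 /D USE_SHADOW=0 uber_pixel.hlsl
    sampler2D diffuseMap : register(s0);
    float3 fogColor;

    float4 ps_uber(float2 uv     : TEXCOORD0,
                   float  fogAmt : TEXCOORD1,
                   float  shadow : TEXCOORD2) : COLOR0
    {
        float4 color = tex2D(diffuseMap, uv);

    #if USE_SHADOW
        color.rgb *= shadow;                            // only compiled in if enabled
    #endif

    #if USE_FOG
        color.rgb = lerp(color.rgb, fogColor, fogAmt);  // fog block, per USE_FOG
    #endif

        return color;
    }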
User interface
The in-game interface is rendered in a fairly standard way. There is no special grouping of elements to reduce rendering calls (DIP calls). The text rendering is worth noting: the preparation of glyphs is very similar to the method used in Scaleform GFx. All unique characters are drawn into a separate texture, and that texture is then used for text rendering. Despite the similarity in text rendering, Scaleform itself is not used.
Afterword
The renderer itself leaves a pleasant impression: a mix of old school and some modern trends. Performance is excellent, and the picture is beautiful (as always with Blizzard games, really). A great part of this beauty is the work of the artists and designers. Diablo III proves once again that very beautiful graphics can be achieved on a not particularly high-tech renderer.
Useful links
- Variance Shadow Maps: www.punkuser.net/vsm
- FXAA: developer.download.nvidia.com/assets/gamedev/files/sdk/11/FXAA_WhitePaper.pdf
- List of known GPU "hacks": aras-p.info/texts/D3D9GPUHacks.html