Hell Visualization 1.1 - Book 2: Problems


Welcome to the second book! Here we will explore some of the problems that may arise during the visualization process. But, for starters, a little practice:

Knowing about a problem is useful, but actually feeling it helps understanding far more. Let's try putting ourselves in the place of the CPU / GPU.

Experiment

Please create 10,000 small files (for example, 1 KB each) and copy them from one hard disk to another. This operation will take a long time, although the data size is only 9.7 MB.

Now create one 9.7 MB file and copy it in the same way. This operation will be performed much faster!
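If you would rather script the experiment than do it by hand, here is a minimal sketch. It makes some assumptions that are not part of the original experiment: the function names are made up for illustration, and it copies inside a single temporary directory instead of between two physical disks, so operating system caching may shrink the difference you see. Point the source and destination at different drives to get closer to the real experiment.

```python
import os
import shutil
import tempfile
import time


def make_files(directory, count, size_bytes):
    """Create `count` files of `size_bytes` random bytes each in `directory`."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        with open(os.path.join(directory, f"file_{i:05d}.bin"), "wb") as f:
            f.write(os.urandom(size_bytes))


def timed_copy(src_dir, dst_dir):
    """Copy every file from src_dir to dst_dir and return the elapsed seconds."""
    os.makedirs(dst_dir, exist_ok=True)
    start = time.perf_counter()
    for name in os.listdir(src_dir):
        shutil.copy(os.path.join(src_dir, name), os.path.join(dst_dir, name))
    return time.perf_counter() - start


def run_experiment(root, count=10_000, size_bytes=1024):
    """Copy `count` small files, then one file holding the same total size."""
    small_src = os.path.join(root, "small_src")
    big_src = os.path.join(root, "big_src")
    make_files(small_src, count, size_bytes)
    make_files(big_src, 1, count * size_bytes)
    t_small = timed_copy(small_src, os.path.join(root, "small_dst"))
    t_big = timed_copy(big_src, os.path.join(root, "big_dst"))
    return t_small, t_big


if __name__ == "__main__":
    # Assumption: a temporary directory on one disk stands in for two disks.
    with tempfile.TemporaryDirectory() as root:
        t_small, t_big = run_experiment(root)
        print(f"10,000 small files: {t_small:.3f} s")
        print(f"one large file:     {t_big:.3f} s")
```

The absolute numbers depend heavily on the file system and hardware, so treat the output as a rough feel for the effect rather than a benchmark.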

Why? After all, the size of the data is the same!

This is true, but each copy operation involves many steps, for example: preparing the file for transfer, allocating memory, moving the disk's read / write heads back and forth ... All of this is overhead that is paid for every single write operation. As you may have just learned the hard way, this overhead is huge if you copy many small files. Rendering many polygonal meshes (that is, executing many commands) is much more complicated, but it feels the same.

Now let's look at the worst case that can occur during the rendering process.

Worst case

Having lots of small polygonal meshes is bad. If they use different material parameters, then everything gets worse. But why?

1. Many polygonal meshes

The GPU can draw faster than the CPU can send commands.

The main reason for reducing Draw Calls is that graphics hardware can transform and render triangles much faster than you can submit them. If you submit only a few triangles with each call, you will be completely bound by CPU performance, and the GPU will be mostly idle. The CPU will not be able to “feed” the GPU fast enough. [ f05 ]
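The quote above can be turned into a toy model: the CPU pays a fixed cost per Draw Call, the GPU pays a cost per triangle, and since both work in parallel the slower of the two determines the frame time. The sketch below is only an illustration; the function name and both default rates are made-up values, not measurements of any real hardware.

```python
def frame_time_ms(draw_calls, triangles_per_call,
                  cpu_ms_per_call=0.01, gpu_triangles_per_ms=1_000_000):
    """Estimate the frame time when the CPU and GPU work in parallel.

    Both default rates are hypothetical illustration values.
    Returns (frame time in ms, which processor is the bottleneck).
    """
    # CPU cost grows with the number of calls, not with their size.
    cpu_ms = draw_calls * cpu_ms_per_call
    # GPU cost grows with the total number of triangles.
    gpu_ms = draw_calls * triangles_per_call / gpu_triangles_per_ms
    bound = "CPU" if cpu_ms > gpu_ms else "GPU"
    return max(cpu_ms, gpu_ms), bound


if __name__ == "__main__":
    for calls, tris in [(10_000, 2), (10_000, 200), (10, 200_000)]:
        ms, bound = frame_time_ms(calls, tris)
        print(f"{calls:>6} calls x {tris:>7} triangles: {ms:7.2f} ms ({bound}-bound)")
```

With these assumed rates, 10,000 calls take the same time whether each carries 2 or 200 triangles, because the CPU is the bottleneck in both cases. Only when the same work is sent in a handful of large calls does the GPU become the limiting factor.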

On top of that, each Draw Call incurs some overhead (as mentioned above):

There are driver-level overheads whenever you make an API call, and the best way to reduce them is to call the API as little as possible. [ a02 ]

2. Many Draw Calls

One example of this overhead is the command buffer. Do you remember that the CPU fills the command buffer and the GPU reads from it? They have to tell each other about changes (the read / write pointers are updated), and this communication also creates overhead! For this reason, it may be better not to send commands one by one, but to fill the buffer first and hand a whole block of commands over to the GPU. This increases the risk that the GPU will have to wait until the CPU finishes building a block of commands, but it reduces the cost of communication.

The GPU (fortunately) has plenty to do while the CPU is building a new command buffer (for example, processing the previous block). Modern CPUs can fill several command buffers at once, independently of each other, and then submit them to the GPU one after another.
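To make the trade-off concrete, here is a small sketch that counts how often the CPU has to notify the GPU when commands are submitted in blocks. It is a deliberately simplified model with hypothetical function names: every submitted block is assumed to cost exactly one notification, no matter how many commands it holds. Real drivers and APIs are considerably more involved.

```python
def count_handoffs(num_commands, block_size):
    """Count CPU-to-GPU notifications when commands are submitted in blocks.

    Simplified model: each block costs one notification (one pointer
    update), regardless of how many commands it contains.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    full_blocks, remainder = divmod(num_commands, block_size)
    return full_blocks + (1 if remainder else 0)


def submit_in_blocks(commands, block_size):
    """Yield the commands as lists of at most `block_size` commands each."""
    block = []
    for command in commands:
        block.append(command)
        if len(block) == block_size:
            yield block
            block = []
    if block:  # hand over the last, partially filled block
        yield block


if __name__ == "__main__":
    for size in (1, 16, 64):
        print(f"block size {size:>2}: {count_handoffs(1000, size)} handoffs")
```

In this model, 1,000 commands sent one by one need 1,000 notifications, while blocks of 64 need only 16. The price is latency: the GPU cannot start on a block until the CPU has finished filling it.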

This is only one example. In the real world, the CPU, the GPU and the command buffers are not the only ones talking to each other: the API (DirectX, OpenGL), the drivers and many other elements are involved in this process, which does not make it any easier.

So far we have only discussed the case of many polygonal meshes that use the same material (Render State). But what happens when we want to render objects with different materials?

3. Many polygonal meshes and materials

Flushing the pipeline.

When changing state, it is sometimes necessary to partially or completely flush the pipeline. For this reason, changing the shader or material parameters can be very expensive [...] [ b01 ]

You thought it couldn't get any worse? Well ... if you use different materials for different polygonal meshes, you cannot group the render commands. You set the Render State for the first mesh, issue the command to render it, then set a new Render State, send the next render command, and so on.

I colored the “Change State” command red because a) it is expensive and b) it improves readability.

Setting Render State values sometimes (not always, it depends on the parameters you want to change) causes the entire pipeline to be flushed. This means that every polygonal mesh that is currently being processed (with the current Render State) must be finished before the next one (with the new Render State) can be started. It looks like the video above.
Instead of processing a huge number of vertices at once (for example, by combining several meshes with the same Render State; I will explain this optimization later), only a small number is rendered before the next Render State change, which is obviously bad.
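The difference between the two cases can be seen by writing out the command list. The sketch below is a hypothetical illustration: the command names "set_state" and "draw", the mesh names and the material names are all invented, and a state change is assumed to be needed only when the material differs from the one currently set.

```python
def build_command_list(draws):
    """Turn (mesh, material) pairs into a flat list of commands.

    A "set_state" command is emitted only when the material differs
    from the one that is currently set.
    """
    commands = []
    current = None
    for mesh, material in draws:
        if material != current:
            commands.append(("set_state", material))
            current = material
        commands.append(("draw", mesh))
    return commands


def count_state_changes(commands):
    """Count the expensive "set_state" commands in a command list."""
    return sum(1 for kind, _ in commands if kind == "set_state")


if __name__ == "__main__":
    same = [("rock1", "stone"), ("rock2", "stone"),
            ("rock3", "stone"), ("rock4", "stone")]
    different = [("rock", "stone"), ("tree", "bark"),
                 ("lake", "water"), ("car", "metal")]
    print(count_state_changes(build_command_list(same)))
    print(count_state_changes(build_command_list(different)))
```

Four meshes sharing one material need a single state change, while four meshes with four different materials need four, one in front of every Draw Call.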

By the way: since the CPU needs some minimum amount of time to set up a Draw Call (regardless of the size of the polygonal mesh), we can assume that it makes no difference whether 2 or 200 triangles are rendered. The GPU is pretty darn fast, and by the time the CPU has prepared the next Draw Call, the triangles will already be pixels on the screen. Of course, this “rule” changes when we talk about combining several small polygonal meshes into one big one (we will look at that later).

I did not manage to find up-to-date figures on how many polygons can be rendered “for free” on modern graphics cards. If you know anything about this or have taken any measurements recently, please let me know!

4. Polygonal meshes and multi materials

What if a polygonal mesh has not one material assigned to it, but two or more? Basically, the mesh is split into several pieces, which are then "fed" to the command buffer one at a time.

Of course, this means an additional Draw Call for each piece of the mesh.
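As a rough sketch of that splitting step, the snippet below groups the triangles of one mesh by their material. The data layout (a list of triangle index and material ID pairs) and the function name are assumptions made for illustration; real engines store this information differently.

```python
from collections import defaultdict


def split_by_material(triangles):
    """Group the triangles of one mesh by material ID.

    `triangles` is a list of (triangle_index, material_id) pairs.
    Returns a dict mapping each material ID to its triangle indices;
    every entry stands for one piece, and therefore one Draw Call.
    """
    pieces = defaultdict(list)
    for triangle_index, material_id in triangles:
        pieces[material_id].append(triangle_index)
    return dict(pieces)


if __name__ == "__main__":
    # Hypothetical mesh: five triangles using two materials.
    mesh = [(0, "metal"), (1, "metal"), (2, "glass"), (3, "metal"), (4, "glass")]
    pieces = split_by_material(mesh)
    print(f"{len(pieces)} pieces, so {len(pieces)} Draw Calls for one mesh")
```

A mesh with two materials therefore costs two Draw Calls, and each piece needs its own Render State before it can be rendered.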

I hope I managed to give you a quick idea of why a large number of polygonal meshes and materials is bad. All of this may look terrible, but there are wonderful games out there which prove that the problems described above can somehow be overcome. In the next book, we will look at some of the solutions.

The End

[a02] GPU Programming Guide GeForce 8 and 9 Series
[b01] Real-Time Rendering, pp. 711-712
[f05] Why are draw calls expensive

Source: https://habr.com/ru/post/245823/
