The rendering speed measured while rotating the assembly is given below. Recall that 24 frames per second (fps) is considered comfortable for the human eye.
Alexander Tulup, programmer:
“The main performance problem in displaying large scenes is the large number of so-called draw calls. The old rendering code was built on top of the mathematical data model, so a separate display method was called for each primitive: points, edges, faces.
For each draw call, OpenGL (the driver) performs a series of checks and translates the incoming commands into a format the video card understands; the calls are then added to a queue and sent from there for execution.
GPU command transfer scheme in OpenGL (source)
With a large number of parts, the number of calls on the CPU side grows so much that the data simply cannot reach the video card in time. As a result, a very powerful video card “slows down” just as much as a mid-range or weak one.
You can fight this by reducing the number of draw calls (state changes): group by material, merge common geometry (instancing), and so on.
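To illustrate the grouping idea, here is a minimal Python sketch, not the KOMPAS-3D code. The draw list, the material names and both helper functions are hypothetical. It only shows that sorting draws by material turns many material switches into one switch per material.

```python
from itertools import groupby


def count_state_changes(draws):
    """Count how often the material has to be switched for a given draw order."""
    changes = 0
    current = None
    for material, _mesh in draws:
        if material != current:
            changes += 1
            current = material
    return changes


def batch_by_material(draws):
    """Sort draws by material and merge each run into one (material, meshes) batch."""
    ordered = sorted(draws, key=lambda d: d[0])  # stable sort keeps mesh order
    return [
        (material, [mesh for _, mesh in group])
        for material, group in groupby(ordered, key=lambda d: d[0])
    ]


# Hypothetical scene: (material, mesh) pairs in the order the model stores them.
draws = [
    ("steel", "bolt"),
    ("rubber", "seal"),
    ("steel", "nut"),
    ("rubber", "gasket"),
    ("steel", "washer"),
]
batches = batch_by_material(draws)
print(count_state_changes(draws), "material switches before batching")
print(len(batches), "batches after batching")
```

In a real renderer each batch could then be submitted with a single instanced or multi-draw call, but that part depends on the graphics API and is left out here.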
Nor should we forget that we see only part of the whole scene. Algorithms for detecting invisible objects apply here (frustum culling, occlusion culling, etc.).
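Frustum culling, for example, can be sketched in a few lines of Python. This is a generic textbook test on bounding spheres, not the algorithm used in KOMPAS-3D. The plane representation (inward-pointing unit normal plus offset) and the box-shaped "frustum" in the example are assumptions chosen for brevity.

```python
def sphere_in_frustum(center, radius, planes):
    """Conservative visibility test for a bounding sphere.

    Each plane is (nx, ny, nz, d) with a unit normal pointing into the
    frustum, so a point p is inside when n.p + d >= 0. The sphere is
    rejected only if it lies entirely behind at least one plane.
    """
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        signed_distance = nx * cx + ny * cy + nz * cz + d
        if signed_distance < -radius:
            return False
    return True


# Assumed view volume: the cube [-1, 1]^3 written as six inward-facing planes.
planes = [
    (1, 0, 0, 1), (-1, 0, 0, 1),
    (0, 1, 0, 1), (0, -1, 0, 1),
    (0, 0, 1, 1), (0, 0, -1, 1),
]

print(sphere_in_frustum((0, 0, 0), 0.5, planes))    # fully inside
print(sphere_in_frustum((1.5, 0, 0), 1.0, planes))  # straddles a plane
print(sphere_in_frustum((5, 0, 0), 1.0, planes))    # fully outside
```

The test is conservative: a sphere near a frustum corner may be reported as visible even though it is not, which costs a little drawing time but never hides a visible object.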
Inspired by The Road to One Million Draws and AZDO, we decided to take a rather unusual path: to get rid of state changes on the CPU side as far as possible. Now almost everything is done on the video card. All the necessary attributes are fetched directly from video memory by the shader while drawing, which became possible thanks to the growth in the amount of video memory (VRAM) and the advent of SSBO.
1,000,000 cubes
The advantage of this approach is that display speed has become really high. It is limited only by the capabilities of the GPU, namely the amount of data it can process.
It also made it possible to implement culling of invisible objects efficiently. The results of the visibility check are written directly to video memory, and the drawing commands are generated there from those results, so there is no need to wait on the CPU side.
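To give an idea of what "drawing commands in video memory" look like, the Python sketch below packs visibility results into records with the layout of OpenGL's DrawElementsIndirectCommand (five 32-bit values), which is what an indirect draw call such as glMultiDrawElementsIndirect reads from a buffer. It runs on the CPU purely to show the data layout. In the approach described above this step happens on the GPU, and the text does not say which exact calls are used. The mesh dictionaries, the function name, and the use of baseInstance as an object ID are assumptions.

```python
import struct

# count, instanceCount, firstIndex, baseVertex (signed), baseInstance
CMD = struct.Struct("<IIIiI")


def build_indirect_buffer(meshes, visible):
    """Emit one draw command per object; culled objects get instanceCount = 0."""
    buf = bytearray()
    for object_id, (mesh, is_visible) in enumerate(zip(meshes, visible)):
        buf += CMD.pack(
            mesh["index_count"],
            1 if is_visible else 0,  # zero instances means nothing is drawn
            mesh["first_index"],
            mesh["base_vertex"],
            object_id,               # assumed: lets a shader look up per-object data
        )
    return bytes(buf)


# Hypothetical scene with two meshes sharing one index/vertex buffer.
meshes = [
    {"index_count": 36, "first_index": 0, "base_vertex": 0},
    {"index_count": 36, "first_index": 36, "base_vertex": 24},
]
commands = build_indirect_buffer(meshes, visible=[True, False])
print(len(commands), "bytes,", len(commands) // CMD.size, "commands")
```

Because a culled object simply gets an instance count of zero, the list of commands never has to be read back or rebuilt by the CPU.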
One of the main disadvantages of this approach is the high development complexity. Much had to be implemented anew to fit the chosen approach. In addition, we often ran into situations where the same shader code behaved differently, or did not work at all, on video cards from different manufacturers. Often this was “cured” by updating the driver, but sometimes the code had to be rewritten after long debugging.
Naturally, the requirements for the video card have also increased. OpenGL 4.5 support is the key requirement, but not the only one.”
Hereinafter, measurements were performed on a PC with the following configuration:
CPU: Intel Core i7-6700K 4.00 GHz
RAM: 32 GB
GPU: NVIDIA Quadro P2000
OS: Microsoft Windows 10 x64 Professional
Table 1. Frame rate (frames per second, fps) on various models. Higher is better. Display mode: Halftone + Wireframe; simplified mode: off; anti-aliasing: medium (MSAA 8x)
| Model | Number of components | v16.1, fps | v17.1, fps | v18, fps |
| --- | --- | --- | --- | --- |
| Mosaic grinding machine | 2764 | 4.1 | 4.7 | 124.9 |
| PGU-410 | 108337 | 0.3 | 0.4 | 28.6 |
| Wagon tipper | 17342 | 1.1 | 1.4 | 124.7 |
| Trolleybus | 9783 | 1.9 | 2.4 | 124.9 |
| North tidal power station | 48445 | 0.3 | 0.5 | 76.1 |
| Vacuum technology installation | 7189 | 1.9 | 2.3 | 124.9 |
| Ship power plant reducer | 6414 | 2.6 | 3.6 | 123.9 |
| Criterion | Description |
| --- | --- |
| File opening speed | Components to be added to the assembly need to be loaded from disk. |
| Drawing speed | The assembly and the component being inserted must be positioned; to do this, you need to rotate, move, and zoom the image. |
| Object selection speed | To create mates, you need to select base objects: faces, planes, edges, etc. |
| Synchronization speed with the construction tree | The component added to the assembly and its mates must be represented in the construction tree. |
| Synchronization speed with the specification module | The component added to the assembly must be taken into account in the specification. |
The specification is a module of the KOMPAS-3D system, which is responsible for the formation of the design document of the same name. It is developed by a separate team.
In particular, the team accelerated synchronization when inserting by reworking the internal mechanisms of the specification module.
| Component | Mate | Action | v16.1, s | v17.1, s | v18, s |
| --- | --- | --- | --- | --- | --- |
| Insert the Bracket component | | Loading | 2.0 | 3.0 | 2.2 |
| | | Switch to mate mode | 0.6 | 0.4 | 0.4 |
| | First mate | Select the first object | 0.4 | 1.0 | 0.2 |
| | | Select the second object | 0.5 | 1.1 | 0.2 |
| | | Select the desired mate | 3.8 | 3.6 | 1.0 |
| | Second mate | Select the first object | 0.5 | 1.4 | 0.5 |
| | | Select the second object | 0.5 | 1.4 | 0.2 |
| | | Select the desired mate | 3.6 | 3.0 | 1.2 |
| | Third mate | Select the first object | 0.5 | 0.5 | 0.5 |
| | | Select the second object | 0.3 | 1.1 | 0.3 |
| | | Select the desired mate | 3.7 | 3.2 | 1.1 |
| | | Confirm insert creation | 7.8 | 5.2 | 2.3 |
| | | Total for bracket insertion | 24.2 | 24.6 | 10.1 |
| Insert a washer from the standard products library | | Select the first mate | 6.4 | 2.4 | 0.4 |
| | | Select the second mate | 4.2 | 3.1 | 0.4 |
| | | Confirm insert creation | 15.7 | 9.2 | 4.4 |
| | | Total for washer insertion | 26.3 | 14.7 | 5.2 |
| Insert the bolt | | Loading | 2.0 | 2.7 | 2.0 |
| | | Switch to mate mode | 0.5 | 0.5 | 0.5 |
| | First mate | Select the first object | 0.4 | 1.0 | 0.2 |
| | | Select the second object | 0.4 | 1.1 | 0.2 |
| | | Select the desired mate | 3.4 | 2.7 | 1.0 |
| | Second mate | Select the first object | 0.4 | 1.2 | 0.4 |
| | | Select the second object | 0.5 | 0.5 | 0.4 |
| | | Select the desired mate | 3.7 | 2.9 | 1.0 |
| | Third mate | Select the first object | 0.5 | 1.0 | 0.5 |
| | | Select the second object | 0.5 | 1.0 | 0.2 |
| | | Select the desired mate | 4.2 | 3.9 | 1.2 |
| | | Confirm insert creation | 32.5 | 5.4 | 2.2 |
| | | Total for bolt insertion | 49 | 21.2 | 9.8 |
| Total for inserting three components | | | 99.5 | 60.5 | 25.1 |
In addition to speeding up file opening, partial reading also helped reduce resource consumption, primarily RAM.
Anton Sidyakin, programmer, team lead:
“For some time now, a KOMPAS-3D file has been an archive that combines several service files. One of them contains the data of the model or assembly document, organized as a tree structure. The ability to navigate this structure already existed. Partial reading required making the pieces of the file independent of each other: the resulting pieces must not reference each other, otherwise a piece containing a reference would become “incomplete”.
As a result, for part documents we managed to separate the variants from the document and from each other. In assemblies, the container of inserts and mates was split out separately. Inside the variants, we also managed to separate the source data for construction from the results, which are stored as triangulation and bodies.
As for the simplified loading types, the assembly being edited is loaded completely, while for its inserts only the triangulation is loaded and, depending on the type, the boundary representation (B-rep). Displaying inserts with modified external variables in this mode posed some difficulty: they used to be obtained on the fly by rebuilding on read, and the simplified loading types carry no data for that. The solution was to store the results of rebuilding such inserts in the assembly. This also gave a speedup, because no rebuilding is needed.
This division of the document into pieces made it possible to load into the assembly only the variants selected in the inserts.”
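The benefit of independent pieces can be shown with a small Python sketch. It assumes a ZIP container and invented member names. The text above says only that the file is an archive, so this does not describe the real KOMPAS-3D file layout. The point is that a reader can pull out just the triangulation of one insert without touching the much larger construction data.

```python
import io
import zipfile


def make_demo_archive():
    """Build an in-memory archive with hypothetical, independent members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("tree.txt", "assembly\n  insert: bracket\n")
        zf.writestr("bracket/triangulation.bin", b"\x00" * 64)
        zf.writestr("bracket/source.bin", b"\x01" * 4096)
    buf.seek(0)
    return buf


def read_piece(archive, member):
    """Read a single member; the other members are not decompressed."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(member)


triangulation = read_piece(make_demo_archive(), "bracket/triangulation.bin")
print(len(triangulation), "bytes read for display")
```

This works only because the member does not reference anything in `bracket/source.bin`; if it did, the reader would have to load that piece as well, which is exactly the "incomplete" case mentioned in the quote.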
Custom loading types are combinations of the system's component loading methods. This feature is not new, but the improvements made in v18 bring noticeable gains from using it.
| Model | Loading type | Opening time, v16.1, s | Opening time, v17.1, s | Opening time, v18, s |
| --- | --- | --- | --- | --- |
| Vacuum technology installation | Empty | 12.8 | 11.7 | 2.5 |
| | Dimension | 21.2 | 20.8 | 2.6 |
| Ship power plant reducer | Empty | 31.0 | 15.9 | 7.2 |
| | Dimension | 371.5 | 114.8 | 7.3 |
The option is available only for standard projections. For auxiliary images (cuts, sections, detail views), "Draft projection" is not used.
Creation time of three standard projections, s

| Model | v16.1 | v17.1 | v18, draft projection enabled | v18, draft projection disabled |
| --- | --- | --- | --- | --- |
| Vacuum technology installation | 124.1 | 47.5 | 12.9 | 34.6 |
| Ship power plant reducer | 256 | 410 | 38.4 | 54.4 |
| Multipurpose unified box body | 99.9 | 123.4 | 44.9 | 53.5 |
Source: https://habr.com/ru/post/447516/