
On March 20, 2014, Microsoft announced DirectX* 12 at the Game Developers Conference. By reducing resource-processing overhead, DirectX 12 helps applications run more efficiently and lowers power consumption, so games can be played longer on mobile devices between charges.
At the SIGGRAPH 2014 conference, Intel engineers measured CPU power consumption while running a simple asteroids demo on a Microsoft Surface* Pro 3 tablet. A button switches the demo between the DirectX 11 API and the DirectX 12 API. The demo draws a huge number of asteroids in space at a fixed frame rate. With the DirectX 12 API, CPU power consumption drops by more than half compared to DirectX 11**. The device runs cooler and lasts longer on battery. In typical gaming scenarios, the freed-up CPU budget can be spent on better physics, artificial intelligence, pathfinding, or other CPU-intensive tasks. The game thus either gains functionality or uses less power.
Tools
Developing games with DirectX 12 requires the following:
- Windows* 10
- DirectX 12 SDK
- Visual Studio* 2013
- DirectX 12 compatible GPU drivers
If you are a game developer, consider joining the Microsoft DirectX Early Access Program.
After you accept the terms of the DirectX Early Access Program, you will receive instructions for installing the SDK and the GPU drivers.
Overview
At a high level, the DirectX 12 architecture differs from DirectX 10 and DirectX 11 in state management and in how resources are tracked and managed in memory.
DirectX 10 introduced state objects, which set a group of states at run time. DirectX 12 introduces pipeline state objects (PSOs), which act as even larger state objects that also include shaders. This article covers the changes in resource handling; the grouping of states into PSOs will be described in later articles.
In DirectX 11, the system was responsible for predicting and tracking resource usage, which limited how far DirectX 11 applications could scale. In DirectX 12, the programmer (not the system and not the driver) is responsible for handling the following three usage models.
1. Resource binding. DirectX 10 and 11 tracked resource bindings to the graphics pipeline in order to keep alive resources that the application had already released, since outstanding GPU work could still reference them. DirectX 12 does not track resource bindings. The application, that is, the programmer, must manage object lifetimes.
2. Resource binding inspection. DirectX 12 does not inspect resource bindings to determine whether a resource transition has occurred. For example, an application can write to a render target through a render target view (RTV) and then read that render target as a texture through a shader resource view (SRV). With the DirectX 11 API, the GPU driver was expected to "know" when such a transition took place in order to avoid read-modify-write hazards in memory. In DirectX 12, you must identify and track every resource transition yourself using dedicated API calls.
3. Synchronization of mapped memory. In DirectX 11, the driver handles synchronization of mapped memory between the CPU and the GPU. The system inspected resource bindings to see whether rendering had to be delayed because a resource that was mapped for CPU access had not yet been unmapped. In DirectX 12, the application must handle the synchronization of CPU and GPU access to resources. One mechanism for synchronizing memory access is to request an event that wakes up a thread when the GPU finishes processing.
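The first and third usage models can be illustrated with a small, API-independent sketch. It assumes the common pattern in which the application records, for each resource it wants to free, the fence value of the last GPU work that used it, and destroys the resource only after the GPU reports that value as completed. The names `PendingRelease` and `DeferredReleaseQueue` are hypothetical, and the resource is represented by a plain string rather than a real Direct3D object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <string>
#include <vector>

// Hypothetical model, not a D3D12 type: a resource the application no longer
// needs, but which outstanding GPU work may still reference.
struct PendingRelease {
    std::uint64_t fenceValue;   // last GPU work that used the resource
    std::string   resourceName; // stand-in for a real resource pointer
};

class DeferredReleaseQueue {
public:
    // Called when the application is done with a resource.
    void Release(const std::string& name, std::uint64_t lastUsedFenceValue) {
        // Fence values are expected to grow monotonically.
        assert(pending_.empty() ||
               pending_.back().fenceValue <= lastUsedFenceValue);
        pending_.push_back({lastUsedFenceValue, name});
    }

    // Called with the fence value the GPU has completed (in a real
    // application this would be queried from a fence object).
    // Returns the names of the resources that are now safe to destroy.
    std::vector<std::string> Collect(std::uint64_t completedFenceValue) {
        std::vector<std::string> destroyed;
        while (!pending_.empty() &&
               pending_.front().fenceValue <= completedFenceValue) {
            destroyed.push_back(pending_.front().resourceName);
            pending_.pop_front();
        }
        return destroyed;
    }

    std::size_t PendingCount() const { return pending_.size(); }

private:
    std::deque<PendingRelease> pending_;
};
```

Calling `Collect` once per frame with the latest completed fence value keeps released resources alive for exactly as long as the GPU may still need them, which is the bookkeeping DirectX 11 used to do on the application's behalf.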
Moving these resource usage models into the application required a new set of programming interfaces capable of supporting the widest range of GPU architectures.
The rest of this article describes the new resource binding mechanisms, starting with descriptors.
Descriptors
Descriptors describe resources stored in memory. A descriptor is a block of data that describes an object to the GPU in an opaque, GPU-specific format. Loosely speaking, descriptors can be seen as a replacement for the old "view" system of DirectX 11. In addition to the descriptor types carried over from DirectX 11, such as the shader resource view (SRV) and the unordered access view (UAV), DirectX 12 adds other descriptor types, for example samplers and constant buffer views (CBVs).
For example, an SRV selects which underlying resource to use, which set of mipmaps and array slices to use, and the format in which to interpret the memory. An SRV descriptor must contain the GPU virtual address of the Direct3D* resource (which may be a texture). The application must ensure that the underlying resource has not been destroyed and has not become inaccessible because it is non-resident.
Figure 1 shows a descriptor for a texture "view".
Figure 1. Shader resource view in a descriptor [used with permission © Microsoft Corporation]
To create a shader resource view in DirectX 12, use the following structure and Direct3D device method.
typedef struct D3D12_SHADER_RESOURCE_VIEW_DESC
{
    DXGI_FORMAT Format;
    D3D12_SRV_DIMENSION ViewDimension;
    union
    {
        D3D12_BUFFER_SRV Buffer;
        D3D12_TEX1D_SRV Texture1D;
        D3D12_TEX1D_ARRAY_SRV Texture1DArray;
        D3D12_TEX2D_SRV Texture2D;
        D3D12_TEX2D_ARRAY_SRV Texture2DArray;
        D3D12_TEX2DMS_SRV Texture2DMS;
        D3D12_TEX2DMS_ARRAY_SRV Texture2DMSArray;
        D3D12_TEX3D_SRV Texture3D;
        D3D12_TEXCUBE_SRV TextureCube;
        D3D12_TEXCUBE_ARRAY_SRV TextureCubeArray;
        D3D12_BUFFEREX_SRV BufferEx;
    };
} D3D12_SHADER_RESOURCE_VIEW_DESC;

interface ID3D12Device
{
    ...
    void CreateShaderResourceView(
        _In_opt_ ID3D12Resource* pResource,
        _In_opt_ const D3D12_SHADER_RESOURCE_VIEW_DESC* pDesc,
        _In_ D3D12_CPU_DESCRIPTOR_HANDLE DestDescriptor);
};
Code that creates an SRV might look something like this.
This code creates an SRV for a two-dimensional texture and specifies its format and GPU virtual address. The final argument to CreateShaderResourceView is a handle into a descriptor heap that was allocated before this method is called. Descriptors are usually stored in descriptor heaps, which are described in more detail in the next section.
Note. Some types of descriptors can also be passed to the GPU through driver-versioned root parameters. See below for details.
Descriptor heaps
A descriptor heap can be viewed as a single memory allocation that holds a number of descriptors. Different heap types can contain one or more types of descriptors. The following types are currently supported.
typedef enum D3D12_DESCRIPTOR_HEAP_TYPE
{
    D3D12_CBV_SRV_UAV_DESCRIPTOR_HEAP = 0,
    D3D12_SAMPLER_DESCRIPTOR_HEAP = (D3D12_CBV_SRV_UAV_DESCRIPTOR_HEAP + 1),
    D3D12_RTV_DESCRIPTOR_HEAP = (D3D12_SAMPLER_DESCRIPTOR_HEAP + 1),
    D3D12_DSV_DESCRIPTOR_HEAP = (D3D12_RTV_DESCRIPTOR_HEAP + 1),
    D3D12_NUM_DESCRIPTOR_HEAP_TYPES = (D3D12_DSV_DESCRIPTOR_HEAP + 1)
} D3D12_DESCRIPTOR_HEAP_TYPE;
There is one heap type shared by CBV, SRV, and UAV descriptors and a separate type for samplers. There are also types for render target views (RTVs) and depth stencil views (DSVs).
The following code creates a descriptor heap for nine descriptors, each of which can be a CBV, SRV, or UAV.
The first two entries in the heap description are the number of descriptors and the type of descriptors the heap may hold. The third parameter, D3D12_DESCRIPTOR_HEAP_SHADER_VISIBLE, marks the descriptor heap as visible to shaders. Descriptor heaps that are not shader-visible can be used, for example, to stage descriptors on the CPU or to hold RTVs, which cannot be selected from within shaders.
Although this code sets the flag that makes the descriptor heap visible to shaders, there is one more level of indirection. A shader "sees" a descriptor heap through a descriptor table (there are also root descriptors, which do not use tables; see below for details).
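To make the idea of a heap as one allocation of equally sized descriptor slots more concrete, here is a minimal, API-independent sketch. It assumes that a handle to the descriptor at a given index is the heap's start address plus the index multiplied by a per-type increment size. The types `DescriptorHeapModel` and `CpuHandle` and the function `HandleAt` are hypothetical stand-ins, not Direct3D declarations, and the start address and increment used with them are made-up numbers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for illustration only; not D3D12 declarations.
enum class HeapType { CbvSrvUav, Sampler, Rtv, Dsv };

struct CpuHandle {
    std::size_t ptr; // address of one descriptor slot
};

struct DescriptorHeapModel {
    HeapType      type;           // which descriptor types the heap may hold
    std::uint32_t numDescriptors; // number of slots in the allocation
    bool          shaderVisible;  // false for CPU-side staging heaps
    std::size_t   heapStart;      // address of slot 0 (made-up value here)
    std::uint32_t incrementSize;  // bytes per slot; hardware dependent in practice
};

// Returns a handle to the slot at `index`: start + index * increment.
CpuHandle HandleAt(const DescriptorHeapModel& heap, std::uint32_t index) {
    assert(index < heap.numDescriptors);
    return CpuHandle{heap.heapStart +
                     static_cast<std::size_t>(index) * heap.incrementSize};
}
```

With a nine-slot heap as described above, `HandleAt(heap, 0)` is the handle that would be passed as the destination when creating the first view, and `HandleAt(heap, 8)` addresses the last slot.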
Descriptor tables
The main purpose of a descriptor heap is to allocate as much memory as is needed to store all the descriptors for as much rendering as possible, say for one frame or more.
Note. Switching between descriptor heaps may cause a GPU pipeline flush, depending on the hardware. It is therefore best to minimize descriptor heap switches, or to pair them with other operations that flush the pipeline anyway.
A descriptor table points into a descriptor heap by means of an offset. Rather than forcing the graphics pipeline to always view the entire heap, switching descriptor tables is an inexpensive way to change the set of resources a given shader uses. The shader also does not need to know where to find resources in heap space.
In other words, an application can use several descriptor tables that point into the same heap for different shaders, as shown in Figure 2.
Figure 2. Different shaders point into a descriptor heap through several descriptor tables
The following code example creates descriptor tables for an SRV and a sampler that are visible to the pixel shader.
Here the descriptor table is visible only to the pixel shader; this restriction is set with the D3D12_SHADER_VISIBILITY_PIXEL flag. The following listing defines the available levels of descriptor table visibility.
typedef enum D3D12_SHADER_VISIBILITY
{
    D3D12_SHADER_VISIBILITY_ALL = 0,
    D3D12_SHADER_VISIBILITY_VERTEX = 1,
    D3D12_SHADER_VISIBILITY_HULL = 2,
    D3D12_SHADER_VISIBILITY_DOMAIN = 3,
    D3D12_SHADER_VISIBILITY_GEOMETRY = 4,
    D3D12_SHADER_VISIBILITY_PIXEL = 5
} D3D12_SHADER_VISIBILITY;
Setting visibility to all broadcasts the root arguments to every shader stage, even though they are set only once.
A shader can locate resources through descriptor tables, but the descriptor tables themselves must first be made known to the shader through a root parameter in the root signature.
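The offset-based lookup described in this section can be modeled without any graphics API. The sketch below assumes a descriptor table is just an offset and a count into a heap, and that a shader addresses resources by an index relative to its table. `DescriptorTableModel` and `Resolve` are hypothetical names, and the heap is a plain vector of strings standing in for descriptors.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model: a descriptor table is a window into a descriptor heap.
struct DescriptorTableModel {
    std::size_t heapOffset; // index of the first descriptor in the heap
    std::size_t count;      // number of descriptors the table exposes
};

// Resolves a table-relative index (what the shader uses) to the descriptor
// stored in the heap. The shader never sees absolute heap positions.
const std::string& Resolve(const std::vector<std::string>& heap,
                           const DescriptorTableModel& table,
                           std::size_t indexInTable) {
    assert(indexInTable < table.count);
    assert(table.heapOffset + table.count <= heap.size());
    return heap[table.heapOffset + indexInTable];
}
```

Two tables such as `{0, 1}` for a vertex shader and `{1, 3}` for a pixel shader can point into the same heap, and replacing `{1, 3}` with `{2, 2}` changes what the pixel shader sees at index 0 without touching the heap contents.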
Root Signature and Parameters
The root signature stores the root parameters that shaders use to locate the resources they need to access. These parameters exist as a binding space on a command list for the set of resources that the application must make available to shaders.
Root arguments can be the following.
- Descriptor tables. As described above, they hold an offset into the descriptor heap and a number of descriptors.
- Root descriptors. Only a small number of descriptors can be stored directly in a root parameter. The application then no longer needs to place these descriptors in a descriptor heap, and one level of indirection is removed.
- Root constants. These are constants provided directly to shaders, without going through root descriptors or descriptor tables.
For optimal performance, applications should generally order the root parameters by decreasing frequency of change.
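The ordering advice above can be sketched as a simple sort. This is only an illustration under the assumption that the application has an estimate of how often each root parameter changes per frame; `RootParamPlan`, `RootParamKind`, and `OrderByChangeFrequency` are hypothetical names, and the change counts would come from the application's own profiling.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical planning structures; not D3D12 declarations.
enum class RootParamKind { DescriptorTable, RootDescriptor, RootConstants };

struct RootParamPlan {
    std::string   name;
    RootParamKind kind;
    unsigned      changesPerFrame; // estimated by the application
};

// Returns the parameters ordered so that the most frequently changed ones
// come first. stable_sort keeps the original order for equal frequencies.
std::vector<RootParamPlan> OrderByChangeFrequency(std::vector<RootParamPlan> params) {
    std::stable_sort(params.begin(), params.end(),
                     [](const RootParamPlan& a, const RootParamPlan& b) {
                         return a.changesPerFrame > b.changesPerFrame;
                     });
    return params;
}
```

For example, per-object root constants that change on every draw would end up ahead of a table of scene-wide textures that is bound once per frame.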
All root parameters, such as descriptor tables, root descriptors, and root constants, are baked into a command list, and the driver versions them on behalf of the application. In other words, whenever any root parameter changes between draw or dispatch calls, the hardware updates the version number of the root signature. When any argument changes, each draw or dispatch call gets its own complete set of root parameter states.
Root descriptors and root constants reduce the level of indirection the GPU goes through on access; descriptor tables give access to larger amounts of data at the cost of a higher level of indirection. Because of that higher level of indirection, the application can initialize descriptor table contents right up until it submits the command list for execution. In addition, Shader Model 5.1, which all DirectX 12 hardware supports, lets shaders dynamically index into any given descriptor table. A shader can therefore select the descriptor it needs from a descriptor table at execution time. An application can simply create one large descriptor table and always use an index (for example, a material identifier) to obtain the desired descriptor.
Performance on different architectures may vary when large sets of root constants and root descriptors are used instead of descriptor tables. For this reason, the balance between root parameters and descriptor tables needs to be tuned for the target hardware platforms.
A well-balanced application can combine all types of bindings: root constants, root descriptors, descriptor tables for descriptors gathered on the fly as draw calls are issued, and dynamic indexing of large descriptor tables.
In the following code, the two descriptor tables mentioned above are stored as root parameters in the root signature.
All shaders in a PSO must be compatible with the root signature specified for that PSO; otherwise, the PSO will not be created.
The root signature must be set on a command list or bundle. To do this, call:
commandList->SetGraphicsRootSignature(mRootSignature);
After setting the root signature, you need to define the set of bindings. In the example above, this is done with the following code.
The application must set the appropriate parameters in each of the two slots of the root signature before issuing a draw call or a dispatch call. In this example, the first slot now holds a descriptor handle that indexes into the descriptor heap at the SRV descriptor, and the second slot holds a descriptor table that indexes into the descriptor heap at the sampler descriptor.
An application can change, for example, the binding of the second slot between draw calls. In that case, only the second slot needs to be bound again for the second draw call.
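The behavior described above, where each draw call sees a complete set of root arguments while the application rebinds only the slot that changed, can be modeled as follows. This is a simplified sketch and not the Direct3D API: `CommandListModel` and its methods are hypothetical, and each root argument is reduced to a heap offset for a descriptor table.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model with two root slots, as in the example in the text.
constexpr std::size_t kNumRootSlots = 2;

// Each entry is the heap offset of the descriptor table bound to that slot.
using RootArguments = std::array<std::size_t, kNumRootSlots>;

class CommandListModel {
public:
    // Binds a descriptor table (given by its heap offset) to one root slot.
    void SetRootDescriptorTable(std::size_t slot, std::size_t heapOffset) {
        assert(slot < kNumRootSlots);
        current_[slot] = heapOffset;
    }

    // Records a draw together with a snapshot of all current root arguments,
    // mimicking the versioning the driver performs for the application.
    void Draw() { draws_.push_back(current_); }

    const std::vector<RootArguments>& Draws() const { return draws_; }

private:
    RootArguments current_{}; // all slots start at offset 0
    std::vector<RootArguments> draws_;
};
```

If slot 0 is set once and only slot 1 is changed between two calls to `Draw`, the second recorded snapshot still contains the slot 0 binding from before, which is why the application does not have to bind it again.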
Putting the components together
The large code snippet below shows all the mechanisms used to bind resources. This application uses only one texture, and this code provides a sampler and SRV for this texture.
Static samplers
So far we have seen how to create a sampler using a descriptor heap and a descriptor table. There is another way to use samplers in an application. Because many applications need only a fixed set of samplers, static samplers can be used as a root argument.
Currently, the root signature looks as follows.
typedef struct D3D12_ROOT_SIGNATURE
{
    UINT NumParameters;
    const D3D12_ROOT_PARAMETER* pParameters;
    UINT NumStaticSamplers;
    const D3D12_STATIC_SAMPLER* pStaticSamplers;
    D3D12_ROOT_SIGNATURE_FLAGS Flags;
} D3D12_ROOT_SIGNATURE;
The set of static samplers is defined independently of the root parameters in the root signature. As mentioned above, root parameters define a binding space in which arguments can be supplied at run time, whereas static samplers are by definition unchanging.
Because root signatures can be authored in HLSL, static samplers can be authored there as well. Currently, an application can have at most 2032 unique static samplers. This is slightly less than the next power of two (2048), which leaves drivers some space for internal use.
The static samplers defined in the root signature are independent of the samplers the application chooses to place in a descriptor heap, so both mechanisms can be used at the same time.
If the choice of samplers is fully dynamic and unknown at shader compile time, the application should manage its samplers in a descriptor heap.
Conclusion
DirectX 12 gives the developer full control over resource usage models. The application developer is responsible for allocating memory in descriptor heaps, for describing resources in descriptors, and for letting shaders index into descriptor heaps through descriptor tables, which in turn are made known to the shader through root signatures.
Moreover, using root signatures, you can define a customizable parameter space for shaders using the following four types of components in any combination:
- root constants;
- static samplers;
- root descriptors;
- descriptor tables.
The task is to select the desired form of binding for the respective types of resources and the frequency of their update.