
DirectX 11.2, included in the release of Windows 8.1, brings a number of interesting and necessary innovations. This post gives a brief overview of the main ones and looks at some scenarios for using them. Although there are not many innovations, some of them will be very helpful when developing applications for mobile devices and for the Windows Store.
A brief overview of the updates.
Most of the work done in DirectX 11.2 is related to performance and efficiency and does not directly affect programmers. Your applications will run faster and require fewer resources. However, a number of new APIs are included in Direct3D 11.2:
- Hardware overlay support: a dynamic scaling tool that enables some interesting scenarios.
- Compiling and linking HLSL shaders at runtime: the ability to build shaders while the application runs, including in Windows Store applications.
- Memory-mapped buffers: a feature that eliminates extra copy operations when exchanging data with the GPU.
- API for reducing input delays: a mechanism that can significantly reduce the delay between user input and the result appearing on screen.
- Tiled resources: improved rendering quality using texture maps.
Support for hardware overlays.
One feature of almost any modern graphics accelerator is that scaling an image is a very cheap operation. This opens up a number of scenarios that are worth using when resources are scarce or rendering speed drops.

As the image shows, a hardware overlay lets you render a buffer at low resolution, scale that image up to the required size, and blend it with additional buffers through an alpha mask. A game can render the 3D scene into the first overlay at reduced quality while the HUD and other graphic elements of the application are displayed at full quality.
At the same time, there are two main scenarios for using hardware overlays - static and dynamic.
Static overlay.
This type of overlay simply accepts a scaling level when the buffer is initialized and does not change it afterwards. For initialization, it is enough to specify the DXGI_SCALING_STRETCH scaling mode:
    DXGI_SWAP_CHAIN_DESC1 swapChainDesc = {0};
    // Render at 1/1.5 of the screen size; the result is stretched on output.
    swapChainDesc.Width = static_cast<UINT>(screenWidth / 1.5f);
    swapChainDesc.Height = static_cast<UINT>(screenHeight / 1.5f);
    swapChainDesc.Scaling = DXGI_SCALING_STRETCH;
    ...
    dxgiFactory->CreateSwapChainForCoreWindow(
        m_d3dDevice.Get(),
        reinterpret_cast<IUnknown*>(m_window.Get()),
        &swapChainDesc,
        nullptr,
        &swapChain
    );
This method is only applicable when the scaling level is known in advance.
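To get a feel for what a static scaling level saves, the arithmetic can be sketched in plain C++. This is only an illustration: `BufferSize`, `ScaledBufferSize` and `PixelFraction` are hypothetical helper names, not part of DXGI, and the truncation to whole pixels mirrors the `screenWidth / 1.5f` expression above.

```cpp
#include <cassert>
#include <cstdint>

// Back-buffer size in pixels (hypothetical helper type, not a DXGI type).
struct BufferSize {
    std::uint32_t width;
    std::uint32_t height;
};

// Divides both screen dimensions by `divisor` and truncates to whole pixels.
// Each dimension is kept at 1 pixel or more.
BufferSize ScaledBufferSize(std::uint32_t screenWidth,
                            std::uint32_t screenHeight,
                            float divisor) {
    assert(divisor > 0.0f);
    auto scale = [divisor](std::uint32_t extent) {
        const auto scaled =
            static_cast<std::uint32_t>(static_cast<float>(extent) / divisor);
        return scaled == 0 ? 1u : scaled;
    };
    return BufferSize{scale(screenWidth), scale(screenHeight)};
}

// Fraction of the full-resolution pixel count that is actually rendered.
double PixelFraction(BufferSize scaled,
                     std::uint32_t screenWidth,
                     std::uint32_t screenHeight) {
    assert(screenWidth > 0 && screenHeight > 0);
    return (static_cast<double>(scaled.width) * scaled.height) /
           (static_cast<double>(screenWidth) * screenHeight);
}
```

For a 1920 x 1080 screen and a divisor of 1.5, `ScaledBufferSize` gives 1280 x 720, so the scene is rendered with roughly 44% of the pixels of the full-resolution buffer.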
Dynamic overlay.
A more interesting option, in which the scaling level can change on the fly without re-initializing the buffers (swap chain). You just need to call the SetSourceSize function before rendering:
    DXGI_SWAP_CHAIN_DESC1 swapChainDesc = {0};
    swapChainDesc.Width = screenWidth;
    swapChainDesc.Height = screenHeight;
    swapChainDesc.Scaling = DXGI_SCALING_STRETCH;
    dxgiFactory->CreateSwapChainForCoreWindow( ... );
    ...
    // When the frame rate drops, render into 80% of the buffer in each dimension.
    if (fps_low)
    {
        swapChain->SetSourceSize(
            static_cast<UINT>(screenWidth * 0.8f),
            static_cast<UINT>(screenHeight * 0.8f));
    }
A dynamic overlay lets you change the picture quality instantly, depending on the current load on hardware resources, without a drop in frame rate. Sometimes even a 10% reduction in the resolution of the final image can speed up rendering several times over, which helps in scenes with a heavy dynamic load. Players no longer get the feeling of lag when too many objects are displayed on the screen.
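The decision of when to lower or restore the resolution is left to the application. One possible policy, sketched here in portable C++ under stated assumptions, adjusts a scale factor from the last frame time. The names `NextScale` and `ScaledExtent`, the 0.5 to 1.0 range, the 0.1 step and the 80% headroom threshold are all hypothetical tuning choices, not values from DirectX; the resulting extents are what would be passed to SetSourceSize.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical tuning values, not taken from the DirectX documentation.
constexpr float kMinScale = 0.5f;  // never drop below half resolution
constexpr float kMaxScale = 1.0f;  // never exceed the full buffer size
constexpr float kStep = 0.1f;      // change applied per adjustment

// Returns the next resolution scale for a given frame time and frame budget.
// The scale is lowered when the frame ran over budget and raised again when
// the frame took less than 80% of the budget; otherwise it is left unchanged.
float NextScale(float currentScale, float frameMs, float budgetMs) {
    assert(budgetMs > 0.0f);
    float next = currentScale;
    if (frameMs > budgetMs) {
        next -= kStep;
    } else if (frameMs < 0.8f * budgetMs) {
        next += kStep;
    }
    return std::min(std::max(next, kMinScale), kMaxScale);
}

// Converts a scale into a pixel extent (width or height), at least 1 pixel.
std::uint32_t ScaledExtent(std::uint32_t fullExtent, float scale) {
    assert(scale > 0.0f && scale <= 1.0f);
    const auto scaled =
        static_cast<std::uint32_t>(static_cast<float>(fullExtent) * scale);
    return std::max<std::uint32_t>(1u, scaled);
}
```

The gap between the two thresholds is deliberate: without it, a frame time hovering around the budget would make the scale flip up and down on every frame.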
Compiling and linking shaders.
Dynamic shader compilation is a very convenient optimization tool while the application is running. Unfortunately, in Windows 8.0 this feature was not available to Windows Store applications, and developers had to build binary shaders in advance. With the release of Windows 8.1, the feature is back for Windows Store apps.
In addition, a new shader compile target, 'lib_5_0', has appeared. It lets you compile the computational blocks of shaders into libraries ahead of time and then, while the program runs, assemble shaders from those ready-made libraries instead of compiling them. This can significantly reduce shader linking time and removes the expensive compilation step from application run time.
Memory-mapped buffers.
In Windows 8.0, exchanging data with the GPU for compute shaders requires auxiliary buffers. This adds overhead, and for compute shaders in particular that overhead can be expensive.

If you are using Windows 8.1 and DirectX 11.2, you can eliminate the two auxiliary operations by using the CPU_ACCESS flag. The picture then looks like this:

This makes it possible to improve performance for compute shaders. Note that for now this feature works only for data buffers, not for textures (Texture1D / 2D / 3D). In either case, the developer has a simple way to check for support and then work with the buffer either directly or through an auxiliary buffer:
    D3D11_FEATURE_DATA_D3D11_OPTIONS1 featureOptions = {};
    m_deviceResources->GetD3DDevice()->CheckFeatureSupport(
        D3D11_FEATURE_D3D11_OPTIONS1,
        &featureOptions,
        sizeof(featureOptions)
    );
    ...
    if (featureOptions.MapOnDefaultBuffers)
    {
        // The default buffer can be mapped directly.
        deviceContext->Map(defaultBuffer, ...);
    }
    else
    {
        // Fall back to copying through a staging buffer.
        deviceContext->CopyResource(stagingBuffer, defaultBuffer);
        deviceContext->Map(stagingBuffer, ...);
    }
API for reducing input delays
The time between user input and the actual display of the result on the screen is crucial for many applications, especially games. If this time is too long, the player gets a feeling of lag and discomfort. Optimizing it is quite a painstaking process, but with the release of DirectX 11.2 programmers have an additional mechanism that makes the task much easier. The new IDXGISwapChain2::GetFrameLatencyWaitableObject API returns a waitable handle that you can pass to WaitForSingleObjectEx (or WaitForMultipleObjectsEx) to wait for the best moment to start rendering:
    DXGI_SWAP_CHAIN_DESC1 swapChainDesc = {0};
    ...
    swapChainDesc.Flags = DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT;
    dxgiFactory->CreateSwapChainForCoreWindow( ... );
    HANDLE frameLatencyWaitableObject = swapChain->GetFrameLatencyWaitableObject();
    while (m_windowVisible)
    {
        WaitForSingleObjectEx(frameLatencyWaitableObject, INFINITE, true);
        Render();
        swapChain->Present(1, 0);
    }
For example, using this API can more than halve the latency on devices such as Surface, from 46 milliseconds to 20 milliseconds.
Tiled resources

Modern games require more and more video memory, much of it for textures. Texture quality and resolution directly determine the quality of the final image. One way to optimize video memory use is the tiled resources mechanism in DirectX 11.2. To understand what it is about, it is best to watch the three-minute video from the Build keynote.
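The memory effect can be estimated with a little arithmetic. The sketch below is a rough illustration in plain C++, assuming the 64 KB tile size that Direct3D 11.2 uses for tiled resources and a 32-bit-per-texel 2D texture, for which one tile covers 128 x 128 texels. `TileCount` and `ResidentBytes` are hypothetical helper names, and only the top mip level is considered.

```cpp
#include <cassert>
#include <cstdint>

// Tile size of Direct3D 11.2 tiled resources: 64 KB.
constexpr std::uint64_t kTileBytes = 64 * 1024;
// For 32-bit-per-texel 2D textures, one tile covers 128 x 128 texels.
constexpr std::uint64_t kTileTexels = 128;

// Number of tiles needed to cover the top mip level, rounded up per axis.
std::uint64_t TileCount(std::uint32_t width, std::uint32_t height) {
    assert(width > 0 && height > 0);
    const std::uint64_t tilesX = (width + kTileTexels - 1) / kTileTexels;
    const std::uint64_t tilesY = (height + kTileTexels - 1) / kTileTexels;
    return tilesX * tilesY;
}

// Video memory needed when only `residentTiles` tiles are backed by memory.
std::uint64_t ResidentBytes(std::uint64_t residentTiles) {
    return residentTiles * kTileBytes;
}
```

A 16384 x 16384 texture in this format spans 16384 tiles, or 1 GiB if every tile is backed by memory. Keeping only 1024 of those tiles resident, a hypothetical figure, brings that down to 64 MiB.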
Links and examples
- New in C++ / DirectX 11.2 development for Windows 8.1
- Tiled resources - Build'13 conference report
- DirectX Foreground swapchain sample
- HLSL Shader Compiler sample
- DirectX latency sample
- Tiled resources sample