Guide to building stars: kernel pool spraying and VMware CVE-2013-1406

If you work with kernel-mode vulnerabilities in Windows, sooner or later you will have to deal with a technique called kernel pool spraying (just do not call it “nuclear heap spraying”). I think the ability to control the behavior of the kernel memory pool is useful for any exploit developer.

To master this technique, you need at least a rough idea of how the kernel pool is organized. In this article I describe only those implementation details that matter for pool spraying. Kernel pool internals are well studied, so if you need deeper knowledge, use any search engine or the links at the end of the article.

Kernel Pool Structure Overview

The kernel memory pool is the single place in the operating system kernel where memory allocation requests go. Kernel-mode stacks are small and suitable only for a few variables, not for arrays. When a driver needs to create a large data structure or a string, it can use various interfaces to allocate memory, but in the end they all obtain memory from a pool.

There are several types of pools, but they all have the same structure (except for the special pool, which is used by Driver Verifier). Each pool has a control structure called a pool descriptor. Among other things, it stores lists of the pool's free blocks (chunks), which form the free space of the pool. The pool itself consists of memory pages, either standard 4 KB pages or large 2 MB pages. The number of pages in use is adjusted dynamically.
The pages of the kernel pool are divided into fragments of different sizes, called blocks (chunks). These blocks are what kernel modules receive when they request memory from the pool.

The blocks contain the following metadata:

Previous size: the size of the preceding block.
Pool index: used when there are several pools of the same type. For example, the system has several paged pools. This field identifies the pool the block belongs to.
Block size: the size of the current block. Like the Previous size field, it is encoded as
(block data size + header size + optional 4 bytes for the pointer to the process that owns the block) >> 3 (or >> 4 on x64 systems).
Pool type: a set of bit flags, which are undocumented (!).
- T (Tracked): the block is tracked by Driver Verifier. This flag is used for debugging.
- S (Session): the block belongs to the session paged pool, which is used to allocate memory for session-specific user data.
- Q (Quota): the block is charged to the quota management system. This flag applies only to 32-bit systems. If it is set, a pointer to the process that owns the block is stored at the end of the block.
- U (In use): the block is currently in use. The opposite state is free, which means that memory can be allocated from the block. Starting with Windows Vista this flag is in the second bit; before that it was in the third bit.
- B (Base pool): identifies the base pool the block belongs to. There are two base pools, paged and nonpaged. Nonpaged is encoded as zero, paged as one. Prior to Windows Vista this flag occupied two bits, because it was encoded as (base pool type + 1), i.e. binary 10 for the paged pool and binary 01 for the nonpaged pool.
Pool tag: used for debugging. Kernel modules specify a signature of four printable characters that identifies the subsystem or driver owning the block. For example, the tag “NtFs” means that the block belongs to the NTFS file system driver, ntfs.sys.

The block structure has a couple of differences on 64-bit systems. First, the header fields are larger, and second, there is an 8-byte field with a pointer to the process that uses this block.
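
The size encoding described above can be tried out with a small sketch. It is only an illustration: the header sizes (8 bytes on x86, 16 bytes on x64) and the rounding to the allocation granularity are assumptions taken from public descriptions of the pool, and the function name is made up for this example.

```python
# Sketch of the Block size / Previous size encoding.
# HEADER_SIZE values are assumptions, not stated in the article.
HEADER_SIZE = {"x86": 8, "x64": 16}
SHIFT = {"x86": 3, "x64": 4}  # from the text: >> 3 on x86, >> 4 on x64


def block_size_field(data_size: int, arch: str = "x86", quota_ptr: bool = False) -> int:
    """Return the value stored in the Block size field for data_size bytes of data."""
    total = data_size + HEADER_SIZE[arch]
    if quota_ptr and arch == "x86":
        total += 4  # optional pointer to the owning process
    granularity = 1 << SHIFT[arch]
    # Round up to the allocation granularity before shifting (assumed behavior).
    total = (total + granularity - 1) & ~(granularity - 1)
    return total >> SHIFT[arch]


if __name__ == "__main__":
    # A 0x20C-byte structure on x86: (0x20C + 8) rounded up to 0x218, then >> 3.
    print(hex(block_size_field(0x20C)))  # prints 0x43
```

Under these assumptions a 0x20C-byte request on x86 yields 0x43, the value that appears later in the article as the Previous size of the neighboring block.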

Overview of memory allocation principles in the pool

Imagine that the pool is empty, that is, it has no memory at all. If we try to allocate memory from it (say, less than 0xFF0 bytes), a memory page is allocated first, and then a block is allocated at the beginning of that page.

Now we have two blocks: the one we allocated and a free one. The free block can in turn be used for subsequent allocations. From this point on, however, the pool allocator places allocated blocks at the end of the page or at the end of the free space within the page.

Freeing blocks is the same process in reverse. Blocks become free and, if they are adjacent, merge into a single block.

Note that the situation described is fictional and serves only as an example: in practice, pools are populated with memory pages long before they become available to kernel modules.

Controlling memory allocation from pools

Keep in mind that kernel pools are heavily loaded parts of the operating system. First of all, they are used to create all kinds of objects and internal kernel data structures. In addition, many system calls use pools to buffer parameters passed from user mode. Since the operating system constantly services hardware through drivers and software through system calls, you can roughly imagine how often the pool is used, even when the system is idle.

Sooner or later pools become fragmented, because blocks of different sizes are allocated and freed in varying order. This is where the term spraying comes from. With sequential allocations from the pool, blocks are by no means guaranteed to be adjacent and will most likely end up in different parts of memory. So when we fill memory with controlled (red) blocks, we are more likely to see the picture on the left than the one on the right.

However, one circumstance matters a great deal for exploitation: once no black regions are left in the “fill”, we get a fresh region, free of unwanted stains. From that moment the “spray can” turns into an ordinary brush with a solid fill. This gives us a significant degree of control over the behavior of the pool and over its “picture”. Significant, but not complete: even then there is no guarantee that we fully own the “picture”, because someone else can always interrupt us with “splashes” of a different color.

Depending on the type of object used for pool spraying, we can create windows of a given size out of free blocks by deleting the required number of previously created objects. But the most important fact that lets us control pool allocation is that the allocator strives for maximum performance. To make the best use of the processor cache, the most recently freed block is the first to be allocated again. This is the essence of controlled allocation, because it makes it possible to guess the address of the allocated block.
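
The “last freed, first allocated” behavior can be pictured with a toy model. This is a deliberately simplified sketch, not the real allocator: the class name and the addresses are hypothetical, and a single stack stands in for a free list of one block size.

```python
class ToyFreeList:
    """Toy model of a free list that hands back the most recently freed block."""

    def __init__(self):
        self._free = []  # stack of free block addresses

    def free(self, address: int) -> None:
        """Return a block to the list; it becomes the next allocation candidate."""
        self._free.append(address)

    def allocate(self) -> int:
        """Take the most recently freed block."""
        if not self._free:
            raise MemoryError("no free block of this size")
        return self._free.pop()


if __name__ == "__main__":
    pool = ToyFreeList()
    for address in (0x1000, 0x1020, 0x1040):  # hypothetical addresses
        pool.free(address)
    # The block freed last is the one returned first.
    print(hex(pool.allocate()))  # prints 0x1040
```

In this model, whoever frees a block and immediately allocates one of the same size gets the same address back, which is the property the technique relies on.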

Of course, block size matters, so you must first calculate the size of the window of freed blocks. If we want a controlled allocation of a 0x315-byte block while the pool spraying objects are 0x20 bytes each, we need to free 0x315 / 0x20 = 0x18 blocks, rounded up to 0x18 + 1 = 0x19. I think this is clear.
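
The same arithmetic as a short sketch; the function name is made up for this example, and the calculation deliberately ignores headers and rounding, just like the numbers in the paragraph above.

```python
def blocks_to_free(target_size: int, spray_block_size: int) -> int:
    """Number of adjacent spray blocks whose combined size covers target_size."""
    if target_size <= 0 or spray_block_size <= 0:
        raise ValueError("sizes must be positive")
    return -(-target_size // spray_block_size)  # ceiling division


if __name__ == "__main__":
    print(hex(blocks_to_free(0x315, 0x20)))  # prints 0x19
```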

A few notes on how to successfully use the kernel pool spraying technique:

If the exploited driver offers no way to allocate pool memory, operating system objects can always be used for pool spraying. Since OS objects are, not surprisingly, stored in the OS kernel, memory for them is allocated from the various pools.
- Processes, threads, semaphores, mutexes, etc. are stored in the nonpaged pool.
- Directory objects, registry keys, sections (also known as file mappings), etc. are stored in the paged pool.
- Objects of the GDI and USER subsystems (palettes, device contexts (DC), brushes, etc.) are stored in the session pool.
To free the memory occupied by these objects, it is enough to close the corresponding handles.
By the time we start filling the pool with objects, it already contains a number of memory pages from which blocks are allocated, but these pages are fragmented. Since we need a region continuously filled with controlled blocks, we must first “spam” the pool until no free space is left on the current pages. Only then do we get fresh pages that can be filled sequentially with controlled blocks. In short, you need to create a lot of objects.
When calculating the required window size, also account for the size of the block header and for the fact that the final size is rounded up to a multiple of 8 bytes on 32-bit systems and 16 bytes on 64-bit systems.
Although we can control the allocation of blocks, their relative position is rather hard to predict. However, when OS objects are used for pool spraying, the address of an object can be found from its handle by calling NtQuerySystemInformation() with the SystemExtendedHandleInformation parameter. The information it returns helps improve the accuracy of pool spraying.
Keep a balance when pool spraying and do not be greedy when creating objects. Obviously, block allocation cannot be controlled if the system has simply run out of memory.
One trick that increases the reliability of kernel pool exploits is to raise the priority of the thread that performs pool spraying and triggers the vulnerability. Since threads are essentially in a constant race for pool memory, it helps to raise the priority so that our thread runs more often than other threads in the system. This makes the technique more consistent. Also mind the delay between pool spraying and triggering the vulnerability: the shorter it is, the better the chance of landing in the block we need.
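
The note about headers and rounding changes the window arithmetic slightly. A minimal sketch, assuming an 8-byte header on x86 and a 16-byte header on x64 (these sizes and the helper names are assumptions for illustration, not taken from the article):

```python
HEADER_SIZE = {"x86": 8, "x64": 16}   # assumed pool header sizes in bytes
GRANULARITY = {"x86": 8, "x64": 16}   # rounding from the note above


def footprint(data_size: int, arch: str = "x86") -> int:
    """Bytes a block really occupies: data plus header, rounded up."""
    step = GRANULARITY[arch]
    return (data_size + HEADER_SIZE[arch] + step - 1) // step * step


def objects_to_free(target_data_size: int, spray_data_size: int, arch: str = "x86") -> int:
    """Adjacent spray objects to free so that the target block fits into the window."""
    need = footprint(target_data_size, arch)
    have = footprint(spray_data_size, arch)
    return -(-need // have)  # ceiling division


if __name__ == "__main__":
    print(hex(footprint(0x315)), hex(footprint(0x20)), objects_to_free(0x315, 0x20))
```

With these assumptions the count for a given pair of sizes differs from the naive division of data sizes, so it is worth checking the real footprint of the chosen spray object in a debugger.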

VMware CVE-2013-1406

In early February, interesting advisories recommending updates to VMware products were published. According to them, unpatched components contained a vulnerability leading to local privilege escalation on both the host and the guest OS. Such a "tasty" vulnerability was impossible to pass by.

The vulnerable component was vmci.sys. VMCI stands for Virtual Machine Communication Interface; it is used for communication between virtual machines and the host OS. VMCI provides a proprietary socket type implemented as a Windows Sockets service provider in the vsocklib.dll library. The vmci.sys driver creates a virtual device that implements the necessary functionality. It is always running on the host OS. On guest systems, VMCI requires VMware Tools to be installed.

When writing any analysis, it is nice to explain the high-level logic of the vulnerability, so that the analysis reads like a detective story. Unfortunately, that is not possible here, because very little public information is available about the implementation of VMCI. However, I doubt exploit developers worry about that: it pays more to get a working exploit than to spend a lot of time analyzing how the whole system works.

PatchDiff identified three patched functions. All of them are related to the handling of the same IOCTL control code, 0x8103208C. Apparently something went badly wrong with its handling...

The third patched function was ultimately called from both the first and the second. It had to allocate a block of the requested size multiplied by 0x68 and initialize it by filling it with zeros. This block holds an internal data structure used to process the request. The problem was that the size of the block was supplied from user mode and was not properly validated; as a result the internal structure could fail to be allocated, which led to some interesting consequences.

Control code 0x8103208C takes an input and an output buffer. To reach the vulnerability, the buffer size must be 0x624 bytes. To process the request, an internal structure of 0x20C bytes was allocated. Its first 4 bytes were filled with the value found at [user_buffer + 0x10]. These bytes were later used to allocate the second data structure, a pointer to which was stored at the end of the first. Regardless of whether the second structure was allocated successfully, a certain dispatch function was then called.

Dispatch function

.text:0001B2B4 ; int __stdcall DispatchChunk(PVOID pChunk)
.text:0001B2B4 DispatchChunk proc near ; CODE XREF: PatchedOne+78
.text:0001B2B4 ; UnsafeCallToPatchedThree+121
.text:0001B2B4
.text:0001B2B4 pChunk = dword ptr 8
.text:0001B2B4
.text:0001B2B4 000 mov edi, edi
.text:0001B2B6 000 push ebp
.text:0001B2B7 004 mov ebp, esp
.text:0001B2B9 004 push ebx
.text:0001B2BA 008 push esi
.text:0001B2BB 00C mov esi, [ebp+pChunk]
.text:0001B2BE 00C mov eax, [esi+208h]
.text:0001B2C4 00C xor ebx, ebx
.text:0001B2C6 00C cmp eax, ebx
.text:0001B2C8 00C jz short CheckNullUserSize
.text:0001B2CA 00C push eax ; P
.text:0001B2CB 010 call ProcessParam ; We won't get here
.text:0001B2D0
.text:0001B2D0 CheckNullUserSize: ; CODE XREF: DispatchChunk+14
.text:0001B2D0 00C cmp [esi], ebx
.text:0001B2D2 00C jbe short CleanupAndRet
.text:0001B2D4 00C push edi
.text:0001B2D5 010 lea edi, [esi+8]
.text:0001B2D8
.text:0001B2D8 ProcessUserBuff: ; CODE XREF: DispatchChunk+51
.text:0001B2D8 010 mov eax, [edi]
.text:0001B2DA 010 test eax, eax
.text:0001B2DC 010 jz short NextCycle
.text:0001B2DE 010 or ecx, 0FFFFFFFFh
.text:0001B2E1 010 lea edx, [eax+38h]
.text:0001B2E4 010 lock xadd [edx], ecx
.text:0001B2E8 010 cmp ecx, 1
.text:0001B2EB 010 jnz short DerefObj
.text:0001B2ED 010 push eax
.text:0001B2EE 014 call UnsafeFire ; BANG!!!!
.text:0001B2F3
.text:0001B2F3 DerefObj: ; CODE XREF: DispatchChunk+37
.text:0001B2F3 010 mov ecx, [edi+100h] ; Object
.text:0001B2F9 010 call ds:ObfDereferenceObject
.text:0001B2FF
.text:0001B2FF NextCycle: ; CODE XREF: DispatchChunk+28
.text:0001B2FF 010 inc ebx
.text:0001B300 010 add edi, 4
.text:0001B303 010 cmp ebx, [esi]
.text:0001B305 010 jb short ProcessUserBuff
.text:0001B307 010 pop edi
.text:0001B308
.text:0001B308 CleanupAndRet: ; CODE XREF: DispatchChunk+1E
.text:0001B308 00C push 20Ch ; size_t
.text:0001B30D 010 push esi ; void *
.text:0001B30E 014 call ZeroChunk
.text:0001B313 00C push 'gksv' ; Tag
.text:0001B318 010 push esi ; P
.text:0001B319 014 call ds:ExFreePoolWithTag
.text:0001B31F 00C pop esi
.text:0001B320 008 pop ebx
.text:0001B321 004 pop ebp
.text:0001B322 000 retn 4
.text:0001B322 DispatchChunk endp

This dispatch function looked for pointers to process. Processing included dereferencing a certain object and, depending on flags set in the structure, calling a certain function. But since incorrect parameters prevented the structure to be processed from being allocated, the dispatch function simply ran past the boundary of the first block. Such processing resulted in an access violation and a blue screen of death.
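
The loop in the listing can be modelled in a few lines to see where it leaves the block. This is a rough model only: it assumes the 0x20C bytes of data are padded to 0x210 and that the neighboring block's header follows immediately; the header value 0x04430043 and the function name are hypothetical.

```python
import struct

ENTRIES_OFFSET = 8  # lea edi, [esi+8]


def walk_entries(memory: bytes, count: int):
    """Yield (offset, value) for every non-zero dword the loop would treat as a pointer."""
    for index in range(count):  # cmp ebx, [esi] / jb ProcessUserBuff
        offset = ENTRIES_OFFSET + 4 * index
        if offset + 4 > len(memory):
            raise IndexError(f"read at {offset:#x} leaves the modelled memory")
        (value,) = struct.unpack_from("<I", memory, offset)
        if value:  # test eax, eax / jz NextCycle
            yield offset, value


if __name__ == "__main__":
    # Zero-filled block data (assumed padded to 0x210) followed by a
    # hypothetical first dword of the neighboring block's header.
    memory = bytes(0x210) + struct.pack("<I", 0x04430043)
    for offset, value in walk_entries(memory, 0x83):
        print(f"{offset:#x}: {value:#010x}")
```

In this model a count of 0x40 stays inside the zero-filled block and finds nothing, while a larger count reaches the first dword past the block and treats it as a pointer.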

Thus, we are able to execute arbitrary code at a controlled address:

.text:0001B946 UnsafeFire proc near
.text:0001B946
.text:0001B946
.text:0001B946 arg_0 = dword ptr 8
.text:0001B946
.text:0001B946 000 mov edi, edi
.text:0001B948 000 push ebp
.text:0001B949 004 mov ebp, esp
.text:0001B94B 004 mov eax, [ebp+arg_0]
.text:0001B94E 004 push eax
.text:0001B94F 008 call dword ptr [eax+0ACh] ; BANG!!!!
.text:0001B955 004 pop ebp
.text:0001B956 000 retn 4
.text:0001B956 UnsafeFire endp

Exploitation

Since the dispatch function runs past the block boundary, it hits either the neighboring block or an unmapped page. If it reaches unmapped memory, an unhandled exception occurs and a blue screen follows. But when it hits a neighboring block, the dispatch function interprets that block's header as a pointer to a structure to be processed.

Suppose we have an x86 system. The four bytes that the dispatch function tries to interpret as a pointer are actually the Previous size, Pool index, Block size and Pool type fields. Since we know the size and the pool index of the block being processed, we know the low word of the pointer:

0xXXXX0043
0x43 is the block size, which becomes the Previous size field of the neighbor.
0 is the pool index, which is guaranteed to be zero, since these blocks are in the nonpaged pool, and there is only one in the system. Note that if neighboring blocks share the same memory page, they belong to the same pool type and index.

The high word holds the block size, which we cannot predict, and the pool type flags, which, on the contrary, can be foreseen:

B = 0: the block is from the nonpaged pool
U = 1: the block is assumed to be in use
Q = 0/1: the block may be quota-charged
S = 0: the pool is not a session pool
T = 0: the block is not tracked by default
Unused bits are zero

Thus, we have the following memory regions valid for Windows 7 and 8:

0x04000000 - 0x06000000 for ordinary blocks
0x14000000 - 0x16000000 for quota blocks

Based on the above, you can work out the memory regions for Windows XP and similar systems yourself.
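
The regions above can be reproduced from the header fields with a short sketch. The bit layout used here (9 bits Previous size, 7 bits Pool index, 9 bits Block size, 7 bits Pool type) and the flag values U = 0x2 and Q = 0x8 are assumptions based on public descriptions of the x86 pool header; they were chosen because they match the ranges given in the article.

```python
def fake_pointer_range(prev_size: int, pool_index: int, pool_type: int):
    """Return (lowest, highest) 32-bit values a neighboring x86 pool header can form."""
    low_word = (prev_size & 0x1FF) | ((pool_index & 0x7F) << 9)
    lowest = ((pool_type & 0x7F) << 25) | low_word  # Block size = 0
    highest = lowest | (0x1FF << 16)                # Block size = 0x1FF (unknown field at maximum)
    return lowest, highest


if __name__ == "__main__":
    IN_USE = 0x2  # assumed value of the U flag (second bit)
    QUOTA = 0x8   # assumed value of the Q flag
    for name, flags in (("ordinary", IN_USE), ("quota", IN_USE | QUOTA)):
        low, high = fake_pointer_range(0x43, 0, flags)
        print(f"{name}: {low:#010x} - {high:#010x}")
```

For another Windows version, substitute the flag values that version uses for an allocated nonpaged block.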

As you can see, these regions belong to user space, so we can force the dispatch function to execute any code, including code under our control. To do this, first map the specified memory regions in the process, and then, for every 0x10000 bytes, satisfy the requirements of the dispatch function:

At address [0x43 + 0x38], place DWORD = 0x00000001 to satisfy the following condition:

.text:0001B2E1 010 lea edx, [eax+38h]
.text:0001B2E4 010 lock xadd [edx], ecx
.text:0001B2E8 010 cmp ecx, 1

At address [0x43 + 0xAC], place a pointer to the shellcode.
At address [0x43 + 0x100], place a pointer to a fake object, which will be dereferenced by ObfDereferenceObject(). Note that the reference count is stored in the object header, at a negative offset relative to the object, so make sure that ObfDereferenceObject() does not touch an unmapped region. Also set a suitable reference count value: when the count reaches zero, for example, ObfDereferenceObject() will try to free the memory with functions that are completely unsuitable for user-mode memory.

Keep in mind that the offsets may differ between VMware products.

Everything is done right!

Increase exploit stability

Although we have developed a good strategy for exploiting this vulnerability, it is still unreliable. For example, the dispatch function may land on a free block, whose fields cannot be predicted. The header of such a block will still be interpreted as a pointer (because it is not zero), but processing it results in a “blue screen” error. The same happens when the dispatch function reaches an unmapped area of memory.

This is where kernel pool spraying comes to the rescue. I chose semaphores as the pool spraying objects, since their size is the most suitable. With this technique the stability of the exploit increased significantly.

Recall that Windows 8 introduced support for the SMEP protection mechanism, which, combined with the author's laziness, complicates exploit development somewhat. Writing position-independent code with a SMEP bypass is left as an exercise for the reader.

As for x64 systems, the problem is that a pointer is 8 bytes in size. This means that the high double word (DWORD) of the pointer falls on the Pool tag field. Since most drivers and kernel subsystems use ASCII characters for their tags, the pointer lands in the range of non-canonical addresses and cannot be used for exploitation. At the time of writing I had not come up with anything sensible to work around this.
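
Why an ASCII tag makes the pointer unusable can be checked with a small sketch. It assumes 48-bit virtual addresses (bits 63..47 must all be equal for an address to be canonical); the header dword 0x02220022 is a made-up value, and both function names are hypothetical.

```python
def is_canonical(address: int) -> bool:
    """True if bits 63..47 of a 64-bit address are all equal (48-bit addressing assumed)."""
    top = (address & 0xFFFFFFFFFFFFFFFF) >> 47
    return top == 0 or top == 0x1FFFF


def pointer_from_header(low_dword: int, pool_tag: bytes) -> int:
    """Combine the first header dword with a 4-byte pool tag as a little-endian qword."""
    if len(pool_tag) != 4:
        raise ValueError("pool tag must be exactly 4 bytes")
    return (low_dword & 0xFFFFFFFF) | (int.from_bytes(pool_tag, "little") << 32)


if __name__ == "__main__":
    pointer = pointer_from_header(0x02220022, b"NtFs")  # hypothetical header dword
    print(hex(pointer), is_canonical(pointer))
```

Any tag made of printable ASCII characters sets some of bits 62..56 while leaving bit 63 clear, so the resulting value is non-canonical under this assumption.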

Conclusion

I hope this information was helpful. I apologize for not being able to fit everything you need into a couple of paragraphs. I wish you success in research and exploitation for the sake of improving overall security.

PS Remember that to eliminate the vulnerability you need to update not only the host but also all the guest systems!
PPS If you feel some discomfort from the translation of certain terms, be prepared to put up with it in the future, since this translation is the one recommended on the Microsoft language portal.


Links
[1] Tarjei Mandt. Kernel Pool Exploitation on Windows 7. Black Hat DC, 2011
[2] Nikita Tarakanov. Kernel Pool Overflow from Windows XP to Windows 8. ZeroNights, 2011
[3] Kostya Kortchinsky. Real world kernel pool exploitation. SyScan, 2008
[4] SoBeIt. How to exploit Windows kernel memory pool. X'con, 2005

Source: https://habr.com/ru/post/172719/
