Control Flow Guard: how it works and how it is bypassed, using Adobe Flash Player as an example

Microsoft keeps trying to win the endless war against exploit writers, introducing new application protection techniques again and again. This time the developers of Windows took a more fundamental approach and looked at the root of the problem. Almost every exploit is ultimately aimed at hijacking the application's flow of execution, so it makes sense to "teach" applications to watch for exactly that.
The concept of Control Flow Integrity was described as early as 2005. Now, 10 years later, the developers at Microsoft have presented their partial implementation of this concept: Control Flow Guard.

What is Control Flow Guard?

Control Flow Guard (Guard CF, CFG) is a relatively new Windows exploit mitigation aimed at making it harder to exploit binary vulnerabilities in user-mode applications and kernel-mode code. The mechanism validates indirect calls, which prevents an attacker from hijacking the flow of execution (for example, by overwriting a virtual function table). Combined with earlier protection mechanisms (SafeSEH, ASLR, DEP, etc.), it is one more headache for exploit writers.
This security feature is available to users of Microsoft Windows 8.1 (Update 3, KB3000850) and Windows 10.
Programs can be compiled with CFG support in Microsoft Visual Studio 2015 (the /guard:cf option).

A similar implementation of the protection mechanism based on the Control Flow Integrity concept for Linux operating systems is available in the PaX extension.

How does Control Flow Guard work?

Let's look at how CFG works in user mode. The mechanism has two main components: an address bitmap (managed by the kernel) and a procedure that checks the pointer of the function being called (used by user applications).
All CFG service information is written into the executable's IMAGE_LOAD_CONFIG_DIRECTORY at compile time:

GuardCFCheckFunctionPointer - pointer to the verification procedure
GuardCFFunctionTable - a table of valid addresses of functions (used by the kernel to initialize the bitmap)
GuardCFFunctionCount - the number of functions in the table
GuardFlags - flags

The IMAGE_DLLCHARACTERISTICS_GUARD_CF flag is also set in IMAGE_NT_HEADERS.OptionalHeader.DllCharacteristics, indicating that the executable supports the CFG mechanism.

All of this service information can be viewed with the dumpbin.exe tool from Microsoft Visual Studio 2015 (Microsoft Visual Studio 14.0\VC\bin\dumpbin.exe) by running it with the /loadconfig option.
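The DllCharacteristics flag can also be checked without dumpbin.exe. The following is a minimal sketch, not a full PE parser: the helper name has_guard_cf is made up for illustration, the flag value 0x4000 comes from winnt.h, and the code assumes a well-formed PE file whose optional header is long enough to contain the DllCharacteristics field.

```python
import struct

IMAGE_DLLCHARACTERISTICS_GUARD_CF = 0x4000  # value from winnt.h


def has_guard_cf(pe_bytes: bytes) -> bool:
    """Return True if the GUARD_CF bit is set in DllCharacteristics.

    Hypothetical helper for illustration; it does no validation beyond
    the two signatures.
    """
    if pe_bytes[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (file offset of the PE signature) is stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The signature (4 bytes) and IMAGE_FILE_HEADER (20 bytes) precede
    # the optional header.
    optional_header = e_lfanew + 4 + 20
    # DllCharacteristics is at offset 70 of the optional header in both
    # PE32 and PE32+ images.
    (dll_characteristics,) = struct.unpack_from(
        "<H", pe_bytes, optional_header + 70
    )
    return bool(dll_characteristics & IMAGE_DLLCHARACTERISTICS_GUARD_CF)
```

A typical call would be has_guard_cf(open(path, "rb").read()), where path points to an EXE or DLL.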

GuardFlags

The winnt.h header file for Windows 10 (1511) contains the following CFG flags (the last entry is a mask, not a flag):

IMAGE_GUARD_CF_INSTRUMENTED (0x00000100) - the module performs control flow integrity checks using system-supplied support
IMAGE_GUARD_CFW_INSTRUMENTED (0x00000200) - the module performs control flow and write integrity checks
IMAGE_GUARD_CF_FUNCTION_TABLE_PRESENT (0x00000400) - the module contains a table of valid control flow targets
IMAGE_GUARD_SECURITY_COOKIE_UNUSED (0x00000800) - the module does not use the /GS security cookie
IMAGE_GUARD_PROTECT_DELAYLOAD_IAT (0x00001000) - the module supports a read-only delay-load import table
IMAGE_GUARD_DELAYLOAD_IAT_IN_ITS_OWN_SECTION (0x00002000) - the delay-load import table is in its own .didat section
IMAGE_GUARD_CF_FUNCTION_TABLE_SIZE_MASK (0xF0000000) - the stride of a Guard CF function table entry is encoded in these bits (the additional number of bytes per entry)
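To make the flag layout concrete, here is a small sketch that splits a GuardFlags value into the flag names listed above and the extra per-entry byte count stored under the mask. The function name decode_guard_flags is invented for this example, and the shift of 28 is simply the position of the lowest bit of the 0xF0000000 mask; flags that are not in the list above are ignored.

```python
GUARD_FLAGS = {
    0x00000100: "IMAGE_GUARD_CF_INSTRUMENTED",
    0x00000200: "IMAGE_GUARD_CFW_INSTRUMENTED",
    0x00000400: "IMAGE_GUARD_CF_FUNCTION_TABLE_PRESENT",
    0x00000800: "IMAGE_GUARD_SECURITY_COOKIE_UNUSED",
    0x00001000: "IMAGE_GUARD_PROTECT_DELAYLOAD_IAT",
    0x00002000: "IMAGE_GUARD_DELAYLOAD_IAT_IN_ITS_OWN_SECTION",
}
IMAGE_GUARD_CF_FUNCTION_TABLE_SIZE_MASK = 0xF0000000
SIZE_SHIFT = 28  # position of the lowest bit covered by the mask


def decode_guard_flags(value):
    """Return (list of known flag names, extra bytes per table entry)."""
    names = [name for bit, name in sorted(GUARD_FLAGS.items()) if value & bit]
    extra_bytes = (value & IMAGE_GUARD_CF_FUNCTION_TABLE_SIZE_MASK) >> SIZE_SHIFT
    return names, extra_bytes


names, extra = decode_guard_flags(0x10000500)
print(names, extra)
```

For the sample value 0x10000500 the result names the instrumentation and function-table flags and reports one additional byte per table entry.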

Note that this list is incomplete. A more complete list can be extracted from the link.exe file (the linker).

Some interesting flags are also present for which there is no official documentation. Microsoft developers appear to be testing an additional CFG mechanism that validates write addresses (IMAGE_GUARD_CFW_INSTRUMENTED).

Bitmap

During OS boot, the kernel (the nt!MiInitializeCfg function) creates the nt!MiCfgBitMapSection section, which is shared by all processes. When a process that supports CFG starts, the bitmap is mapped into the address space of the process. The address and size of the bitmap are then stored in the ntdll!LdrSystemDllInitBlock structure.
The executable loader (the nt!MiParseImageCfgBits function) maps function addresses to bits in the bitmap. Each bit in the bitmap covers 8 bytes of the process's user address space. The bits that correspond to the start addresses of valid functions are set to 1; all other bits are 0.
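The one-bit-per-8-bytes rule fixes the size of the bitmap. As a rough worked example, assuming a 2 GiB user address space for a 32-bit process (an assumption made here, not a figure from the article):

```python
BYTES_PER_BIT = 8  # from the text: one bitmap bit per 8 bytes of address space


def bitmap_size_bytes(user_space_bytes: int) -> int:
    """Size of a bitmap that covers the given amount of address space."""
    bits = user_space_bytes // BYTES_PER_BIT
    return bits // 8  # 8 bits per byte of bitmap


size = bitmap_size_bytes(2 * 1024**3)  # assumed 2 GiB of user address space
print(size // 1024**2, "MiB")          # 32 MiB
```

Because the section is shared between processes, this is address space reserved per process rather than memory that each process pays for in full.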

Procedure for checking the pointer of the called function

At compile time, every indirect call in the program is wrapped in a check of the address of the called function. The address of the check procedure is set by the executable loader; initially it points to an empty procedure, which preserves backward compatibility.
For clarity, let's look at the same code compiled without CFG and with it.

Original C++ code:

    #include <iostream>

    class CSomeClass {
    public:
        virtual void doSomething() { std::cout << "hello"; }
    };

    int main() {
        CSomeClass *someClass = new CSomeClass();
        someClass->doSomething();
        return 0;
    }

ASM listing (excerpt):

    mov  eax, [ecx]          ; EAX = CSomeClass::vftable
    call dword ptr [eax]     ; [EAX] = CSomeClass::doSomething()

Compiled with the /guard:cf option:

    mov  eax, [edi]                      ; EAX = CSomeClass::vftable
    mov  esi, [eax]                      ; ESI = CSomeClass::doSomething()
    mov  ecx, esi
    call ds:___guard_check_icall_fptr    ; checks that ECX is valid function pointer
    mov  ecx, edi
    call esi

In the first case the code is open to an attack that substitutes the virtual function table. If an attacker can overwrite the object's data while exploiting a vulnerability, he can replace the virtual function table so that the call someClass->doSomething() executes attacker-controlled code, thereby hijacking the application's flow of execution.
With Control Flow Guard, the address of the called function is first checked against the bitmap. If the corresponding bit is zero, an exception is raised.

When this application runs on an OS that supports Guard CF, the executable loader builds the bitmap and redirects the address of the check procedure to the ntdll!LdrpValidateUserCallTarget function.
In Windows 10 (build 1511) this function is implemented as follows:

Let's walk through the algorithm of this function using the input address 0x0B3385B0 as an example.

0x0B3385B0 = 10110011001110000101 10110 000₂ (bitmap offset, bit number, low 3 bits)

The function receives the address to be checked in the ecx register. The address of the bitmap is loaded into the edx register. In my case, the bitmap is located at 0x01430000.

The upper three bytes (24 bits) of the address, the first group in the binary representation above, give the offset in the bitmap. In this case the offset is 0xB3385. The bitmap is addressed in units of 4 bytes (32 bits), so the required cell is found at bitmap base address + offset * 4. For this example we get 0x01430000 + 0xB3385 * 4 = 0x16FCE14. The value of this bitmap cell is loaded into edx.

With the target cell in hand, we need to determine the number of the bit of interest. It is the value of the next 5 bits of the address (the middle group above). Keep in mind, however, that if the address being checked is not aligned on a 16-byte boundary ((address & 0xf) != 0), the odd bit is used (bit number | 0x1). In this case the address is aligned, and the bit number is 10110₂ = 22₁₀.

All that remains is to check the value of the bit with a bit test. The bt instruction tests the bit of its first operand whose index is given by the low 5 bits (that is, modulo 32) of its second operand. If the bit is 1, the Carry Flag (CF) is set and the program continues execution as usual.

Otherwise, the ntdll!RtlpHandleInvalidUserCallTarget function is called and the program terminates with interrupt 0x29 and the parameter 0xA, which means nt!_KiRaiseSecurityCheckFailure(FAST_FAIL_GUARD_ICALL_CHECK_FAILURE).

Checking bit 22 confirms that the address of the function being called is valid.

The implementation of this algorithm in Python is as follows:

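    def calculate_bitmap_offset(addr):
        offset = (addr >> 8) * 4        # byte offset of the 32-bit cell in the bitmap
        bit = (addr >> 3) % 32          # bit number inside the cell
        aligned = (addr & 0xF) == 0     # is the address 16-byte aligned?
        if not aligned:
            bit = bit | 1               # unaligned addresses use the odd bit
        print("addr = 0x%08x, offset = 0x%x, bit index = %u, aligned? %s"
              % (addr, offset, bit, "yes" if aligned else "no"))

    calculate_bitmap_offset(0x0B3385B0)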

The result of the script:

 addr = 0x0b3385b0, offset = 0x2cce14, bit index = 22, aligned? yes
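The script above only computes where the bit lives. The check itself can be modelled with a toy bitmap; this is a sketch for experimenting with the indexing rules, not a reproduction of ntdll!LdrpValidateUserCallTarget. The ToyBitmap class and its method names are invented for the example, and the bitmap is stored sparsely in a dictionary rather than as a mapped section.

```python
from collections import defaultdict


def bit_position(addr):
    """Split an address into (32-bit cell index, bit number), as in the text."""
    cell = addr >> 8
    bit = (addr >> 3) & 31
    if addr & 0xF:      # not 16-byte aligned: the odd bit is used
        bit |= 1
    return cell, bit


class ToyBitmap:
    """Sparse stand-in for the CFG bitmap: cell index -> 32-bit value."""

    def __init__(self):
        self.cells = defaultdict(int)

    def mark_valid(self, addr):
        cell, bit = bit_position(addr)
        self.cells[cell] |= 1 << bit

    def is_valid(self, addr):
        cell, bit = bit_position(addr)
        return bool((self.cells.get(cell, 0) >> bit) & 1)


bitmap = ToyBitmap()
bitmap.mark_valid(0x0B3385B0)          # pretend a function starts here
print(bitmap.is_valid(0x0B3385B0))     # True
print(bitmap.is_valid(0x0B3385C0))     # False
```

The dictionary keeps the model small; a real bitmap is a flat array indexed by the same cell number.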

Exceptions

An attempt to call an invalid function does not always end with interrupt 0x29. The ntdll!RtlpHandleInvalidUserCallTarget function performs the following checks:

Is DEP enabled for the current process?
Does the target address have the required access rights (PAGE_EXECUTE_WRITECOPY | PAGE_EXECUTE_READWRITE | PAGE_EXECUTE_READ | PAGE_EXECUTE)?
Are suppressed calls allowed (ntdll!RtlGuardAllowSuppressedCalls)?
Is the target address "suppressed" (ntdll!RtlpGuardIsSuppressedAddress)?

The pseudocode of this function is as follows:

There is no official information about "suppressed" calls. We can only say that they require compiler support: the IMAGE_GUARD_CF_FUNCTION_TABLE_SIZE_MASK bits must be set in GuardFlags and the compiler must generate an extended table. The bits covered by this mask hold the additional size of each GuardCFFunctionTable entry. If a function address is "suppressed", the byte following the address must be equal to one.
Suppressed calls can be allowed, for example, through the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options, by setting the CFGOptions value for the required application to 1.

Weak points of Control Flow Guard

Like any other defense mechanism, CFG has some weak points:

Disabling DEP for a process makes the CFG checks meaningless, since ntdll!RtlpHandleInvalidUserCallTarget will then always allow execution of an invalid address.
The bitmap address is stored at a fixed location and can easily be computed from user mode. User code is not allowed to modify the bitmap, but exploit writers will sooner or later find a way around this restriction.
If the executable file is not compiled with CFG support, CFG support in the modules it loads becomes meaningless. The bitmap and the address of the check procedure are filled in only if the executable file supports CFG, so the checks inside the module code will be simple stubs.
CFG depends on compilation, so third-party modules and even old Microsoft modules are a weak spot in a CFG-protected executable. Because the bitmap is built from the compiler-generated table of valid function addresses, all code of a module without CFG support is marked in the bitmap as a valid target.

Nominally, one bit is responsible for every 8 bytes of address space, but in fact an even bit corresponds to a single aligned address, while the following odd bit covers 15 bytes of address space at once. You can verify this by running the Python script above in a loop and analyzing the result:

    addr = 0x08f38480, offset = 0x23ce10, bit index = 16, aligned? yes
    addr = 0x08f38481, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f38482, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f38483, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f38484, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f38485, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f38486, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f38487, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f38488, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f38489, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f3848a, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f3848b, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f3848c, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f3848d, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f3848e, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f3848f, offset = 0x23ce10, bit index = 17, aligned? no
    addr = 0x08f38490, offset = 0x23ce10, bit index = 18, aligned? yes
    addr = 0x08f38491, offset = 0x23ce10, bit index = 19, aligned? no
    addr = 0x08f38492, offset = 0x23ce10, bit index = 19, aligned? no
    addr = 0x08f38493, offset = 0x23ce10, bit index = 19, aligned? no
    addr = 0x08f38494, offset = 0x23ce10, bit index = 19, aligned? no
    ...

It follows that an attacker can call an untrusted function located in the immediate vicinity of a trusted one, provided that the latter is not aligned.
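The size of that window can be enumerated directly. The sketch below reuses the indexing rule from the script above; the helper names bit_index and accepted_neighbours are invented for the example, and the addresses are taken from the listing.

```python
def bit_index(addr):
    """Return (cell index, bit number) using the rule described in the text."""
    bit = (addr >> 3) % 32
    if addr & 0xF:          # unaligned addresses use the odd bit
        bit |= 1
    return addr >> 8, bit


def accepted_neighbours(func_addr):
    """Addresses in func_addr's 16-byte block that share its bitmap bit."""
    block = func_addr & ~0xF
    target = bit_index(func_addr)
    return [a for a in range(block, block + 16) if bit_index(a) == target]


# An unaligned valid function makes the other unaligned addresses
# of its 16-byte block valid call targets as well.
print(len(accepted_neighbours(0x08F38484)))   # 15
# An aligned function shares its bit with no other address.
print(len(accepted_neighbours(0x08F38480)))   # 1
```

This is why aligning every function on a 16-byte boundary closes the gap: the odd bit is then never set.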

Dynamically generated functions (for example, JIT functions) require special attention from developers, since checks for indirect calls have to be emitted at the code generation stage. In addition, the default behavior of ntdll!NtAllocateVirtualMemory and ntdll!NtProtectVirtualMemory is to set the bits in the Control Flow Guard bitmap to one for the entire memory region when the memory becomes executable (PAGE_EXECUTE_*).
CFG cannot prevent the flow of execution from being hijacked when an attacker modifies a function's return address.
Calls to library functions (for example, WinAPI) are valid from CFG's point of view, but the attacker still has to solve the problem of filling the stack and registers with the required arguments.

Bypassing Control Flow Guard using the example of Adobe Flash Player

Starting with Windows 8, the Adobe Flash Player plugin is integrated into Internet Explorer, and since Windows 8.1 (Update 3) it ships with CFG support. Several Control Flow Guard bypasses have been implemented in exploits for Adobe Flash Player, and some of them are still relevant today.

Bypass with dynamic code

Adobe Flash Player makes heavy use of JIT compilation, which avoids a resource-intensive operation such as interpreting ActionScript code. But, as mentioned earlier, dynamically generated functions demand extra attention from developers. The two bypasses described below are the result of developer oversights in how memory allocation is handled.

No indirect call checks in dynamic code

This method was proposed and implemented by the researcher Francisco Falcón of Core Security in his analysis of an exploit for the CVE-2015-0311 vulnerability. The original article describes the implementation of the bypass in considerable detail. The essence of the method is to modify the internal virtual function table of some ActionScript class. One of the methods of this class must then be called from the body of a dynamically generated function. The ByteArray class is well suited for this purpose.
The structure of the ByteArray object:

At offset $+8 there is a pointer to an object of class VTable:

The VTable class is the internal representation of a virtual function table (that is, not the one generated by C++) for ActionScript classes.
An object of this class contains pointers to objects of class MethodEnv:

This class describes an ActionScript method and contains a pointer to the function body in memory at offset $+4.

The VTable object contains the description of the ByteArray::toString() method at offset $+D4. With the ability to read and write arbitrary memory, an attacker can change the pointer to the function body (MethodEnv + 4) and easily hijack the application's flow of execution by calling ByteArray::toString().

This becomes possible due to the fact that the method of this class will be called from the JIT code:

The indirect call is made without first checking the target address, because this function was generated dynamically.

This CFG bypass was fixed with the release of Adobe Flash Player version 18.0.0.160 (KB3065820, June 2015). The fix is as follows: if a JIT function contains an indirect call, the JIT compiler inserts a call to the check procedure immediately before it.

Any address within the body of a dynamic function is valid

The previous bypass was possible because of a flaw in the function that makes the indirect call. This one is possible because of a flaw in the function that is called indirectly.
Researchers Yuri Drozdov and Lyudmila Drozdov of the Center of Vulnerability Research presented this CFG bypass method at the Defcon Russia conference (St. Petersburg, 2015). Their method is based on the fact that when executable memory is allocated, the kernel sets the bits in the CFG bitmap to one for the entire allocated region. Let's see what this behavior can lead to.

Suppose that there is a certain JIT-function at the address 0x69BC9080, the body of which contains the following code:

What exactly this function does is of no interest to us; just note the two bytes FF D0 inside the instruction at 0x69BC90F0. What happens if the start of the function suddenly moves into the middle of this instruction? This:

FF D0 is nothing other than call eax! A seemingly harmless function has turned into an excellent target for an attacker: an indirect call with no Control Flow Guard check. Only two questions remain: how to obtain the desired byte sequence, and how to get the required address into the register.
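Byte pairs such as FF D0 can be located mechanically. The following sketch scans a buffer at every byte offset for the two-byte forms of the 32-bit x86 indirect call (opcode FF with /2 in the ModRM byte). It is not a disassembler and does not come from the original research: the function name is invented, and longer encodings with a SIB byte or a displacement are deliberately skipped.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def find_short_indirect_calls(code: bytes):
    """Return (offset, mnemonic) for two-byte FF /2 encodings in code."""
    hits = []
    for i in range(len(code) - 1):
        if code[i] != 0xFF:
            continue
        modrm = code[i + 1]
        mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
        if reg != 2:            # /2 selects CALL in the FF opcode group
            continue
        if mod == 3:            # register operand, e.g. FF D0
            hits.append((i, "call " + REGS[rm]))
        elif mod == 0 and rm not in (4, 5):   # 4 = SIB follows, 5 = disp32
            hits.append((i, "call [" + REGS[rm] + "]"))
    return hits


print(find_short_indirect_calls(bytes.fromhex("8bffd090")))
# [(1, 'call eax')]
```

Scanning at every offset rather than only at instruction boundaries is the point: the attacker chooses where execution starts.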

The required sequence can be produced simply by experimenting with ActionScript code. One only has to bear in mind that Nanojit (the AVM JIT compiler) obfuscates the generated code, so there is no easy way. Let's see what Nanojit turns this function into:

    public static function useless_func():uint {
        return 0xD5EC;
    }

Result:

Not at all what we expected. By trial and error you can arrive at, for example, this version of the code:

    public static function useless_func():void {
        useless_func2(0x11, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26);
    }

    public static function useless_func2(arg1:uint, arg2:uint, arg3:uint, a, b, c, d, e, f, g, h, i, j, k, l, m, n, p, q, r, s, t, u, v, w, x, y, z):void {
    }

The body of the first function will contain the following instructions:

The FF 11 bytes of interest form the instruction call [ecx]:

We now have an indirect call; next, a controlled address has to be placed in ecx. Let's find out what this register holds when useless_func() is called.

At the time of the call, ecx holds an object of class MethodEnv. The first DWORD of this class is a pointer to a virtual function table (the one generated by the C++ compiler). This table is not used when useless_func() is called, so nothing prevents the attacker from replacing the pointer with one of their own just before calling the method.
The implementation:

    var class_addr:uint = read_addr(UselessClass);
    var vtable:uint = read_dword(class_addr + 8);
    var methodenv:uint = read_dword(vtable + 0x54);     // $+54 = useless_func
    var func_ptr:uint = read_dword(methodenv + 4);
    write_dword(methodenv + 4, func_ptr + offset_to_call_ecx);
    write_dword(methodenv, rop_gadget);                 // ecx <- pointer to rop gadgets
    UselessClass.useless_func();                        // call [ecx]

, , , ROP -.

This CFG bypass was fixed with the release of Adobe Flash Player version 18.0.0.194 (KB3074219, 2015).
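The fix relies on a new flag, PAGE_TARGETS_INVALID/PAGE_TARGETS_NO_UPDATE (0x40000000), for the VirtualAlloc and VirtualProtect functions, and on a new WinAPI function, SetProcessValidCallTargets.
With VirtualAlloc the flag is named PAGE_TARGETS_INVALID and marks every address in the allocated pages as an invalid call target; with VirtualProtect it is named PAGE_TARGETS_NO_UPDATE and leaves the CFG information of the pages unchanged when their protection changes.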
The AVMPI_makeCodeMemoryExecutable function in the AVM source code (AVMPI/MMgcPortWin.cpp):

SetProcessValidCallTargets is called in AVMPI_makeTargetValid (AVMPI/MMgcPortWin.cpp):

, CFG :

VirtualAlloc(PAGE_READWRITE)
VirtualProtect(PAGE_EXECUTE_READ | PAGE_TARGETS_NO_UPDATE )
SetProcessValidCallTargets()

, , .

WinAPI

Control Flow Guard , . WinAPI, , . , (shellcode) ROP-. WinAPI kernel32!WinExec .

Yuki Chen Qihoo 360 Vulcan Team SyScan (, 2015) , Internet Explorer 11 . BlackHat (, 2015) Francisco Falcón Adobe Flash Player.
Francisco Falcón toString() Vector , , .
, WinExec . , , 2 : LPCSTR lpCmdLine UINT uCmdShow .

lpCmdLine — , ( ).
uCmdShow — .

3 . . , 0 = SW_HIDE ( ). MethodEnv .

, 4 , ActionScript- . 4 , WinExec .
, 4 . , , cmd\0 ( Windows). , , , .

    var class_addr:uint = read_addr(UselessClass);
    var vtable:uint = read_dword(class_addr + 8);
    var methodenv:uint = read_dword(vtable + 0x50);     // $+50 = useless_func
    var winexec:uint = get_proc_addr("kernel32.dll", "WinExec");
    write_dword(methodenv + 4, winexec);                // useless_func() --> WinExec()
    write_dword(methodenv, 0x00646d63);                 // '\0', 'd', 'm', 'c'
    UselessClass.useless_func();
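The constant 0x00646d63 written to the start of the MethodEnv object in the listing above is the string cmd with its terminating zero, stored as a little-endian DWORD. A quick Python check of that encoding follows; the helper dword_from_ascii is invented here for illustration and only handles strings of up to three characters.

```python
import struct


def dword_from_ascii(text: str) -> int:
    """Pack up to 3 ASCII characters plus a NUL into a little-endian DWORD."""
    raw = text.encode("ascii").ljust(4, b"\0")
    if len(raw) != 4 or raw[3] != 0:
        raise ValueError("need at most 3 characters to leave room for the NUL")
    return struct.unpack("<I", raw)[0]


print(hex(dword_from_ascii("cmd")))        # 0x646d63
print(struct.pack("<I", 0x00646D63))       # b'cmd\x00'
```

The four-byte limit is what restricts the command line to something as short as cmd.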

WinAPI Flash- . , , Flash Exploiter Metasploit.
, :

, , , .

Flash- (payload) HackingTeam. . WinAPI kernel32!VirtualProtect , , , Control Flow Guard.
apply() Function ( core/FunctionClass.cpp )

core->exec->apply(get_callEnv(), thisArg, (ArrayObject*)AvmCore::atomToScriptObject(argArray)); , , ActionScript.

, GitHub . 64- Flash Metasploit .

Control Flow Guard

CFG Adobe Flash Player. Flash, , Control Flow Guard Internet Explorer 11.

Zhang Yunhai @ Black Hat 2015
read-only ___guard_check_icall_fptr CustomHeap::Heap Jscript9 .
Yuki Chen @ SyScan 2015
WinAPI kernel32!LoadLibraryA
Rafal Wojtczuk & Jared DeMott @ DerbyCon 2015 (video) , Bromium Labs
, — " " (stack desync). , Control Flow Guard . (calling convention).

Conclusion

, Control Flow Guard Windows. Microsoft , , Control Flow Integrity, , . , Microsoft .
, CFG.

. Intel, , , ROP- — CET (Control-flow Enforcement Technology) ( ). , CET Control Flow Guard.

Sources

Jack Tang, Trend Micro Threat Solution Team. Exploring Control Flow Guard in Windows 10.
mj0011, Qihoo 360 Vulcan Team. Windows 10 Control Flow Guard Internals.
Source code for the ActionScript virtual machine, GitHub.
Francisco Falcón, Core Security. Exploiting Adobe Flash Player in the era of Control Flow Guard.

Source: https://habr.com/ru/post/305960/
