We continue the series of articles on exploit protection mechanisms for browsers:
Let's take a look under the hood of the Chrome browser and find out what is in its anti-exploit arsenal.
Chrome uses the common mechanisms provided by modern versions of Windows and compilers:
We consider the internal mechanisms of the browser:
The browser provides an API, the DOM, for managing the objects that make up a document (a web page). The content of the document is represented as a tree in which each node is an element, attribute, text, graphic, or any other object. These nodes can be created, destroyed, and modified from JavaScript. The existence of many interdependent complex objects is a prerequisite for bugs, and the convenience of managing these objects from JavaScript provides a way to exploit those bugs.
Bugs that corrupt objects on the heap are very often the starting point for the entire sequence of actions that a browser exploit performs. The resulting vulnerabilities can be divided into two broad categories:
Let us consider a number of mechanisms, implemented in the allocators (heap managers), that prevent a vulnerable object from being used to deliberately affect other objects in memory.
Blink, Chrome's rendering engine, uses two allocators of its own: PartitionAlloc and Oilpan (aka BlinkGC). There are also two separate, rarer cases:
PartitionAlloc is used for objects that are not meant to be garbage-collected automatically. This is its key difference from Oilpan, which is discussed later.
The design of PartitionAlloc includes features related to the security of objects in memory:
This design can be illustrated with the Pinkie Pie exploit demonstrated at Mobile Pwn2Own 2013. The author used an integer overflow in the constructor of a typed array. The conditions were as follows: the buffer of a Float64 array is allocated, but the sequential initialization of its elements runs past the end of the buffer, so arbitrary Float64 values can be written indefinitely, every 8 bytes, into the memory that follows the allocation. An array of suitable length needs a large buffer, and PartitionAlloc delegated such allocations to the system allocator, which is dlmalloc on Android. Pinkie Pie overwrote the header of the next heap allocation to change its size and then freed the object stored there, which added a block of the desired size to the freelist. The next object of that size was then allocated at this location, where arbitrary values could still be written.
This shows why the developers protect the allocator's metadata, why large allocations are framed by guard pages that prevent running past the end of a buffer in this way, and why the buffers of typed arrays are isolated from other objects.
Oilpan offers automatic garbage collection. This system relieves developers of manual memory management, which is a cause of use-after-free errors. Let us briefly recall the essence of the vulnerabilities caused by such errors: an object is released prematurely, that is, while it can still be used afterwards.
As an example, consider a bug found on the project's bug tracker (https://bugs.chromium.org/p/chromium/issues/detail?id=69965). This is a UaF bug in the Geolocation class. Objects of the Geolocation class were destroyed when the page was reloaded, but the associated geolocation permission requests had not been canceled first. This left dangling pointers in the request manager, and a later attempt to cancel the requests when the tab was closed accessed the freed Geolocation object. The patch for this bug added the pageDestroyed method to the Geolocation class, apparently to enforce the correct order of releasing page objects. Since then the Geolocation class has changed with the introduction of Oilpan and is now managed automatically by that system.
Exploiting such bugs involves the following stages: meeting the conditions under which the vulnerable object is freed; placing attacker-controlled data in the memory freed by this object, thus creating a "fake object"; and meeting the conditions under which the fields of this fake object are used as if they were the original object's members. To prevent the second stage, building a fake object in the freed memory, the Chrome developers isolate the memory regions in which objects of different types live. Let's see how this is done:
    // Override operator new to allocate Node subtype objects onto
    // a dedicated heap.
    GC_PLUGIN_IGNORE("crbug.com/443854")
    void* operator new(size_t size) { return allocateObject(size, false); }

    static void* allocateObject(size_t size, bool isEager) {
      ThreadState* state =
          ThreadStateFor<ThreadingTrait<Node>::Affinity>::state();
      const char typeName[] = "blink::Node";
      return ThreadHeap::allocateOnArenaIndex(
          state, size,
          isEager ? BlinkGC::EagerSweepArenaIndex : BlinkGC::NodeArenaIndex,
          GCInfoTrait<EventTarget>::index(), typeName);
    }

chromium//src/third_party/WebKit/Source/core/dom/Node.h
The overridden operator new calls allocateObject with isEager == false, so ThreadHeap::allocateOnArenaIndex receives BlinkGC::NodeArenaIndex as its third argument, arenaIndex. This is the index of the "arena" (memory region) in which the object will be allocated:
    inline Address ThreadHeap::allocateOnArenaIndex(ThreadState* state,
                                                    size_t size,
                                                    int arenaIndex,
                                                    size_t gcInfoIndex,
                                                    const char* typeName) {
      ASSERT(state->isAllocationAllowed());
      ASSERT(arenaIndex != BlinkGC::LargeObjectArenaIndex);
      NormalPageArena* arena =
          static_cast<NormalPageArena*>(state->arena(arenaIndex));
      Address address =
          arena->allocateObject(allocationSizeFromSize(size), gcInfoIndex);
      HeapAllocHooks::allocationHookIfEnabled(address, size, typeName);
      return address;
    }

chromium//src/third_party/WebKit/Source/platform/heap/Heap.h
What other regions are defined?
    enum HeapIndices {
      EagerSweepArenaIndex = 0,
      NormalPage1ArenaIndex,
      NormalPage2ArenaIndex,
      NormalPage3ArenaIndex,
      NormalPage4ArenaIndex,
      Vector1ArenaIndex,
      Vector2ArenaIndex,
      Vector3ArenaIndex,
      Vector4ArenaIndex,
      InlineVectorArenaIndex,
      HashTableArenaIndex,
      FOR_EACH_TYPED_ARENA(TypedArenaEnumName)
      LargeObjectArenaIndex,
      // Values used for iteration of heap segments.
      NumberOfArenas,
    };

* * *

    // List of typed arenas. The list is used to generate the implementation
    // of typed arena related methods.
    //
    // To create a new typed arena add a H(<ClassName>) to the
    // FOR_EACH_TYPED_ARENA macro below.
    #define FOR_EACH_TYPED_ARENA(H) \
      H(Node)                       \
      H(CSSValue)

    #define TypedArenaEnumName(Type) Type##ArenaIndex,

chromium//src/third_party/WebKit/Source/platform/heap/BlinkGC.h
Here we see that objects of the Node and CSSValue classes, hash tables, and vectors are each placed in separate memory regions; the remaining objects are distributed among regions by size.
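To make the effect of per-type arenas concrete, here is a minimal sketch. It is not Blink's implementation: `ToyHeap`, `ArenaIndex`, `allocate`, `release` and `slotIsReused` are hypothetical names, and the fixed slot size is an assumption made for brevity. The only idea borrowed from the excerpts above is that every allocation names an arena and that freed memory is reused only within that arena.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical arena indices, loosely modeled on the HeapIndices enum.
enum ArenaIndex { NodeArena = 0, CSSValueArena, NormalArena, NumArenas };

// Toy heap: each arena keeps its own free list of fixed-size slots, so a slot
// freed in one arena is never handed out for a request in another arena.
class ToyHeap {
 public:
  static constexpr std::size_t kSlotSize = 64;  // Assumed slot size.

  ~ToyHeap() {
    for (void* slot : all_slots_) std::free(slot);
  }

  void* allocate(ArenaIndex arena) {
    std::vector<void*>& free_list = free_lists_[arena];
    if (!free_list.empty()) {
      void* slot = free_list.back();  // Reuse a slot freed in the same arena.
      free_list.pop_back();
      return slot;
    }
    void* slot = std::malloc(kSlotSize);
    all_slots_.push_back(slot);
    return slot;
  }

  void release(ArenaIndex arena, void* slot) {
    free_lists_[arena].push_back(slot);
  }

 private:
  std::vector<void*> free_lists_[NumArenas];
  std::vector<void*> all_slots_;  // Owned memory, released in the destructor.
};

// Returns true if a slot freed from `freed_from` is handed out again for the
// next allocation made in `requested`.
bool slotIsReused(ArenaIndex freed_from, ArenaIndex requested) {
  ToyHeap heap;
  void* victim = heap.allocate(freed_from);
  heap.release(freed_from, victim);
  return heap.allocate(requested) == victim;
}
```

In this model `slotIsReused(NodeArena, NodeArena)` is true while `slotIsReused(NodeArena, NormalArena)` is false: an attacker who frees a Node can only get another Node-arena object placed in the hole, not an arbitrary buffer with controlled contents.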
Let us turn to the key property of Oilpan / BlinkGC: automatic garbage collection. Objects that should be managed by this system inherit from the GarbageCollected, GarbageCollectedFinalized, or GarbageCollectedMixin template class. Members of these classes that point to objects on the managed heap are represented by the Member or WeakMember template classes, depending on the semantics required.
The garbage collector uses the mark-and-sweep algorithm, which consists of two main steps:
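The two steps, marking everything reachable from the roots and then sweeping whatever was left unmarked, can be sketched on a toy object graph. This is a generic illustration of mark-and-sweep, not Oilpan's code; `ToyObject`, `mark`, `sweep` and `collect` are hypothetical names, and objects refer to each other by index rather than by pointer to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object model: each object lists the indices of the objects it
// references (the role played by Member<T> fields in real code).
struct ToyObject {
  std::vector<std::size_t> refs;
  bool marked = false;
  bool alive = true;
};

// Marking step: flag every object reachable from `index`.
void mark(std::vector<ToyObject>& heap, std::size_t index) {
  std::vector<std::size_t> worklist{index};
  while (!worklist.empty()) {
    std::size_t current = worklist.back();
    worklist.pop_back();
    if (heap[current].marked) continue;  // Already visited; handles cycles.
    heap[current].marked = true;
    for (std::size_t ref : heap[current].refs) worklist.push_back(ref);
  }
}

// Sweeping step: reclaim every live object that was not marked and reset the
// mark bits for the next cycle. Returns the number of reclaimed objects.
std::size_t sweep(std::vector<ToyObject>& heap) {
  std::size_t reclaimed = 0;
  for (ToyObject& object : heap) {
    if (object.alive && !object.marked) {
      object.alive = false;
      object.refs.clear();
      ++reclaimed;
    }
    object.marked = false;
  }
  return reclaimed;
}

// One full collection: mark from all roots, then sweep.
std::size_t collect(std::vector<ToyObject>& heap,
                    const std::vector<std::size_t>& roots) {
  for (std::size_t root : roots) mark(heap, root);
  return sweep(heap);
}
```

Because an object is reclaimed only when nothing reachable refers to it, the premature release that underlies use-after-free bugs does not arise for references the collector knows about. Note that an unreachable cycle (two objects referring only to each other) is still collected, which plain reference counting would not do.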
If the instructions emitted when dynamically generated (JIT) code is assembled were fully predictable, an attacker would gain a powerful primitive for creating shellcode in executable memory. To avoid this, a series of countermeasures was introduced:
NOPs
NOPs (instructions that do not change the state of the environment and whose only purpose is to take up space) of various sizes, from one to eight bytes, are randomly inserted into the generated code. They eliminate the possibility of predictable constant byte sequences appearing in the assembled code.
    void Assembler::Nop(int n) {
      // The recommended multi-byte sequences of NOP instructions from the Intel 64
      // and IA-32 Architectures Software Developer's Manual.
      //
      // Length   Assembly                                Byte Sequence
      // 2 bytes  66 NOP                                  66 90H
      // 3 bytes  NOP DWORD ptr [EAX]                     0F 1F 00H
      // 4 bytes  NOP DWORD ptr [EAX + 00H]               0F 1F 40 00H
      // 5 bytes  NOP DWORD ptr [EAX + EAX*1 + 00H]       0F 1F 44 00 00H
      // 6 bytes  66 NOP DWORD ptr [EAX + EAX*1 + 00H]    66 0F 1F 44 00 00H
      // 7 bytes  NOP DWORD ptr [EAX + 00000000H]         0F 1F 80 00 00 00 00H
      // 8 bytes  NOP DWORD ptr [EAX + EAX*1 + 00000000H] 0F 1F 84 00 00 00 00 00H
      // 9 bytes  66 NOP DWORD ptr [EAX + EAX*1 +         66 0F 1F 84 00 00 00 00
      //          00000000H]                              00H
      ...
    }
Constant folding
Arithmetic expressions on constants are evaluated (folded) when the code is compiled:
    <script>
    x = 0x123 + 0x567; // == 0x68A
    </script>
mov rax,68A00000000h
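Note that the immediate in the listing is not 0x68A itself but 0x68A shifted left by 32 bits. This matches how 64-bit V8 builds of that era represented small integers (Smis): the 32-bit payload sits in the upper half of the word and the lower half holds zero tag bits. The sketch below reproduces that encoding under this assumption; `toSmiBits` and `fromSmiBits` are hypothetical helper names, not V8 functions.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: Smi payload occupies the upper 32 bits of a 64-bit word.
constexpr int kSmiShift = 32;

// Encode a 32-bit integer the way it appears as an immediate in the JIT code.
constexpr std::uint64_t toSmiBits(std::int32_t value) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(value))
         << kSmiShift;
}

// Decode the immediate back into the integer it represents.
constexpr std::int32_t fromSmiBits(std::uint64_t bits) {
  return static_cast<std::int32_t>(
      static_cast<std::uint32_t>(bits >> kSmiShift));
}
```

With this encoding, `toSmiBits(0x123 + 0x567)` yields `0x68A00000000`, the value loaded into `rax` above.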
Constant blinding
Only values up to two bytes are stored unchanged in the code. For example:
<script> a = 0x1234; </script>
will be compiled into:
... mov rax,123400000000h ...
Larger constants are XORed with a random number (jit_cookie):
    void MacroAssembler::SafeMove(Register dst, Smi* src) {
      ...
      if (IsUnsafeInt(src->value()) && jit_cookie() != 0) {
        if (SmiValuesAre32Bits()) {
          // JIT cookie can be converted to Smi.
          Move(dst, Smi::FromInt(src->value() ^ jit_cookie()));
          Move(kScratchRegister, Smi::FromInt(jit_cookie()));
          xorp(dst, kScratchRegister);
        } else {
          DCHECK(SmiValuesAre31Bits());
          int32_t value = static_cast<int32_t>(reinterpret_cast<intptr_t>(src));
          movp(dst, Immediate(value ^ jit_cookie()));
          xorp(dst, Immediate(jit_cookie()));
        }
      } else {
        Move(dst, src);
      }
    }
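The idea behind SafeMove can be reduced to a few lines. In the sketch below, `BlindedMove`, `needsBlinding`, `blind` and `execute` are hypothetical names, and the two-byte threshold is an assumption taken from the statement above rather than from V8's IsUnsafeInt. The point is that the attacker-chosen constant never appears in the instruction stream: only `value ^ cookie` and `cookie` do, and the original value is restored at run time.

```cpp
#include <cassert>
#include <cstdint>

// The two immediates that end up in the generated code.
struct BlindedMove {
  std::uint32_t first_immediate;   // value ^ cookie
  std::uint32_t second_immediate;  // cookie
};

// Assumed threshold: constants that fit in two bytes are emitted unchanged.
constexpr bool needsBlinding(std::uint32_t value) { return value > 0xFFFFu; }

// "Compile time": split the constant into two immediates.
BlindedMove blind(std::uint32_t value, std::uint32_t cookie) {
  return BlindedMove{value ^ cookie, cookie};
}

// "Run time": the mov + xor pair reconstructs the original constant.
std::uint32_t execute(const BlindedMove& move) {
  std::uint32_t dst = move.first_immediate;  // mov dst, value ^ cookie
  dst ^= move.second_immediate;              // xor dst, cookie
  return dst;
}
```

Since the cookie is chosen randomly per process, an attacker who supplies a constant such as 0x90909090 from JavaScript cannot predict which bytes will actually be written to executable memory.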
Guard pages
The buffer containing the compiled JIT code is framed by PAGE_NOACCESS pages so that a heap overflow in a neighboring allocation cannot overwrite it.
JIT Page Randomization
The memory location where the generated JIT code is placed is often (but not always) randomized. If a free address is not guessed within three attempts, Chrome lets the system allocator choose the address for the buffer.
    static void* RandomizedVirtualAlloc(size_t size, int action, int protection) {
      ...
      if (use_aslr &&
          (protection == PAGE_EXECUTE_READWRITE || protection == PAGE_NOACCESS)) {
        // For executable pages try and randomize the allocation address
        for (size_t attempts = 0; base == NULL && attempts < 3; ++attempts) {
          base = VirtualAlloc(OS::GetRandomMmapAddr(), size, action, protection);
        ...
    }

    void* OS::GetRandomMmapAddr() {
      ...
      static const uintptr_t kAllocationRandomAddressMin = 0x0000000080000000;
      static const uintptr_t kAllocationRandomAddressMax = 0x000003FFFFFF0000;
      ...
      uintptr_t address;
      platform_random_number_generator.Pointer()->NextBytes(&address,
                                                            sizeof(address));
      address <<= kPageSizeBits;
      address += kAllocationRandomAddressMin;
      address &= kAllocationRandomAddressMax;
      return reinterpret_cast<void *>(address);
    }
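The address arithmetic in GetRandomMmapAddr can be tried in isolation. The sketch below copies the two constants from the excerpt; everything else is an assumption: `randomMmapAddrCandidate` is a hypothetical name, `kPageSizeBits = 12` (4 KiB pages) is not shown in the excerpt, `std::mt19937_64` merely stands in for the platform random number generator and is not suitable for security purposes, and `std::uint64_t` replaces `uintptr_t` so that the code also builds on 32-bit targets.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

constexpr std::uint64_t kAllocationRandomAddressMin = 0x0000000080000000ULL;
constexpr std::uint64_t kAllocationRandomAddressMax = 0x000003FFFFFF0000ULL;
constexpr int kPageSizeBits = 12;  // Assumption: 4 KiB pages.

// Produces one candidate address using the same shift, add and mask sequence
// as the excerpt above.
std::uint64_t randomMmapAddrCandidate(std::mt19937_64& rng) {
  std::uint64_t address = rng();           // Stand-in for NextBytes().
  address <<= kPageSizeBits;               // Drop the in-page offset bits.
  address += kAllocationRandomAddressMin;  // Bias away from low addresses.
  address &= kAllocationRandomAddressMax;  // Clamp and align via the mask.
  return address;
}
```

Because the final step is a mask, every candidate has its low 16 bits cleared (64 KiB alignment, the Windows allocation granularity) and has no bits set above the mask. The candidate is only a hint: as the loop in RandomizedVirtualAlloc shows, VirtualAlloc may fail at that address and the attempt is then repeated.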
Chrome implements a multi-process architecture that allows different privileges and restrictions to be assigned to different parts of the browser. The unit on which the sandbox operates is a process. The minimal configuration of the Chrome sandbox includes two processes: a privileged one, called the broker, and one or more isolated ones. Renderers, for example, run as isolated processes; they are instances of the Blink engine that render HTML documents. Renderers are launched for web page tabs and for browser extensions. The risk of a renderer being compromised is high, because it interprets heterogeneous code downloaded from whatever sources the user visits. Besides renderers, separate processes host plug-ins (Flash) and auxiliary components such as the crash reporter and the GPU process for graphics acceleration. Renderers and the other isolated processes use IPC (inter-process communication) to request access to resources from the broker: they delegate the corresponding API calls to the broker over IPC, the broker checks each delegated call against the policy specified for that isolated process, executes the allowed calls, and returns the result through the same IPC mechanism.
Chrome sandbox model, source
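The broker's role in this model can be sketched as a policy lookup that precedes any privileged action. The code below is a toy model and not Chrome's sandbox API: `ToyBroker`, `Verdict`, `allow` and `handleOpenFile` are hypothetical names, isolated processes are identified by a plain integer, and the policy is an exact-match list of paths, whereas a real policy engine is far richer.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

enum class Verdict { Allowed, Denied };

// Toy broker: holds a per-process policy and checks each delegated request
// against it before acting on the caller's behalf.
class ToyBroker {
 public:
  // Adds `path` to the set of files the given isolated process may open.
  void allow(int target_id, const std::string& path) {
    policy_[target_id].insert(path);
  }

  // Models the handling of one delegated "open file" call received over IPC.
  Verdict handleOpenFile(int target_id, const std::string& path) const {
    auto entry = policy_.find(target_id);
    if (entry == policy_.end()) return Verdict::Denied;  // No policy: deny.
    return entry->second.count(path) != 0 ? Verdict::Allowed
                                          : Verdict::Denied;
  }

 private:
  std::map<int, std::set<std::string>> policy_;
};
```

The default is denial: a process with no policy entry, or a request that the policy does not list, gets nothing. Only for an allowed request would a real broker perform the operation and pass the result back.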
Windows tools on which Chrome process isolation is based:
It is worth noting once again that isolated processes interact with the privileged broker, which means that a sandbox escape can be achieved not only through weaknesses in the system mechanisms listed above but also by exploiting a vulnerability in the broker that is reachable over IPC. This approach was demonstrated by Pinkie Pie at Mobile Pwn2Own 2013 in conjunction with the RCE that we discussed earlier in this article (see Part II).
The access token contains SIDs, the identifiers of access subjects: users and groups. Isolated processes are assigned a token containing the NULL SID (S-1-0-0), for which the system is unlikely to have any object whose ACL grants access.
How does such a process obtain a file handle? Ordinary hooks are installed on the API functions (here, ZwCreateFile); the call is redirected through the sandbox modules to the broker, and the broker opens the file and duplicates the handle back into the isolated process.
The Job object imposes some special restrictions on resources that are not managed by ACLs. It prohibits creating child processes, reading or writing the clipboard, and so on.
For isolated processes, Chrome creates a separate desktop object to prevent them from interacting with other processes by sending messages to their windows.
Why is this interaction dangerous? It is an old weakness of the Windows architecture that was used to carry out the so-called Shatter Attack. Before Vista, window messages were anonymous and could be sent to any process. A particularly attractive opportunity was offered by the WM_TIMER message, which could carry the address of a function to which the target process would transfer control without any cooperation on its part.
In Vista and later versions, message passing between processes is restricted based on their Integrity level (trust level) by User Interface Privilege Isolation (UIPI). Less privileged processes can no longer send messages to more privileged ones.
We covered the Windows access control mechanisms in the previous article.
A set of new Windows security features that can be enabled per process partially overlaps the capabilities of EMET (Enhanced Mitigation Experience Toolkit). It includes features such as blocking font loading (fonts are parsed in the Windows kernel), blocking the loading of modules into the process, and blocking process creation.
The ban on creating processes overlaps with what the Job object already does for isolated Chrome processes, but the Job object had one amusing gap. The workaround was to call the AllocConsole API, which creates a console window for the program and causes the host process conhost.exe to be launched for it. Read more about these policies and their weaknesses in the presentation by the researcher James Forshaw.
We will consider this policy separately.
The Windows graphics subsystem has been a source of LPE vulnerabilities for many years. In browser attacks, they are used after RCE: having gained code execution in the renderer process, the exploit elevates privileges through a vulnerability in a Windows component and thereby gains full access to the system. A well-documented illustration is the exploit for a win32k kernel pool corruption vulnerability that researchers at MWR Labs demonstrated at Pwn2Own 2013 in conjunction with an RCE in Chrome (article, presentation).
The vulnerability was found in the handler of the system call used to pass messages between windows. Its last parameter, bAnsi, determines the encoding of the message text that is copied from the calling process into kernel memory: WCHAR or ASCII, that is, 2 bytes or 1 byte per character. This parameter was interpreted differently when the buffer was allocated in the kernel pool and when the message was copied into it: first as a bool, then as a bitmask. This made it possible to overflow the buffer by copying twice as many bytes into it. By manipulating data in the kernel in this way, the researchers achieved execution of shellcode in ring 0; the shellcode reset the ACL of the privileged process winlogon.exe, leaving it open to trivial code injection. Profit!
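The mismatch between the two interpretations of the flag can be modeled in a few lines. This is a hypothetical reconstruction for illustration only, not the win32k code: the function names are invented, and the choice of bit 0 as the bit tested by the copy path is an assumption. It shows how a flag value that is "true" as a boolean but has the tested bit clear makes the copy twice as large as the allocation.

```cpp
#include <cassert>
#include <cstddef>

// Allocation path: treats the flag as a boolean (any non-zero value means
// ANSI, 1 byte per character; zero means WCHAR, 2 bytes per character).
std::size_t bytesAllocated(std::size_t chars, unsigned flag) {
  bool ansi = flag != 0;
  return ansi ? chars : chars * 2;
}

// Copy path: treats the flag as a bitmask and tests a single bit
// (assumed here to be bit 0).
std::size_t bytesCopied(std::size_t chars, unsigned flag) {
  bool ansi = (flag & 1u) != 0;
  return ansi ? chars : chars * 2;
}

// True when the copy would run past the end of the allocated buffer.
bool overflows(std::size_t chars, unsigned flag) {
  return bytesCopied(chars, flag) > bytesAllocated(chars, flag);
}
```

For the flag values 0 and 1 both paths agree. For a value such as 2, the allocation is sized for 1 byte per character while the copy writes 2 bytes per character, which is the twofold overflow described above.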
Win32k problem
Developing this seemingly simple countermeasure took a lot of time and effort, because it required not only changes to the Chrome code itself but also coordination with the Adobe Flash Player and Pdfium development teams (lockdown is needed not only for renderer processes but also for the PPAPI processes in which plugins run). Google engineers added their own broker to the stack through which Flash interacts with win32k. Currently, a full-fledged lockdown implementation exists only for Windows 10, since that operating system itself provides system call filtering. The document describing the problems and solutions of this mitigation is highly recommended reading.
Chrome's strength is, of course, the sandbox. It offers a wide range of ways to restrict privileges in order to mitigate the effects of exploiting vulnerabilities in the renderer code base. The set of these methods depends on what the operating system offers, and newer versions of Windows have added many interesting capabilities. In addition, much attention is paid to dynamic memory management, which stays in the background when new browser features are built for the modern web but is of paramount importance from a security point of view. The developers have implemented an advanced garbage collection system and given the environment in which the browser components run new properties that are not typical of ordinary C++ applications.
Source: https://habr.com/ru/post/319234/