
Fast security-oriented fuzzing with AFL

Many developers have heard of static code analysis, and some use it successfully in their development process: it is an effective, relatively fast, and often convenient way to control code quality.
If you already use static analysis, it may be worth trying dynamic analysis at the testing stage as well. Plenty has been written about the differences between the two techniques, so just a reminder: static analysis is performed without executing the code (for example, at compile time), while dynamic analysis is performed during execution. When compiled code is analyzed from a security point of view, dynamic analysis often means fuzzing. The advantage of fuzzing is the almost complete absence of false positives, which are quite common with static analyzers.

“Fuzzing is a testing technique in which invalid, unexpected, or random data is fed to the program as input.” © Habrahabr



Recently, American Fuzzy Lop (AFL), a fuzzer by Michal Zalewski, has become widely known for its effectiveness.
Its main distinguishing features are compile-time instrumentation of the code, performance, and a focus on practical use. AFL does not use SMT solvers, so it should be less demanding of resources and run faster, though it is not always more effective.
Today I will show exactly how this tool can be used, and along the way run a small experiment comparing its results with those of several static analysis tools.
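To make the compile-time instrumentation mentioned above more concrete, here is a minimal sketch of the edge-coverage bookkeeping described in AFL's technical notes. The names coverage_map, reset_coverage and visit_block are hypothetical and chosen for illustration; real AFL injects equivalent code at every basic block and keeps the map in shared memory.

```c
#include <assert.h>
#include <string.h>

#define MAP_SIZE 65536u

/* One counter per (previous block, current block) pair, i.e. per edge. */
static unsigned char coverage_map[MAP_SIZE];
static unsigned int prev_location;

/* Clear the map before each run of the target. */
static void reset_coverage(void) {
    memset(coverage_map, 0, sizeof(coverage_map));
    prev_location = 0;
}

/* Code of this shape runs at the start of every basic block.
   cur_location is a per-block ID picked at compile time. */
static void visit_block(unsigned int cur_location) {
    assert(cur_location < MAP_SIZE);
    coverage_map[cur_location ^ prev_location]++;
    /* The shift makes the edge A->B differ from the edge B->A. */
    prev_location = cur_location >> 1;
}
```

After a run, the fuzzer compares the map with those of earlier runs; an input that touches a new map entry has reached a new edge and is worth keeping for further mutation.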

So, to start using a fuzzer, we need to check that it works in practice and decide what exactly we will be fuzzing and how.
For verification, I took a version of the popular libcurl library that is known to be vulnerable: 7.34.0.
This version contains a vulnerability in the sanitize_cookie_path() function, described in CVE-2015-3145.


The function handles its input incorrectly: when it is passed a path consisting of just a double quote or a null byte, libcurl writes a null byte at a negative index of the new_path array and corrupts memory on the heap.
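The write before the start of new_path comes down to an unchecked "length minus one". The sketch below is a simplified, hypothetical model of that arithmetic, not libcurl's actual code: trailing_quote_index and has_last_char are made-up names, and the function only computes the index that would be touched instead of writing to memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the "last character" that a trailing-quote check
   would inspect after a leading quote has been stripped.
   A result of -1 means one byte before the start of the buffer. */
static ptrdiff_t trailing_quote_index(const char *cookie_path) {
    assert(cookie_path != NULL);
    size_t len = strlen(cookie_path);
    if (len > 0 && cookie_path[0] == '"') {
        len--; /* the leading quote is removed, the string gets shorter */
    }
    /* The unchecked "len - 1" is the problem: with len == 0 it points
       in front of the buffer. */
    return (ptrdiff_t)len - 1;
}

/* A safe caller checks for the empty string before indexing. */
static int has_last_char(const char *cookie_path) {
    return trailing_quote_index(cookie_path) >= 0;
}
```

For an ordinary path such as "/xxx/" the index is valid, while for a lone double quote or an empty string it is -1, which is exactly the case a fuzzer has to stumble upon.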

First, let's check how static analyzers respond to this vulnerability.

Coverity, PVS-Studio, and Clang Static Analyzer were at hand.

With Clang, the analysis can be run like this:

 $ cd curl
 $ mkdir build-clang
 $ cd build-clang
 $ cmake -DCMAKE_C_COMPILER=/path/to/clang/ccc-analyzer -DCMAKE_CXX_COMPILER=/path/to/clang/ccc-analyzer -DCMAKE_BUILD_TYPE=release ../
 $ scan-build -o html make


The results of the analysis then end up in the html directory.



Coverity intercepts the compiler calls during the build; the analyzer was then run with the following parameters:

 $ cov-analyze --dir cov --all --security --enable-constraint-fpp --enable-single-virtual --enable-fnptr --enable-callgraph-metrics -j 2 --inherit-taint-from-unions --override-worker-limit 




PVS-Studio requires Windows, and the trial version does not show the names of the files containing problems. However, we already know the line and the type of the error, so this is enough for a yes/no evaluation. PVS-Studio was run in monitoring mode, and for simplicity the build-libcurl-windows script was used.


(Only part of the PVS-Studio output is shown.)

None of the static analyzers found the problem.

Now let's look at how to start the fuzzing process.

First, download and build AFL:

 $ wget http://lcamtuf.coredump.cx/afl/releases/afl-latest.tgz
 $ tar xvfz afl-latest.tgz
 $ cd afl-1.83b
 $ make
 $ cd llvm_mode
 $ make


Typically, the fuzzer starts the application in a new process and feeds it test data on STDIN or through a temporary file. If the process crashes, AFL notices this and writes the offending input to the crashes directory. An important point for successful fuzzing is to build the application with Address Sanitizer (ASAN), so that the application reliably crashes even when a single byte of dynamic memory is overwritten. I will not go into ASAN here, because it has been described many times and has been used successfully for a long time.
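The "run in a new process and notice the crash" step can be sketched with plain POSIX calls. This is only an illustration of the idea, under the assumption that the target is a C function rather than a separate binary; AFL itself uses a more elaborate fork server, and the names run_in_child, well_behaved_target and crashing_target are hypothetical.

```c
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs target() in a child process.
   Returns the number of the signal that killed the child,
   0 if the child exited on its own, or -1 on a fork/wait error. */
static int run_in_child(void (*target)(void)) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        target();   /* child: execute the code under test */
        _exit(0);   /* reached only if the target did not crash */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}

/* Two stand-in targets: one returns normally, one dies with SIGABRT. */
static void well_behaved_target(void) { }
static void crashing_target(void) { abort(); }
```

A nonzero result is what the fuzzer treats as a crash, at which point the input that was being tested is saved for later analysis.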

Generating tests requires a so-called corpus: a set of test data of the kind the application processes. In the case of curl, these are valid HTTP responses from a web server.

With a known vulnerability, we have two options: fuzz individual functions that look suspicious, or fuzz the entire application.
In the first case, we need to write a minimal wrapper around the function:

 #include <assert.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>

 /* Assumes sanitize_cookie_path() from curl's lib/cookie.c is visible here. */

 int main(int argc, char **argv)
 {
   char buf[2048];
   char *res = NULL;

   assert(argc == 2);
   FILE *f = fopen(argv[1], "rb");
   assert(f);
   /* leave room for the terminating null byte */
   size_t len = fread(buf, 1, sizeof(buf) - 1, f);
   fclose(f);
   buf[len] = 0x00;
   if (len == 0 || strlen(buf) == 0) {
     return 0;
   }
   printf("read = %zu\n", len);
   printf("in = %s\n", buf);
   /* call the suspicious code */
   res = sanitize_cookie_path(buf);
   if (res) {
     printf("res = %s\n", res);
     free(res);
   }
   return 0;
 }


Next, the wrapper must be instrumented; to do that, we build it with AFL's compiler:

 $ afl-clang-fast -g -fsanitize=address path_san.c -o path_san 


In the inputs directory, it is enough to put a single file with a suitable URI, for example "/xxx/".
Then run AFL:

 $ AFL_USE_ASAN=1 /path/to/afl/afl-fuzz -m none -i inputs -o out ./path_san @@ 


The -m none parameter disables the memory limit, and @@ is replaced with the name of the temporary file during fuzzing; if @@ is not specified, the test data is supplied on STDIN. Almost immediately after launch, AFL will detect a crash and save the input that triggered it in the out/crashes directory.
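The @@ substitution is easy to picture as a string replacement performed on each command-line argument before the target is started. The helper below is a hypothetical sketch of that step, not AFL's own code; the name substitute_input_path and the file paths in the usage note are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies arg into out, replacing the first "@@" with input_path.
   Arguments without "@@" are copied unchanged.
   Returns 0 on success, -1 if out is too small for the result. */
static int substitute_input_path(const char *arg, const char *input_path,
                                 char *out, size_t out_size) {
    assert(arg != NULL && input_path != NULL && out != NULL);
    const char *marker = strstr(arg, "@@");
    int written;
    if (marker == NULL) {
        written = snprintf(out, out_size, "%s", arg);
    } else {
        /* text before the marker, then the path, then the rest */
        written = snprintf(out, out_size, "%.*s%s%s",
                           (int)(marker - arg), arg, input_path, marker + 2);
    }
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

For example, with an input file at out/.cur_input the argument "@@" becomes "out/.cur_input", while "./path_san" is passed through untouched.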

Fuzzing individual functions that process user input can be a more effective strategy in a large project than fuzzing the entire application, especially if unit tests have already been written for the code.
However, it is sometimes useful to be able to fuzz the entire application. Let's see how to do this, again using curl as the example.

As we know, curl talks to the server over sockets, but the fuzzer cannot do that, so we need a way to pass data from the fuzzer to curl.
To do this, we replace the connect function so that, instead of opening a new connection, curl ends up working with the stdin descriptor.
This can be done by loading our own dynamic library through LD_PRELOAD. Fortunately, there is no need to write one: a ready-made library can be used (preeny).
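To give a feeling for what such a preloaded library does, here is a deliberately simplified sketch. It is not preeny's implementation, which is considerably more involved; the names fake_socket and fake_connect are hypothetical. In a real preload library the functions would be called socket and connect, so that the dynamic linker resolves the application's calls to them instead of to libc.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Stand-in for socket(): instead of creating a network socket, hand the
   caller a duplicate of stdin, so later reads consume the fuzzer's data. */
static int fake_socket(int domain, int type, int protocol) {
    (void)domain;
    (void)type;
    (void)protocol;
    return dup(STDIN_FILENO);
}

/* Stand-in for connect(): there is nothing to connect to, so simply
   report success and let the caller carry on. */
static int fake_connect(int sockfd, const struct sockaddr *addr,
                        socklen_t addrlen) {
    (void)sockfd;
    (void)addr;
    (void)addrlen;
    return 0;
}
```

This sketch only covers the reading direction; whatever the application writes to the "socket" would still need to be accepted somewhere, which is one of the details a ready-made library takes care of.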

Build preeny and curl:

 $ git clone https://github.com/zardus/preeny
 $ cd preeny && make
 ...
 $ cd curl
 $ mkdir build
 $ cd build
 $ export CMAKE_C_FLAGS="-g -fsanitize=address"
 $ cmake -DCMAKE_C_COMPILER=/path/to/afl-clang-fast -DCMAKE_CXX_COMPILER=/path/to/afl-clang-fast -DCMAKE_BUILD_TYPE=release ../
 $ make


Put the built binaries in one directory, create an inputs directory next to them, and place in it a file with an HTTP response from the server (to increase coverage, it is better to create several).

For example:

 HTTP/1.1 200 OK
 Content-Type: text/html
 Content-Length: 1
 Connection: close
 Set-Cookie: xx=xxx; path=xx; domain=xxx.com; httponly; secure;

 1


After that, return to the directory with the application and run AFL:

 $ LD_PRELOAD="/path/to/preeny/x86_64-linux-gnu/desock.so" /path/to/afl/afl-fuzz -m none -i inputs -o out ./curl http://127.0.0.1/ --max-time 1 --cookie-jar /dev/null 


LD_PRELOAD here sets the path to the shared object that replaces the connect function.

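Curl options:

--max-time 1: limits the whole operation to one second, so that a test case cannot hang.
--cookie-jar /dev/null: turns on cookie handling, which is the code we want to exercise, and discards the saved cookies.
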
After a few minutes, AFL will find the first test data that causes the application to crash.

Now we can make sure that the application really crashes on this input:

 $ LD_PRELOAD="/path/to/preeny/x86_64-linux-gnu/desock.so" ./curl http://127.0.0.1/ --max-time 1 --cookie-jar /dev/null < out/crashes/id:000010,sig:06,src:000000,op:havoc,rep:2 
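The name of the crash file itself carries useful information. Going by the example above, it starts with the crash ID and the number of the signal that killed the process. The following sketch pulls out these two fields; parse_crash_name and struct crash_name are hypothetical names, and the format is assumed from this single example rather than taken from a specification.

```c
#include <assert.h>
#include <stdio.h>

/* The two leading fields of a crash file name. */
struct crash_name {
    int id;   /* sequence number of the crash */
    int sig;  /* signal that terminated the target */
};

/* Parses names that begin with "id:<number>,sig:<number>".
   Returns 0 on success, -1 if the name has a different shape. */
static int parse_crash_name(const char *name, struct crash_name *out) {
    assert(name != NULL && out != NULL);
    return sscanf(name, "id:%d,sig:%d", &out->id, &out->sig) == 2 ? 0 : -1;
}
```

For the file used above, the signal number is 6, which on Linux is SIGABRT; this fits a target that aborts when a memory error is detected.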




So, we have learned how to use AFL to test applications.
When fuzzing is used in the testing process, it is important to understand that even the fastest and most effective fuzzer with good coverage does not replace a code analyzer; it only complements it.

Related links and sources:

Source: https://habr.com/ru/post/259671/

