Many developers have heard of static code analysis, and some use it successfully in their development process: it is an effective, relatively fast and often convenient way to control code quality.
If you already use static analysis, it may be worth trying dynamic analysis at the testing stage as well. Plenty has been written about the differences between the two techniques, so just a reminder: static analysis is performed without executing the code (for example, at compile time), while dynamic analysis is performed during execution. When compiled code is analyzed from a security point of view, dynamic analysis often means fuzzing. The advantage of fuzzing is the almost complete absence of false positives, which are quite common with static analyzers.
“Fuzzing is a testing technique in which invalid, unexpected or random data is supplied as input to the program.” © Habrahabr
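
To make "invalid or random data" concrete, here is a minimal sketch of one of the simplest mutations a fuzzer can apply to a valid input: flipping a single bit. The function name flip_bit is hypothetical and chosen for illustration; this is not code taken from any particular fuzzer.

```c
#include <assert.h>
#include <stddef.h>

/* Flip one bit of the input in place.
 * bit_index counts from the most significant bit of buf[0].
 * flip_bit is a hypothetical name used only for this illustration. */
void flip_bit(unsigned char *buf, size_t len, size_t bit_index)
{
    assert(bit_index / 8 < len); /* stay inside the buffer */
    buf[bit_index / 8] ^= (unsigned char)(0x80u >> (bit_index % 8));
}
```

Applying the same flip twice restores the original byte, so a fuzzer can walk over every bit of a seed file, test each variant, and undo the change before moving on.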

Recently, American Fuzzy Lop (AFL), a fuzzer by Michal Zalewski, has become well known for its effectiveness.
Its distinguishing features are compile-time instrumentation of the code, performance, and a focus on practical use. AFL does not use SMT solvers, so it should be less resource-hungry and run faster, though it will not always be more effective.
Today I will show how this tool can be used, and along the way run a small experiment comparing its results with those of several static analysis tools.
Before adopting a fuzzer, you need to understand whether it works in practice, and what exactly we will be fuzzing and how.
For verification, I took a deliberately vulnerable version of the popular libcurl library, 7.34.0.
This version contains a vulnerability in the sanitize_cookie_path() function, described in CVE-2015-3145.

The function processes its input incorrectly: when it is passed a path consisting of a single double quote or a null byte, libcurl writes a null byte at a negative index of the new_path array and corrupts heap memory.
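
To see why such a short input leads to a negative index, here is a simplified model of the arithmetic involved. It is a sketch, not the libcurl source: the function name trailing_index_after_unquote is hypothetical, and the code only computes the index that a "strip the leading quote, then look at the last character" routine would touch, without performing the access.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model (NOT the libcurl code): return the index of the last
 * character that would be inspected after a leading double quote has been
 * stripped from the path. A result of -1 means the access would land one
 * byte before the start of the buffer. */
ptrdiff_t trailing_index_after_unquote(const char *path)
{
    size_t len = strlen(path);

    if (len > 0 && path[0] == '"')
        len--; /* the leading quote is removed, the string gets shorter */

    return (ptrdiff_t)len - 1; /* index of the last remaining character */
}
```

For an ordinary path such as "/xxx/" the result is 4, a valid index. For a lone double quote or an empty string the remaining length is zero and the result is -1, which is the out-of-bounds case described above.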
First, let's check how static analyzers respond to this vulnerability.
Coverity, PVS Studio and Clang Static Analyzer were on hand.
With Clang, it can be done like this:
$ cd curl
$ mkdir build-clang
$ cd build-clang
$ cmake -DCMAKE_C_COMPILER=/path/to/clang/ccc-analyzer -DCMAKE_CXX_COMPILER=/path/to/clang/ccc-analyzer -DCMAKE_BUILD_TYPE=release ../
$ scan-build -o html make
The results of the analysis then appear in the html directory.
Coverity intercepts the compiler calls; the analyzer was launched with the following parameters:
$ cov-analyze --dir cov --all --security --enable-constraint-fpp --enable-single-virtual --enable-fnptr --enable-callgraph-metrics -j 2 --inherit-taint-from-unions --override-worker-limit
PVS Studio requires Windows, and the trial version does not show the names of the problem files; however, we already know the line and the type of error, so this is enough for a yes/no evaluation. PVS Studio was launched in monitor mode, and for simplicity the build-libcurl-windows script was used.

(Only part of the PVS Studio output is shown.)
None of the static analyzers found the problem. Now let's look at how to start the fuzzing process.
First, download and build AFL:
$ wget http://lcamtuf.coredump.cx/afl/releases/afl-latest.tgz
$ tar xvfz afl-latest.tgz
$ cd afl-1.83b
$ make
$ cd llvm_mode
$ make
Typically, the fuzzer starts the application in a new process and feeds it test data through STDIN or through a temporary file. If the process crashes, AFL notices this and writes the offending input to the crashes directory. An important point for successful fuzzing is building the application with Address Sanitizer, so that the application crashes even when a single byte of dynamic memory is overwritten. I will not go into ASAN here, because it has been described many times and has been applied successfully for a long time.
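
The "run in a new process and notice the crash" idea can be sketched in a few lines of POSIX C. This is only an illustration of the principle and not how AFL itself is implemented; the names crashes_on, target_fn and demo_target are hypothetical, and demo_target simply aborts on an input starting with a double quote to stand in for a sanitizer report.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical signature for the code under test. */
typedef void (*target_fn)(const unsigned char *data, size_t len);

/* Run target on the given input in a child process.
 * Returns 1 if the child was killed by a signal (a crash),
 * 0 if it exited normally, and -1 if fork/waitpid failed. */
int crashes_on(target_fn target, const unsigned char *data, size_t len)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {          /* child: run the target, then exit cleanly */
        target(data, len);
        _exit(0);
    }

    int status = 0;          /* parent: wait and inspect how the child ended */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? 1 : 0;
}

/* Hypothetical stand-in for a buggy function: aborts (SIGABRT) when the
 * input starts with a double quote. */
void demo_target(const unsigned char *data, size_t len)
{
    if (len > 0 && data[0] == '"')
        abort();
}
```

Because the target runs in its own process, a crash kills only the child; the parent sees the terminating signal in the wait status and can save the input that caused it.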
Generating tests requires a so-called corpus: a set of test data that the application can process. In the case of curl, these are valid HTTP responses from a web server.
With a known vulnerability we have two options: fuzz individual functions that look suspicious, or fuzz the entire application.
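In the first case, we need to write a minimal wrapper around the function:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* sanitize_cookie_path() is static in libcurl's lib/cookie.c; this wrapper
   assumes its definition has been copied into this file above main(). */

int main(int argc, char **argv)
{
  char buf[2048];
  char *res = NULL;

  assert(argc == 2);
  FILE *f = fopen(argv[1], "rb");
  assert(f);

  /* Leave room for the terminator so that buf[len] stays in bounds. */
  size_t len = fread(buf, 1, sizeof(buf) - 1, f);
  fclose(f);
  buf[len] = 0x00;
  if (len == 0 || strlen(buf) == 0) {
    return 0;
  }

  printf("read = %zu\n", len);
  printf("in = %s\n", buf);
  res = sanitize_cookie_path(buf);
  if (res) {
    printf("res = %s\n", res);
    free(res);
  }
  return 0;
}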
Next, the wrapper must be instrumented, so we build it with the AFL compiler wrapper:
$ afl-clang-fast -g -fsanitize=address path_san.c -o path_san
In the inputs directory it is enough to put a single file containing a suitable path, for example "/xxx/".
And run AFL:
$ AFL_USE_ASAN=1 /path/to/afl/afl-fuzz -m none -i inputs -o out ./path_san @@
The -m none parameter disables the memory limit, and @@ is replaced with the name of the temporary file during fuzzing; if @@ is not specified, the test data is supplied on STDIN. Almost immediately after launch, AFL will detect a crash and save the offending input in the out/crashes directory.
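
A harness can support both delivery modes with a small helper that picks the input source. The sketch below assumes a hypothetical helper named open_input; it is not part of AFL or of the wrapper shown earlier.

```c
#include <stdio.h>

/* Hypothetical helper: choose where the test data comes from.
 * With a file argument (the name AFL substitutes for @@) the file is
 * opened for reading; without one, the data is read from stdin.
 * Returns NULL if the file cannot be opened. */
FILE *open_input(int argc, char **argv)
{
    if (argc >= 2)
        return fopen(argv[1], "rb");
    return stdin;
}
```

With such a helper the same binary can be run as ./path_san @@ or with no argument at all, and the rest of the harness reads from the returned stream in both cases.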
Fuzzing individual functions that process user input can be more effective in a large project than fuzzing the entire application, especially if unit tests already exist for the code.
However, it is sometimes useful to fuzz the entire application; let's see how to do this using the same curl as an example.
As we know, curl talks to the server through sockets, but the fuzzer cannot do this, so we need a way to pass data from the fuzzer to curl.
To do this, we replace the connect function so that, instead of creating a new connection, it hands back the stdin descriptor.
This can be done by loading our own dynamic library through LD_PRELOAD; fortunately, there is no need to write one, because a ready-made one exists (preeny).
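
The substitution idea can be sketched as follows. This is an illustration under stated assumptions and not preeny's implementation: the names redirect_fd and fake_connect are hypothetical, and in a real shim the second function would be named connect and compiled into a shared library (for example with -shared -fPIC) so that LD_PRELOAD places it ahead of the libc version.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Make sockfd refer to the same open file as source_fd, so that later
 * reads on the "socket" return whatever arrives on source_fd.
 * Returns 0 on success, -1 on failure. */
int redirect_fd(int sockfd, int source_fd)
{
    return dup2(source_fd, sockfd) < 0 ? -1 : 0;
}

/* Hypothetical replacement for connect(): ignore the address entirely
 * and wire the socket descriptor to stdin instead of the network. */
int fake_connect(int sockfd, const struct sockaddr *addr, socklen_t addrlen)
{
    (void)addr;
    (void)addrlen;
    return redirect_fd(sockfd, STDIN_FILENO);
}
```

After such a call the application keeps using its socket descriptor as usual, but every read on it consumes the bytes the fuzzer writes to stdin.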
Build preeny and curl:
$ git clone https://github.com/zardus/preeny
$ cd preeny && make
...
$ cd curl
$ mkdir build
$ cd build
$ export CMAKE_C_FLAGS=
$ cmake -DCMAKE_C_COMPILER=/path/to/afl-clang-fast -DCMAKE_CXX_COMPILER=/path/to/afl-clang-fast -DCMAKE_BUILD_TYPE=release ../
$ make
We put the built binaries in one directory, create the inputs directory next to them, and place in it a file with an HTTP response from the server (to increase coverage, it is better to create several).
For example:
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1
Connection: close
Set-Cookie: xx=xxx; path=xx; domain=xxx.com; httponly; secure;

1
After that, return to the directory with the application and run AFL:
$ LD_PRELOAD="/path/to/preeny/x86_64-linux-gnu/desock.so" /path/to/afl/afl-fuzz -m none -i inputs -o out ./curl http://127.0.0.1/ --max-time 1 --cookie-jar /dev/null
LD_PRELOAD here sets the path to the shared object that replaces the connect function.
Curl options:
- http://127.0.0.1/ is the URL whose connection we emulate. It is important to specify an IP address rather than a domain name: the network functions have been replaced, so name resolution will not work, and skipping it is faster anyway.
- --max-time sets the maximum execution time for curl to one second (a smaller value cannot be set). We set it because neither curl nor AFL closes the descriptor.
- --cookie-jar is important: curl will only call the vulnerable function if cookies are used.

After a few minutes, AFL will find the first input that crashes the application.
Now you can verify that the application really crashes on this input:
$ LD_PRELOAD="/path/to/preeny/x86_64-linux-gnu/desock.so" ./curl http://127.0.0.1/ --max-time 1 --cookie-jar /dev/null < out/crashes/id:000010,sig:06,src:000000,op:havoc,rep:2

So we have learned how to use AFL to test applications.
When adding fuzzing to the testing process, it is important to understand that even the fastest and most effective fuzzer with good coverage does not replace a code analyzer, but only complements it.