Once again on the dangers of eval()

It is impossible to count how many arguments have been fought over the question “Is it possible to make eval safe?” There is always someone who claims to have found a way to guard against every possible consequence of calling this function.
When I needed a detailed answer to this question, I came across a post on the subject. I was pleasantly surprised by the depth of its analysis, so I decided it was worth translating.

Briefly about the problem

Python has a built-in eval() function that evaluates a string containing an expression and returns the result:

 assert eval("2 + 3 * len('hello')") == 17

This is a very powerful but also very dangerous instruction, especially if the strings you pass to eval do not come from a trusted source. What happens if the string we feed to eval is os.system('rm -rf /')? The interpreter will dutifully start deleting all the data on the computer, and we are lucky if it is running as the least privileged user. (In the following examples I use clear, or cls on Windows, instead of rm -rf / so that no reader accidentally shoots themselves in the foot.)

What are the solutions?

Some argue that eval can be made safe if it runs without access to the names in globals. As its second (optional) argument, eval() takes a dictionary that the evaluated code uses in place of the global namespace (all the classes, functions, variables, and so on declared at the top level and accessible from anywhere in the code). If eval is called without this argument, it uses the current global namespace, into which the os module may have been imported. If you pass an empty dictionary, the global namespace for eval will be empty. The code below then can no longer run and raises NameError: name 'os' is not defined:

 eval("os.system('clear')", {})
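
A minimal sketch of this behaviour, assuming Python 3 and using the harmless os.getcwd() in place of a shell command. The helper name error_name is made up for this illustration:

```python
import os  # imported here, yet invisible to eval() when we pass our own globals

def error_name(expr, env):
    """Evaluate expr with env as globals; return the exception class name, or None."""
    try:
        eval(expr, env)
    except Exception as exc:
        return type(exc).__name__
    return None

# With an empty dictionary as globals, the name `os` cannot be resolved.
print(error_name("os.getcwd()", {}))  # NameError
```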

However, we can still import modules and access them using the __import__ built-in function. So, the code below will work without errors:

 eval("__import__('os').system('clear')", {})
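
The reason this works can be seen in a small sketch (assuming Python 3, with os.getcwd() standing in for the shell command): when the dictionary we pass has no __builtins__ key, eval() inserts one itself, so the "empty" namespace is not really empty.

```python
import os

env = {}
cwd = eval("__import__('os').getcwd()", env)

# eval() quietly added a reference to the builtins to our "empty" dictionary.
has_builtins = "__builtins__" in env

print(cwd == os.getcwd())  # True
print(has_builtins)        # True
```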

The next attempt is usually to deny access to __builtins__ from inside eval, since names like __import__ are available only because they live in the global variable __builtins__. If we explicitly pass an empty dictionary in its place, the code below can no longer run:

 eval("__import__('os').system('clear')", {'__builtins__':{}}) # NameError: name '__import__' is not defined
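
As a quick check (a sketch assuming Python 3; the helper name blocked is hypothetical), emptying __builtins__ removes every built-in name, not only __import__, while plain arithmetic keeps working:

```python
def blocked(expr):
    """Return the name of the exception raised when expr runs without builtins."""
    try:
        eval(expr, {"__builtins__": {}})
    except Exception as exc:
        return type(exc).__name__
    return None

print(blocked("__import__('os')"))  # NameError
print(blocked("len('hello')"))      # NameError
print(blocked("2 + 3"))             # None
```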

Now are we safe?

Some say yes, and they are wrong. For example, this small piece of code causes a segfault when run in CPython (the code constructor call in it uses the Python 2 signature):

 s = """(lambda fc=(
     lambda n: [
         c for c in ().__class__.__bases__[0].__subclasses__()
         if c.__name__ == n
     ][0]
 ):
     fc("function")(
         fc("code")(0, 0, 0, 0, "KABOOM", (), (), (), "", "", 0, ""),
         {}
     )()
 )()"""
 eval(s, {'__builtins__': {}})

So let's see what is going on here. Let's start with this:

 ().__class__.__bases__[0]

As many will have guessed, this is just one way to refer to object. We cannot simply write object, since __builtins__ is empty, but we can create an empty tuple, whose first base class is object, and reach the object class by walking through its attributes.
Next we get the list of all classes that inherit directly from object, which in practice gives us a starting point for reaching nearly every class defined in the program at that moment:

 ().__class__.__bases__[0].__subclasses__()
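
These two steps are harmless on their own, so they are easy to try out. A small sketch, assuming Python 3 (the variable names are only for illustration):

```python
NO_BUILTINS = {"__builtins__": {}}

# The name `object` is gone, but the class is still reachable from a tuple.
root = eval("().__class__.__bases__[0]", dict(NO_BUILTINS))
classes = eval("().__class__.__bases__[0].__subclasses__()", dict(NO_BUILTINS))
names = [c.__name__ for c in classes]

print(root is object)       # True
print("function" in names)  # True
print("code" in names)      # True
```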

If we abbreviate this expression as ALL_CLASSES for readability, it is easy to see that the expression below finds a class by its name:

 [c for c in ALL_CLASSES if c.__name__ == n][0]

Later in the code we need to look up a class twice, so we create a function:

 lambda n: [c for c in ALL_CLASSES if c.__name__ == n][0]

To call a function, we need to give it a name somehow, but since this code runs inside eval, we can neither declare a function (using def) nor use an assignment statement to bind our lambda to a variable.
However, there is a third option: default argument values. When declaring a lambda, just as with any ordinary function, we can give parameters default values, so if we put all our code inside another lambda and pass the finder as that lambda's default argument, we get what we want:

 (lambda fc=(
     lambda n: [
         c for c in ALL_CLASSES if c.__name__ == n
     ][0]
 ):
     # the code that uses fc goes here
 )()
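
The default-argument trick is easy to verify in isolation. Below is a sketch assuming Python 3; the names double and function_class are invented for the example. The second expression uses the same trick to look up the class of ordinary functions:

```python
NO_BUILTINS = {"__builtins__": {}}

# An expression cannot contain `def` or an assignment, but a default
# argument binds the inner lambda to the name `double` all the same.
expr = "(lambda double=(lambda n: n * 2): double(double(10)) + 2)()"
result = eval(expr, NO_BUILTINS)
print(result)  # 42

# The same trick with the class finder from the article.
finder = """(lambda fc=(
    lambda n: [c for c in ().__class__.__bases__[0].__subclasses__()
               if c.__name__ == n][0]
): fc("function"))()"""
function_class = eval(finder, NO_BUILTINS)
print(function_class is type(lambda: 0))  # True
```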

So, we have a function that can look up classes, and we can refer to it by name. What next? We create an object of class code (an internal class; for example, the func_code attribute of a function object is an instance of it):

 fc("code")(0,0,0,0,"KABOOM",(),(),(),"","",0,"")

Of all the constructor arguments, we only care about "KABOOM". This is the bytecode sequence our object will use and, as you might guess, it is not a “good” one. In fact, any single byte from it would be enough, since each of them is a binary operator that gets executed while the stack is empty, which leads to a segfault in CPython. "KABOOM" just looks funnier; thanks to lvh for this example.
So, we have an object of class code, but we cannot execute it directly. We therefore create a function whose code is our object:

 fc("function")(CODE_OBJECT, {})
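
The same construction can be tried without crashing anything. The sketch below assumes Python 3, where the attribute is called __code__ rather than func_code, and it borrows a valid code object from an ordinary lambda instead of building one from bad bytecode. That substitution is mine, not part of the original exploit:

```python
expr = """(lambda fc=(
    lambda n: [c for c in ().__class__.__bases__[0].__subclasses__()
               if c.__name__ == n][0]
): fc("function")((lambda: 6 * 7).__code__, {})())()"""

# A brand-new function is built from a code object and called,
# all from inside eval() with empty builtins.
result = eval(expr, {"__builtins__": {}})
print(result)  # 42
```

The point is that nothing in the empty-builtins setup prevents the evaluated string from manufacturing and calling new functions.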

Now that we have a function, we can call it. Specifically, this function will try to execute our invalid bytecode and crash the interpreter.
Here is the whole code again:

 (lambda fc=(lambda n: [c for c in ().__class__.__bases__[0].__subclasses__() if c.__name__ == n][0]): fc("function")(fc("code")(0,0,0,0,"KABOOM",(),(),(),"","",0,""),{})() )()

Conclusion

So, I hope no one now doubts that eval is NOT SAFE, even if you remove access to globals and builtins.

In the example above, we used a list of all subclasses of the object class to create code and function objects. In exactly the same way, you can get (and instantiate) any class that exists in the program at the time of the eval() call.
Here is another example of what can be done:

 s = """[
     c for c in ().__class__.__bases__[0].__subclasses__()
     if c.__name__ == "Quitter"
 ][0](0)()"""
 eval(s, {'__builtins__': {}})

The Lib/site.py module contains the Quitter class, which the interpreter calls when you type quit().
The code above finds this class, instantiates it, and calls the instance, which terminates the interpreter.

So far we have run eval in an empty environment, assuming that the code shown in the article is the entire program.
When eval is used in a real application, an attacker can reach every class you use, so their options are practically unlimited.

The problem with all such attempts to make eval safe is that they rely on “blacklists”: the idea that we must remove access to everything we think could be dangerous inside eval. With such a strategy there is virtually no chance of winning, because if even one dangerous thing is left off the list, the system is vulnerable.
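
For contrast, here is what the opposite, allowlist-style approach can look like. This is not from the original post: it is a sketch using the standard library's ast.literal_eval, which accepts only Python literals (numbers, strings, tuples, lists, dicts, sets, booleans, None) and rejects everything else. The wrapper name parse_literal is hypothetical.

```python
import ast

def parse_literal(text):
    """Return (True, value) if text is a plain Python literal, else (False, None)."""
    try:
        return True, ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return False, None

print(parse_literal("[1, 2, {'a': 3}]"))  # (True, [1, 2, {'a': 3}])
print(parse_literal("__import__('os')"))  # (False, None)
print(parse_literal("().__class__"))      # (False, None)
```

It is far less powerful than eval, since it cannot even call len(), but that restriction is exactly what makes it predictable.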

While researching this topic, I came across Python's restricted execution mode for eval, which is yet another attempt to overcome this problem:

 >>> eval("(lambda:0).func_code", {'__builtins__':{}})
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<string>", line 1, in <module>
 RuntimeError: function attributes not accessible in restricted mode

In short, it works like this: if __builtins__ inside eval differs from the “official” one, eval switches to restricted mode, in which access to some dangerous attributes, such as func_code on functions, is forbidden. This mode is described in more detail elsewhere, but, as we have already seen above, it is no “silver bullet” either.

Still, can eval be made safe? It is hard to say. It seems to me that an attacker cannot do harm without access to names framed by double underscores, so perhaps, if we refuse to process any string containing a double underscore, we will be safe. Maybe...
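
The idea in the previous paragraph could be sketched as follows (Python 3; the function name eval_without_dunders is invented here). It is only an illustration of the speculation above, not a sandbox that has been shown to be safe:

```python
def eval_without_dunders(expr):
    """Refuse any expression containing a double underscore, then evaluate it."""
    if "__" in expr:
        raise ValueError("double underscores are not allowed")
    return eval(expr, {"__builtins__": {}})

print(eval_without_dunders("2 + 3 * 4"))  # 14

try:
    eval_without_dunders("().__class__.__bases__[0]")
except ValueError as exc:
    print("rejected:", exc)
```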

PS

In the Reddit thread, I found a short snippet that lets us get the "original" __builtins__ back from inside eval:

 [ c for c in ().__class__.__base__.__subclasses__() if c.__name__ == 'catch_warnings' ][0]()._module.__builtins__

PPS, as is traditional on Habr: please report any errors, inaccuracies, and typos by private message :)

Source: https://habr.com/ru/post/221937/
