
os.urandom, CPython, Linux and rake



I want to tell an instructive story about bugs in the implementation of the urandom function from the os module in CPython on UNIX-like operating systems (Linux, Mac OS X, etc.).

A quote from the Python 3 documentation:
Return a string of n random bytes suitable for cryptographic use.
This function returns random bytes from an OS-specific randomness source. The returned data should be unpredictable enough for cryptographic applications, though its exact quality depends on the OS implementation. On a Unix-like system this will query /dev/urandom, and on Windows it will use CryptGenRandom().
The Python 2 documentation adds:
New in version 2.4.
In other words, under Linux, for example, urandom reads and returns bytes from the system device /dev/urandom. Let me remind you that this OS has two typical entropy source devices: /dev/random and /dev/urandom. As is well known, the first device is "slow" and blocking, the second is "fast", and contrary to popular belief both of them are cryptographically strong sources of (pseudo-)random numbers. I'll say right away that the picture at the top of the post has nothing to do with the article, and that the article is not about cryptography, security, or OpenSSL with Heartbleed.

You would think: how can anyone get the implementation of such a simple routine wrong? As often happens, somebody optimized it...

2.4

Let's go back to the end of 2004. Half-Life 2... sorry, CPython 2.4 comes out, adding such familiar features as function decorators, sets (set), reverse iteration (reversed) and generator expressions, the lazy relatives of list comprehensions. How could people develop software in Python without them at all?!

As mentioned above, the same release added os.urandom, implemented in Python itself. Let's imagine how urandom could be written:
def urandom(n):
    with open('/dev/urandom', 'rb') as rnd:
        return rnd.read(n)
So, three lines. Moreover, this is a perfectly correct implementation, apart from exception handling and other details needed to match the function's documented specification. And then some bright mind proposes to speed this code up. How is that possible, you ask. By caching the file object, the bright mind replies.
rnd = None

def urandom(n):
    global rnd  # without this, the assignment below would create a local name
    if rnd is None:
        rnd = open('/dev/urandom', 'rb')  # opened once, never closed
    return rnd.read(n)
What problems does this implementation bring? Scripts that turn into daemons crash on the first call to urandom after the parent process dies.

fork()

Many people know that the fork() system call, which is part of the POSIX.1-2001 standard and appeared in the very first version of Unix, spawns new processes by "forking": a twin appears in the system with an identical environment but a separate address space, and it starts running from the exact place in the code where fork() was called. As a rule, fork uses the copy-on-write mechanism, so memory is not physically copied when the twin process (the "child") is created. Instead, only the pages the twin writes to are copied from the parent's memory, as it goes. That is all background; what interests us is the following quote from man fork:
The child inherits copies of the parent's set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. This means that the two descriptors share open file status flags, current file offset, and signal-driven I/O attributes.
In other words, after the fork the file descriptors behind a Python file object are linked in both processes and refer to the same open file. However, if one process closes the file, it is not automatically closed in the other.
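The behaviour described in the man page quote can be observed directly. The following is a minimal sketch for a POSIX system, not code from CPython; the function name fork_shares_offset and the sample data are made up for illustration. The child reads from an inherited descriptor and then closes it, and the parent continues reading from the same offset through its own, still open, copy.

```python
import os
import tempfile


def fork_shares_offset():
    """Return (bytes read by the child, bytes read by the parent afterwards)."""
    with tempfile.TemporaryFile() as tmp:
        tmp.write(b'abcdefgh')
        tmp.flush()
        fd = tmp.fileno()
        os.lseek(fd, 0, os.SEEK_SET)  # rewind the shared file offset

        r, w = os.pipe()  # lets the child report what it read
        pid = os.fork()
        if pid == 0:
            # Child: read through the inherited descriptor, then close it.
            os.close(r)
            os.write(w, os.read(fd, 4))
            os.close(fd)  # closes only the child's copy
            os._exit(0)   # skip the parent's cleanup handlers

        # Parent: wait for the child, then keep using the same descriptor.
        os.close(w)
        os.waitpid(pid, 0)
        child_data = os.read(r, 4)
        os.close(r)
        parent_data = os.read(fd, 4)  # still open; offset was moved by the child
        return child_data, parent_data
```

Called once, the function returns (b'abcd', b'efgh'): the offset is shared, while closing the descriptor in the child leaves the parent's copy usable.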

Fine, fork is fork, you say. What does Python have to do with it? Quite a lot:
  1. multiprocessing* works on top of it
  2. daemonization goes through it
* since the fix for issue #8713, not always

Thanks to fork, the children in multiprocessing start out in the state the main process was in just before spawning them. As for daemonization (turning into a service, in Windows terms), see PEP 3143. Somewhere in the thick of it there is a fork() call. And if, following the best traditions, the freshly made daemon closes all file descriptors directly rather than through the file objects' close() (for example, like this: os.closerange(3, 256)), then os.urandom() breaks, because its cached file object now refers to a closed descriptor.

This is roughly how CPython users explained the bug to its developers in early 2005. However, Guido at first tried to play dumb and brush it off:
I recommend to close this as invalid. The daemonization code is clearly broken.
Fortunately, people managed to convince the king otherwise, and in July the caching of /dev/urandom was finally removed, more than six months later. Note how it was done: the code contains no reference to the bug number, no indication of the reason for the patch, not even a plain explanatory comment. It works, and that will do.

3.4

Nine years pass. In March 2014, CPython 3.4 is released. It adds such essential features as... wait, oh shi
No new syntax features were added in Python 3.4.
Okay, seriously now, there was great progress: a bunch of libraries were added, for example asyncio, which has already been written about a lot on Habr; security was improved; object finalization was cleaned up. It is not for me to tell you about all that. The main thing is that before the release some people decided that the Python implementation of os.urandom was hellishly slow and that true performance could only be delivered by good old C. In short, the function was rewritten... and they stepped on the same rake again. Even PEP 446 did not save them. The patch was released on April 24, and this time it came with abundant comments, a link to the bug and even regression tests.

Why do I care?

As a bonus, I will tell you how I stumbled over this bug. My work machine runs Ubuntu 14.04 LTS, and unfortunately on it
import platform
platform.python_build()
('default', 'Apr 11 2014 13:05:11')
I had daemon code that closed all file descriptors. And here is where the trouble comes from:
import os
print(os.listdir('/proc/self/fd'))
import random
print(os.listdir('/proc/self/fd'))
prints
['0', '1', '2', '3']
['0', '1', '2', '3', '4']
The experiment is not entirely clean, because in both cases os.listdir opens a descriptor of its own, which shows up under the last number. After importing random, descriptor number 3 is open. Which file does it correspond to?
print(os.readlink('/proc/self/fd/3'))
/dev/urandom
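A slightly cleaner version of this experiment can be sketched as a helper that resolves every descriptor to its target in one go. This is a Linux-only illustration that relies on /proc/self/fd; the function name open_fds is hypothetical. The descriptor that os.listdir opens for itself is already closed by the time it is resolved, so os.readlink fails for it and it is skipped.

```python
import os


def open_fds():
    """Map each open descriptor number of this process to the file it points at."""
    targets = {}
    for name in os.listdir('/proc/self/fd'):
        try:
            targets[int(name)] = os.readlink('/proc/self/fd/' + name)
        except OSError:
            # The descriptor used by listdir itself no longer exists.
            continue
    return targets
```

Comparing the result of open_fds() before and after an import shows which descriptors the import left open and where they point.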
Ta-da! I have always disliked modules doing work at import time... In this case, here is the relevant excerpt from random.py:
from os import urandom as _urandom

class Random(_random.Random):
    # ...
    def __init__(self, x=None):
        # ...
        self.seed(x)
        self.gauss_next = None

    def seed(self, a=None, version=2):
        # ...
        if a is None:
            try:
                a = int.from_bytes(_urandom(32), 'big')
            except NotImplementedError:
                # ...
It remains to note that Tornado, Twisted, uuid and a whole bunch of other libraries, standard and otherwise, all do import random.

I should add that at first I did not quite understand the essence of the problem and wrongly assumed that the file descriptors of the child and the parent are closed at the same time. Thanks to kekekeks for restoring the full picture of this bug.

Conclusions

When developing libraries, always keep the perennial problems of fork() in mind, always comment bug fixes in the code, and read users' problem reports carefully (at least when the users are programmers).

Source: https://habr.com/ru/post/223981/

