📜 ⬆️ ⬇️

How many objects does Python allocate when executing scripts?

Some Python programmers are greatly surprised when they know how many temporary objects the python interpreter allocates while running a simple script.

CPython allows you to get statistics on selected objects, for this you need to compile it with additional flags.

./configure CFLAGS='-DCOUNT_ALLOCS' --with-pydebug make -s -j2 

After compiling, we can open an interactive REPL and check the statistics:
')
 >>> import sys >>> sys.getcounts() [('iterator', 7, 7, 4), ('functools._lru_cache_wrapper', 1, 0, 1), ('re.Match', 2, 2, 1), ('re.Pattern', 3, 2, 1), ('SubPattern', 10, 10, 8), ('Pattern', 3, 3, 1), ('IndexError', 4, 4, 1), ('Tokenizer', 3, 3, 1), ('odict_keys', 1, 1, 1), ('odict_iterator', 18, 18, 1), ('odict_items', 17, 17, 1), ('RegexFlag', 18, 8, 10), ('operator.itemgetter', 4, 0, 4), ('PyCapsule', 1, 1, 1), ('Repr', 1, 0, 1), ('_NamedIntConstant', 74, 0, 74), ('collections.OrderedDict', 5, 0, 5), ('EnumMeta', 5, 0, 5), ('DynamicClassAttribute', 2, 0, 2), ('_EnumDict', 5, 5, 1), ('TypeError', 1, 1, 1), ('method-wrapper', 365, 365, 2), ('_C', 1, 1, 1), ('symtable entry', 5, 5, 2), ('OSError', 1, 1, 1), ('Completer', 1, 0, 1), ('ExtensionFileLoader', 2, 0, 2), ('ModuleNotFoundError', 2, 2, 1), ('_Helper', 1, 0, 1), ('_Printer', 3, 0, 3), ('Quitter', 2, 0, 2), ('enumerate', 5, 5, 1), ('_io.IncrementalNewlineDecoder', 1, 1, 1), ('map', 25, 25, 1), ('_Environ', 2, 0, 2), ('async_generator', 2, 1, 1), ('coroutine', 2, 2, 1), ('zip', 1, 1, 1), ('longrange_iterator', 1, 1, 1), ('range_iterator', 7, 7, 1), ('range', 14, 14, 2), ('list_reverseiterator', 2, 2, 1), ('dict_valueiterator', 1, 1, 1), ('dict_values', 2, 2, 1), ('dict_keyiterator', 25, 25, 1), ('dict_keys', 5, 5, 1), ('bytearray_iterator', 1, 1, 1), ('bytearray', 4, 4, 1), ('bytes_iterator', 2, 2, 1), ('IncrementalEncoder', 2, 0, 2), ('_io.BufferedWriter', 2, 0, 2), ('IncrementalDecoder', 2, 1, 2), ('_io.TextIOWrapper', 4, 1, 4), ('_io.BufferedReader', 2, 1, 2), ('_abc_data', 39, 0, 39), ('mappingproxy', 199, 199, 1), ('ABCMeta', 39, 0, 39), ('CodecInfo', 1, 0, 1), ('str_iterator', 7, 7, 1), ('memoryview', 60, 60, 2), ('managedbuffer', 31, 31, 1), ('slice', 589, 589, 1), ('_io.FileIO', 33, 30, 5), ('SourceFileLoader', 29, 0, 29), ('set', 166, 101, 80), ('StopIteration', 33, 33, 1), ('FileFinder', 11, 0, 11), ('os.stat_result', 145, 145, 1), ('ImportError', 2, 2, 1), ('FileNotFoundError', 10, 10, 1), ('ZipImportError', 12, 12, 1), ('zipimport.zipimporter', 12, 12, 1), ('NameError', 4, 4, 1), ('set_iterator', 46, 46, 1), ('frozenset', 50, 0, 50), ('_ImportLockContext', 113, 113, 1), ('list_iterator', 305, 305, 5), ('_thread.lock', 92, 92, 10), ('_ModuleLock', 46, 46, 5), ('KeyError', 67, 67, 2), ('_ModuleLockManager', 46, 46, 5), ('generator', 125, 125, 1), ('_installed_safely', 52, 52, 5), ('method', 1095, 1093, 14), ('ModuleSpec', 58, 4, 54), ('AttributeError', 22, 22, 1), ('traceback', 154, 154, 3), ('dict_itemiterator', 45, 45, 1), ('dict_items', 46, 46, 1), ('object', 8, 1, 7), ('tuple_iterator', 631, 631, 3), ('cell', 71, 31, 42), ('classmethod', 58, 0, 58), ('property', 18, 2, 16), ('super', 360, 360, 1), ('type', 78, 3, 75), ('function', 1705, 785, 922), ('frame', 5442, 5440, 36), ('code', 1280, 276, 1063), ('bytes', 2999, 965, 2154), ('Token.MISSING', 1, 0, 1), ('stderrprinter', 1, 1, 1), ('MemoryError', 16, 16, 16), ('sys.thread_info', 1, 0, 1), ('sys.flags', 2, 0, 2), ('types.SimpleNamespace', 1, 0, 1), ('sys.version_info', 1, 0, 1), ('sys.hash_info', 1, 0, 1), ('sys.int_info', 1, 0, 1), ('float', 584, 569, 20), ('sys.float_info', 1, 0, 1), ('module', 56, 0, 56), ('staticmethod', 16, 0, 16), ('weakref', 505, 82, 426), ('int', 3540, 2775, 766), ('member_descriptor', 246, 10, 239), ('list', 992, 919, 85), ('getset_descriptor', 240, 4, 240), ('classmethod_descriptor', 12, 0, 12), ('method_descriptor', 678, 0, 678), ('builtin_function_or_method', 1796, 1151, 651), ('wrapper_descriptor', 1031, 5, 1026), ('str', 16156, 9272, 6950), ('dict', 1696, 900, 810), ('tuple', 10367, 6110, 4337)] 

Make the conclusion more readable:

 def print_allocations(top_k=None): allocs = sys.getcounts() if top_k: allocs = sorted(allocs, key=lambda tup: tup[1], reverse=True)[0:top_k] for obj in allocs: alive = obj[1]-obj[2] print("Type {}, allocs: {}, deallocs: {}, max: {}, alive: {}".format(*obj,alive)) 

 >>> print_allocations(10) Type str, allocs: 17328, deallocs: 10312, max: 7016, alive: 7016 Type tuple, allocs: 10550, deallocs: 6161, max: 4389, alive: 4389 Type frame, allocs: 5445, deallocs: 5442, max: 36, alive: 3 Type int, allocs: 3988, deallocs: 3175, max: 813, alive: 813 Type bytes, allocs: 3031, deallocs: 1044, max: 2154, alive: 1987 Type builtin_function_or_method, allocs: 1809, deallocs: 1164, max: 651, alive: 645 Type dict, allocs: 1726, deallocs: 930, max: 815, alive: 796 Type function, allocs: 1706, deallocs: 811, max: 922, alive: 895 Type code, allocs: 1284, deallocs: 304, max: 1063, alive: 980 Type method, allocs: 1095, deallocs: 1093, max: 14, alive: 2 

Where:


As you can see, the empty Python REPL managed to allocate 17,328 lines and 10,550 tuples. This is some crazy amount of objects! Here you need to keep in mind that for the REPL, Python automatically imports additional modules that are not imported in the case of empty scripts.

Now let's test “Hello, World” in flask:

 import sys from flask import Flask app = Flask(__name__) @app.route('/') def hello_world(): print_allocations(15) return 'Hello, World!' 

 ./python -m flask run ab -n 100 http://127.0.0.1:5000/ 

After sending 100 HTTP requests to our server, the statistics look like this:

 Type str, allocs: 192649, deallocs: 138892, max: 54320, alive: 53757 Type frame, allocs: 191752, deallocs: 191714, max: 158, alive: 38 Type tuple, allocs: 183474, deallocs: 150069, max: 33581, alive: 33405 Type int, allocs: 85154, deallocs: 81100, max: 4115, alive: 4054 Type bytes, allocs: 31671, deallocs: 14331, max: 17381, alive: 17340 Type list, allocs: 29846, deallocs: 27541, max: 2415, alive: 2305 Type builtin_function_or_method, allocs: 28525, deallocs: 27572, max: 957, alive: 953 Type dict, allocs: 19900, deallocs: 14800, max: 5280, alive: 5100 Type method, allocs: 15170, deallocs: 15105, max: 74, alive: 65 Type function, allocs: 14761, deallocs: 7086, max: 7711, alive: 7675 Type slice, allocs: 12521, deallocs: 12521, max: 1, alive: 0 Type list_iterator, allocs: 10795, deallocs: 10795, max: 35, alive: 0 Type code, allocs: 9849, deallocs: 1749, max: 8107, alive: 8100 Type tuple_iterator, allocs: 8938, deallocs: 8938, max: 4, alive: 0 Type float, allocs: 6033, deallocs: 5889, max: 152, alive: 144 

As you can see, flask allocated 847 261 objects since the start of the interpreter. Most of them were temporary ( 714,336 ) and removed as soon as they were no longer needed. The remaining objects ( 132 925 ) are still in memory.

Frames and code objects


In the example above, you can find a lot of frame and code objects. What are they needed for?

In short, each code object stores a block of compiled code; in turn, frame objects are used to execute them, working on the principle of a stack of calls . In Python, the most popular block is the function. For each new function, you need your own code object, and for each call to this function, you need a separate frame object, where Python will store local variables. In addition to local variables, each frame object stores a lot of auxiliary data that is needed to perform the function.

Where do all these objects come from?


Python is a very dynamic language and you have to pay for it. In order to maintain dynamic capabilities, it creates a large number of temporary objects that perform a supporting role.

For example, a simple function declaration creates at least 5 dictionaries, 5 tuples, and 4 lists. These objects will live until the end of the script. In turn, all these objects store other objects (their elements) in them, these are dozens, sometimes hundreds of additional objects used for the internal description of the compiled function. Description of the average class can distinguish hundreds of container (dictionaries, tuples, lists) objects. Unfortunately, here it will not be possible to automatically calculate the exact number of objects to be selected, and these figures are approximate.

In order for Python to quickly allocate a large number of objects, it uses a large and multi-layered system that optimizes the allocation of objects in memory.

Sometimes you wonder how many details hide interpreted languages ​​from us. Python allows you to write good code without thinking about a lot of problems and details.

PS: I am the author of this article, you can ask any questions.

Source: https://habr.com/ru/post/418305/


All Articles