
How much memory do 1 million integers take?
I have often wondered how efficiently Python uses memory compared to other programming languages. For example, how much memory does it take to work with 1 million integers? Or with the same number of strings of arbitrary length?
As it turns out, Python lets us get the necessary information right from the interactive console, without digging through the C source code (although, to be sure, we will still peek there).
Once our curiosity is satisfied, we will crawl inside the data types and find out what exactly the memory is spent on.
All examples were run under CPython 2.7.4 on a 32-bit machine. A table with the corresponding sizes on a 64-bit machine is given at the end.
Required Tools
sys.getsizeof() and the __sizeof__() method
The first tool we need lives in the standard sys module. Quoting the official documentation:
sys.getsizeof(object[, default_value])
Returns the size of the object in bytes.
If a default value is specified, it is returned when the object does not provide a way to get its size; otherwise a TypeError exception is raised.
getsizeof() calls the object's __sizeof__() method and adds the size of the extra information stored for the garbage collector, if the object is tracked by it.
The algorithm of getsizeof(), rewritten in Python, might look like this:
Py_TPFLAGS_HAVE_GC = 1 << 14  # flag: instances of this type are tracked by the garbage collector
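Continuing that sketch in Python (the error handling and the hard-coded PyGC_Head sizes below are my approximation of what the real C code does, not a literal transcription):

import ctypes

# Size of the PyGC_Head header that precedes GC-tracked objects:
# 12 bytes on the 32-bit build used here, 32 bytes on a typical 64-bit build
PYGC_HEAD_SIZE = 12 if ctypes.sizeof(ctypes.c_void_p) == 4 else 32

def getsizeof(obj, default_value=None):
    try:
        result = obj.__sizeof__()
    except (AttributeError, TypeError):
        if default_value is not None:
            return default_value
        raise TypeError('cannot determine the size of %r' % type(obj))
    # GC-tracked objects carry an extra PyGC_Head in front of them
    if type(obj).__flags__ & Py_TPFLAGS_HAVE_GC:
        result += PYGC_HEAD_SIZE
    return result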
Here PyGC_Head is a node of the doubly linked list that the garbage collector uses to detect reference cycles. In the source code it is represented by the following structure:
typedef union _gc_head {
    struct {
        union _gc_head *gc_next;
        union _gc_head *gc_prev;
        Py_ssize_t gc_refs;
    } gc;
    long double dummy;  /* force worst-case alignment */
} PyGC_Head;
The size of PyGC_Head is 12 bytes on a 32-bit machine and 32 bytes on a 64-bit machine.
Let's try to call getsizeof() in the console and see what happens:
>>> import sys
>>> GC_FLAG = 1 << 14
>>> sys.getsizeof(1)
12
>>> (1).__sizeof__()
12
>>> bool(type(1).__flags__ & GC_FLAG)
False
>>> sys.getsizeof(1.1)
16
>>> (1.1).__sizeof__()
16
>>> bool(type(1.1).__flags__ & GC_FLAG)
False
>>> sys.getsizeof('')
21
>>> ''.__sizeof__()
21
>>> bool(type('').__flags__ & GC_FLAG)
False
>>> sys.getsizeof('hello')
26
>>> sys.getsizeof(tuple())
24
>>> tuple().__sizeof__()
12
>>> bool(type(tuple()).__flags__ & GC_FLAG)
True
>>> sys.getsizeof(tuple((1, 2, 3)))
36
With the exception of the flag-checking magic, everything is very simple.
As you can see from the example, an int and a float occupy 12 and 16 bytes respectively. A str takes 21 bytes plus one more byte for each character of content. An empty tuple takes 12 bytes, plus 4 additional bytes per element. For simple data types (which hold no references to other objects and are therefore not tracked by the garbage collector), the value reported by sys.getsizeof() equals the value returned by the __sizeof__() method.
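These per-character and per-element costs are easy to check directly. A short sketch (the numbers in the comments are the 32-bit CPython 2.7 values quoted above):

import sys

# str grows by 1 byte per character (21 + n); a tuple grows by 4 bytes per
# element (24 + 4 * n once the GC header is included) on this build
for n in range(4):
    print n, sys.getsizeof('x' * n), sys.getsizeof((None,) * n)

# For GC-tracked objects, getsizeof() exceeds __sizeof__() by exactly
# the size of the PyGC_Head header (12 bytes here)
print sys.getsizeof(tuple()) - tuple().__sizeof__()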
id() and ctypes.string_at
Now we’ll find out what exactly the memory is spent on.
To do this, we need two things: first, to find out exactly where the object is stored, and second, to get direct read access to that memory. Even though Python carefully protects us from direct memory access, it is still possible. Be careful, though: mistakes here can easily lead to a segmentation fault.
The built-in function id() returns the address in memory at which the object (itself a C structure) begins:
>>> obj = 1
>>> id(obj)
158020320
To read data from a memory address, you need to use the string_at function from the ctypes module. Its official description is not very detailed:
ctypes.string_at(address[, length])
Returns the string starting at the memory location "address". If "length" is not specified, the string is assumed to be zero-terminated.
Now let's try to read the data at the address that id() returned:
>>> import ctypes
>>> obj = 1
>>> sys.getsizeof(obj)
12
>>> ctypes.string_at(id(obj), 12)
'u\x01\x00\x00 \xf2&\x08\x01\x00\x00\x003\x01\x00\x00 \xf2&\x08\x00\x00\x00\x001\x00\x00\x00'
This hexadecimal dump is not very illuminating, but we are getting closer to the truth.
The struct module
To turn this output into values that are convenient to read, we will use one more module: we need the unpack() function from the struct module.
struct
This module converts between Python values and C structures, represented as strings.
struct.unpack(format, string)
Parses the string according to the given format. It always returns a tuple, even if the string contains only one element. The string must contain exactly the amount of data described by the format.
The format characters we will need:
| Symbol | C type | Python type | Size on a 32-bit machine |
|--------|--------|-------------|--------------------------|
| c | char | single-character string | 1 |
| i | int | int | 4 |
| l | long | int | 4 |
| L | unsigned long | int | 4 |
| d | double | float | 8 |
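To save some typing in the examples that follow, the whole id() / ctypes.string_at() / struct.unpack() chain can be wrapped in a small helper. The function below is purely illustrative (the name dump_struct is my own, not part of any library):

import ctypes
import struct

def dump_struct(obj, fmt):
    # Read struct.calcsize(fmt) bytes starting at the object's address
    # and unpack them according to the given format string
    raw = ctypes.string_at(id(obj), struct.calcsize(fmt))
    return struct.unpack(fmt, raw)

For example, dump_struct(1, 'LLl') returns the reference count, the pointer to the int type and the value itself, exactly like the manual calls below.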
Now let's put everything together and look at the internal structure of some data types.
Int
>>> import struct
>>> obj = 1
>>> sys.getsizeof(obj), obj.__sizeof__()
(12, 12)
>>> struct.unpack('LLl', ctypes.string_at(id(obj), 12))
(373, 136770080, 1)
The format of the values is easy to guess.
The first number (373) is the object's reference count, i.e. the number of references pointing to it.
>>> obj2 = obj
>>> struct.unpack('LLl', ctypes.string_at(id(obj), 12))
(374, 136770080, 1)
As you can see, the count increased by one after we created another reference to the object.
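The same counter can also be read without touching raw memory, through sys.getrefcount(); a small illustrative note:

import sys

# sys.getrefcount() reads the same ob_refcnt field, but reports one extra
# reference: the temporary one created by passing obj as its argument.
# For a heavily shared object such as the cached int 1, the absolute value
# is large and changes as the interpreter does its own work.
obj = 1
print sys.getrefcount(obj)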
The second number (136770080) is a pointer to (the id of) the object's type:
>>> type(obj)
<type 'int'>
>>> id(type(obj))
136770080
The third number (1) is the contents of the object itself.
>>> obj = 1234567
>>> struct.unpack('LLl', ctypes.string_at(id(obj), 12))
(1, 136770080, 1234567)
Our guesses can be confirmed by looking into the CPython source code.
typedef struct {
    PyObject_HEAD
    long ob_ival;
} PyIntObject;
Here PyObject_HEAD is a macro common to all built-in objects, and ob_ival holds the value as a C long. The PyObject_HEAD macro adds the object's reference count and a pointer to its type, which is exactly what we saw.
Float
A floating-point number is very similar to an int, but it is represented in memory by a C double.
typedef struct {
    PyObject_HEAD
    double ob_fval;
} PyFloatObject;
This is easily seen:
>>> obj = 1.1
>>> sys.getsizeof(obj), obj.__sizeof__()
(16, 16)
>>> struct.unpack('LLd', ctypes.string_at(id(obj), 16))
(1, 136763968, 1.1)
String (Str)
A string is represented as an array of characters terminated by a null byte. In addition, the string structure stores its length, a cached hash of its contents, and a flag indicating whether the string is interned.
typedef struct {
    PyObject_VAR_HEAD
    long ob_shash;      /* cached hash of the string, or -1 if not yet computed */
    int ob_sstate;      /* interning state */
    char ob_sval[1];    /* the characters, null-terminated */
} PyStringObject;
The PyObject_VAR_HEAD macro includes PyObject_HEAD and adds an ob_size field that stores the length, in this case the number of characters in the string.
>>> obj = 'hello world'
>>> sys.getsizeof(obj), obj.__sizeof__()
(32, 32)
>>> struct.unpack('LLLli' + 'c' * (len(obj) + 1), ctypes.string_at(id(obj), 4*5 + len(obj) + 1))
(1, 136790112, 11, -1500746465, 0, 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\x00')
The fourth value corresponds to the hash of the string, which is not difficult to verify.
>>> hash(obj)
-1500746465
As you can see, the ob_sstate value is 0, so the string is not interned at the moment. Let's try interning it:
>>> intern(obj)
'hello world'
>>> struct.unpack('LLLli' + 'c' * (len(obj) + 1), ctypes.string_at(id(obj), 4*5 + len(obj) + 1))
(2, 136790112, 11, -1500746465, 1, 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\x00')
Tuple
A tuple is represented as an array of pointers. Since it can participate in reference cycles, it is tracked by the garbage collector, which costs additional memory (the sys.getsizeof() call reminds us of this).
The tuple structure is similar to the string one, except that it has no special fields other than the length.
typedef struct {
    PyObject_VAR_HEAD
    PyObject *ob_item[1];
} PyTupleObject;
>>> obj = (1, 2, 3)
>>> sys.getsizeof(obj), obj.__sizeof__()
(36, 24)
>>> struct.unpack('LLL' + 'L' * len(obj), ctypes.string_at(id(obj), 12 + 4 * len(obj)))
(1, 136532800, 3, 146763112, 146763100, 146763088)
>>> for i in obj: print i, id(i)
1 146763112
2 146763100
3 146763088
As the example shows, the last three unpacked values are pointers to the tuple's contents.
The remaining basic data types (unicode, list, dict, set, frozenset) can be explored in the same way.
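For instance, a list can be examined with the dump_struct helper sketched earlier (an illustration only; the addresses will of course differ from run to run):

import ctypes
import struct

obj = [1, 2, 3]

# PyListObject on a 32-bit build: reference count, type pointer, length,
# pointer to the separately allocated item array, number of allocated slots
refcnt, type_ptr, length, items_ptr, allocated = dump_struct(obj, 'LLLLL')

# The item array holds one pointer per element; they match the elements' ids
item_ids = struct.unpack('L' * length, ctypes.string_at(items_ptr, 4 * length))
print item_ids == tuple(id(x) for x in obj)  # prints True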
What is the result?
| Type | CPython name | Format | Format for nested objects | 32-bit size | 64-bit size | Memory for GC* |
|------|--------------|--------|---------------------------|-------------|-------------|----------------|
| int | PyIntObject | Lll | | 12 | 24 | |
| float | PyFloatObject | Lld | | 16 | 24 | |
| str | PyStringObject | LLLli + c * (length + 1) | | 21 + length | 37 + length | |
| unicode | PyUnicodeObject | Llllll | L * (length + 1) | 28 + 4 * length | 52 + 4 * length | |
| tuple | PyTupleObject | LLL + L * length | | 12 + 4 * length | 24 + 8 * length | yes |
| list | PyListObject | L * 5 | L * length | 20 + 4 * length | 40 + 8 * length | yes |
| set / frozenset | PySetObject | L * 7 + (lL) * 8 + lL | LL * length | 100 (<= 5 items), 100 + 8 * length (> 5 items) | 200 (<= 5 items), 200 + 16 * length (> 5 items) | yes |
| dict | PyDictObject | L * 7 + (lLL) * 8 | lLL * length | 124 (<= 5 items), 124 + 12 * length (> 5 items) | 248 (<= 5 items), 248 + 24 * length (> 5 items) | yes |
* GC tracking adds 12 bytes per object on a 32-bit machine and 32 bytes on a 64-bit machine.
We can see that simple data types in Python are two to three times larger than their C counterparts. The difference comes from having to store the reference count and a pointer to the object's type (the contents of the PyObject_HEAD macro). This is partially offset by internal caching, which allows previously created objects to be reused (possible only for immutable types).
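For example, CPython keeps a cache of small integers (roughly the range -5 to 256), so reusing such a value does not create a new object:

>>> a = 256
>>> b = 256
>>> a is b        # both names refer to the same cached int object
True
>>> a = 100000
>>> b = 100000
>>> a is b        # a larger integer is created anew each time
False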
For strings and tuples the difference is not as significant: only a constant overhead is added.
Lists, dictionaries and sets, as a rule, take up about a third more memory than they strictly need. This is a consequence of how the insertion of new elements is implemented: the algorithm sacrifices memory in order to save processor time.
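The over-allocation is easy to observe with a list: the size reported by sys.getsizeof() grows in occasional jumps rather than by 4 bytes per append. A short sketch:

import sys

lst = []
sizes = []
for i in range(32):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

# append() over-allocates the internal pointer array so that most appends
# do not need to reallocate it; hence only a handful of distinct sizes
print sorted(set(sizes))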
So, to answer the question from the beginning of the article: storing 1 million integers takes 11.4 megabytes (12 * 10^6 bytes) for the numbers themselves, plus roughly 3.8 megabytes (12 + 4 + 4 * 10^6 bytes) for the tuple that keeps references to them.
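That estimate can be checked directly with sys.getsizeof(). A rough sketch (the exact totals depend on the build and on how many of the integers fall into the small-integer cache):

import sys

N = 10 ** 6
numbers = tuple(range(N))

# The tuple object itself: header plus one pointer per element (plus GC head)
container_bytes = sys.getsizeof(numbers)

# The integer objects the tuple points to (range() yields distinct values,
# so nothing is counted twice here)
content_bytes = sum(sys.getsizeof(n) for n in numbers)

print container_bytes, content_bytes, container_bytes + content_bytes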
UPD: Typos.
UPD: In the subtitle, "1 million integers" instead of "1 million prime numbers".