For the last two years I have been solving all my problems exclusively in Cython. This does not mean that I write in Python and then "Cythonize" the code with various type declarations; no, I simply write in Cython. I use raw C structs and arrays (and sometimes C++ vectors), plus a small wrapper around malloc/free that I wrote myself. The code runs almost as fast as C/C++ because it is C/C++ code, just decorated with syntactic sugar. It is C/C++ code with Python functionality exactly where I need it and where I want it.
In fact, this is the reverse of the standard way of using Python-like languages: you write the entire application in Python, optimize the critical parts in C, and... profit! You get the speed of C and the convenience of Python; the wolves are fed and the sheep are safe.
In theory, this always looks better than it does in practice. In practice, your data structures have a huge impact on both the efficiency of your code and how hard it is to write. Arrays are always a pain to work with, but they are fast. Lists are extremely convenient, but very slow. Loops and function calls in Python are always slow, so the part of the application you write in C tends to grow and grow until almost the whole application is written in C.
A post about writing C extensions for Python was published recently. The author implemented an algorithm in pure Python and in C, using the NumPy C API. I decided this was a good opportunity to demonstrate the difference, so for comparison I wrote my own version in Cython:
    import random

    from cymem.cymem cimport Pool
    from libc.math cimport sqrt
    cimport cython


    cdef struct Point:
        double x
        double y


    cdef class World:
        cdef Pool mem
        cdef int N
        cdef double* m
        cdef Point* r
        cdef Point* v
        cdef Point* F
        cdef readonly double dt

        def __init__(self, N, threads=1, m_min=1, m_max=30.0, r_max=50.0,
                     v_max=4.0, dt=1e-3):
            # The Pool owns every block allocated below.
            self.mem = Pool()
            self.N = N
            self.m = <double*>self.mem.alloc(N, sizeof(double))
            self.r = <Point*>self.mem.alloc(N, sizeof(Point))
            self.v = <Point*>self.mem.alloc(N, sizeof(Point))
            self.F = <Point*>self.mem.alloc(N, sizeof(Point))
            for i in range(N):
                self.m[i] = random.uniform(m_min, m_max)
                self.r[i].x = random.uniform(-r_max, r_max)
                self.r[i].y = random.uniform(-r_max, r_max)
                self.v[i].x = random.uniform(-v_max, v_max)
                self.v[i].y = random.uniform(-v_max, v_max)
                self.F[i].x = 0
                self.F[i].y = 0
            self.dt = dt


    @cython.cdivision(True)
    def compute_F(World w):
        """Compute the force on each body in the world, w."""
        cdef int i, j
        cdef double s3, tmp
        cdef Point s
        cdef Point F
        for i in range(w.N):
            ...  # the loop body is cut off in the source
This Cython version took 30 minutes to write, and it is as fast as the C code. And why wouldn't it be? It is C code, just written with syntactic sugar. You don't even need to study the complicated and hostile C API; you just... write C or C++ code. Both versions, C and Cython, are about 70 times faster than the pure Python version, even though that version uses NumPy arrays.
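For a sense of what the slow baseline looks like, here is a plain-Python sketch of a pairwise force loop. The listing above is cut off before its loop body, so this is not the author's code: it assumes a Newtonian attraction with G = 1, and the name `compute_forces` and its list-based arguments are made up for illustration.

```python
import math


def compute_forces(masses, positions):
    """Return one [Fx, Fy] pair per body for pairwise attraction (G = 1, assumed)."""
    n = len(masses)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Separation vector from body i to body j.
            sx = positions[j][0] - positions[i][0]
            sy = positions[j][1] - positions[i][1]
            # |s|^3, so that m_i * m_j * s / |s|^3 has magnitude m_i * m_j / |s|^2.
            s3 = math.sqrt(sx * sx + sy * sy) ** 3
            scale = masses[i] * masses[j] / s3
            fx, fy = scale * sx, scale * sy
            # Newton's third law: equal and opposite forces on the pair.
            forces[i][0] += fx
            forces[i][1] += fy
            forces[j][0] -= fx
            forces[j][1] -= fy
    return forces
```

Every pass through the inner loop here goes through the interpreter and works on boxed Python floats, which is exactly the overhead that a typed loop over C structs avoids.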
There is only one difference from plain C: I use a small malloc/free wrapper that I wrote myself, cymem. It remembers the memory addresses it has handed out, and when the owning Pool object is garbage-collected it frees them. Since I started using this wrapper, I have never had problems with memory leaks.
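The idea behind such a wrapper can be sketched in plain Python with ctypes. This is not cymem's implementation: `TrackingPool` and `free_all` are hypothetical names, and the sketch assumes a POSIX system where the C runtime's `calloc` and `free` can be loaded through ctypes.

```python
import ctypes
import ctypes.util

# Assumption: a POSIX C runtime that exposes calloc() and free().
_libc = ctypes.CDLL(ctypes.util.find_library("c"))
_libc.calloc.restype = ctypes.c_void_p
_libc.calloc.argtypes = [ctypes.c_size_t, ctypes.c_size_t]
_libc.free.restype = None
_libc.free.argtypes = [ctypes.c_void_p]


class TrackingPool:
    """Hypothetical pool: remembers every block it hands out and frees them together."""

    def __init__(self):
        self.addresses = {}  # address -> size in bytes

    def alloc(self, number, elem_size):
        """Allocate zeroed memory for `number` elements and record its address."""
        addr = _libc.calloc(number, elem_size)
        if not addr:
            raise MemoryError("calloc failed")
        self.addresses[addr] = number * elem_size
        return addr

    def free_all(self):
        """Release every block this pool still owns."""
        for addr in list(self.addresses):
            _libc.free(addr)
        self.addresses.clear()

    def __del__(self):
        # When the pool itself is collected, its memory goes with it.
        self.free_all()
```

Because the pool is stored as an attribute of the object that uses the memory, the lifetime of the raw blocks is tied to the lifetime of that object, and no block needs a matching free call of its own.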
A middle-ground way of writing Cython is to use typed memoryviews, which let you work with multidimensional NumPy arrays. To me, however, that looks more complicated. I usually work with simpler arrays in my applications, and I prefer to define my own data structures.
Translated by Dreadatour.