📜 ⬆️ ⬇️

We write code C on Cython

For the last two years I have been solving all problems exclusively in Cython . This does not mean that I write on Python, and then “Sitonize” it using various type declarations, no, I just write in Cython. I use raw C structures and arrays (and sometimes C ++ vectors) and a small wrapper around malloc / free, which I wrote myself. The code works almost as fast as C / C ++, because this is C / C ++ code decorated with syntactic sugar. This is C / C ++ code with Python functionality exactly where I need it and where I want it.

In fact, this is the opposite version of the standard use of languages ​​similar to Python: you write the entire application on Python, optimize important places in C and ... Profit! Speed ​​C, the convenience of Python, the sheep are intact, the wolves are fed.

In theory, it always looks better than in practice. In practice, your data structures have a huge impact on the efficiency of your code and the complexity of its writing. Working with arrays is always a pain, but they are fast. Lists are extremely convenient, but very slow. The cycles and function calls in Python are always slow, so the part of the application you write in C tends to grow and grow until almost all of your application is written in C.
')
A post about writing C Python extensions has recently been published. The author wrote the implementation of the algorithm on pure Python and C, using the Numpy C API. I decided that this was a good opportunity to demonstrate the differences, and, for comparison, I wrote my own version in Cython:

import random from cymem.cymem cimport Pool from libc.math cimport sqrt cimport cython cdef struct Point: double x double y cdef class World: cdef Pool mem cdef int N cdef double* m cdef Point* r cdef Point* v cdef Point* F cdef readonly double dt def __init__(self, N, threads=1, m_min=1, m_max=30.0, r_max=50.0, v_max=4.0, dt=1e-3): self.mem = Pool() self.N = N self.m = <double*>self.mem.alloc(N, sizeof(double)) self.r = <Point*>self.mem.alloc(N, sizeof(Point)) self.v = <Point*>self.mem.alloc(N, sizeof(Point)) self.F = <Point*>self.mem.alloc(N, sizeof(Point)) for i in range(N): self.m[i] = random.uniform(m_min, m_max) self.r[i].x = random.uniform(-r_max, r_max) self.r[i].y = random.uniform(-r_max, r_max) self.v[i].x = random.uniform(-v_max, v_max) self.v[i].y = random.uniform(-v_max, v_max) self.F[i].x = 0 self.F[i].y = 0 self.dt = dt @cython.cdivision(True) def compute_F(World w): """Compute the force on each body in the world, w.""" cdef int i, j cdef double s3, tmp cdef Point s cdef Point F for i in range(wN): # Set all forces to zero. wF[i].x = 0 wF[i].y = 0 for j in range(i+1, wN): sx = wr[j].x - wr[i].x sy = wr[j].y - wr[i].y s3 = sqrt(sx * sx + sy * sy) s3 *= s3 * s3; tmp = wm[i] * wm[j] / s3 Fx = tmp * sx Fy = tmp * sy wF[i].x += Fx wF[i].y += Fy wF[j].x -= Fx wF[j].y -= Fy @cython.cdivision(True) def evolve(World w, int steps): """Evolve the world, w, through the given number of steps.""" cdef int _, i for _ in range(steps): compute_F(w) for i in range(wN): wv[i].x += wF[i].x * w.dt / wm[i] wv[i].y += wF[i].y * w.dt / wm[i] wr[i].x += wv[i].x * w.dt wr[i].y += wv[i].y * w.dt 

This Cython version was written in 30 minutes, and it is as fast as the C code. Actually, why not, because this is C code just written using syntactic sugar. And you don't even need to think about the complex and hostile C API and study it, you just ... just write C or C ++ code. Both versions, C and Cython, are about 70 times faster than the pure Python version, given that it uses Numpy arrays.

Only one difference from C: I use a small malloc / free wrapper that I wrote myself - cymem . It remembers used memory addresses, and when the garbage collector is triggered it simply frees up unnecessary memory. Since I started using this wrapper, I have never had problems with memory leaks.

The intermediate option to write in Cython is to use typed memory-views, which allows you to work with multidimensional Numpy arrays. However, for me it looks more complicated. I usually work with simpler arrays in my applications, and I prefer to define my own data structures.

Translated Dreadatour , text read %% username.

Source: https://habr.com/ru/post/242533/


All Articles