
Python, scipy.weave and openMP: speeding up the code

Hello, %username%. This article is about speeding up mathematical calculations in Python using scipy.weave and openMP.

Many may ask: “Why use Python for mathematical calculations at all?” We will not try to answer such “eternal” questions here, nor will we consider other solutions to this problem, such as psyco.

Tools


As stated above, our tools are the scipy.weave library and openMP.
scipy is a set of libraries for scientific and applied-mathematics computing. openMP is an open standard for parallelizing programs in C, C++ and Fortran.

Package installation


On Debian-like Linux systems, run:
  apt-get install python-scipy
  apt-get install libgomp1

Method


To speed up the computation, reimplement the bottleneck of the Python code (usually a loop that performs some operations on a matrix) in C and add openMP directives to parallelize it.

Example


I think the best way to be convinced that this method works is to see it applied to the following problem:

Python implementation


In Python with numpy, leaving aside preparatory steps such as matrix generation, this task is solved in a couple of lines of code:
  # loop over the rows of the matrix, where i is the row number
  # c - integer, randRow - random vector
  for i in xrange(N):
      matrix[i, :] -= c * randRow
Generating a random x-by-y matrix (in our case x = y):
  # generate a random x by y matrix
  # matrix elements are random numbers from 0 to 99 inclusive
  # (assumes randint is numpy.random.randint, whose upper bound is exclusive, and array is numpy.array)
  def randMat(x, y):
      randRaw = lambda a: [randint(0, 100) for i in xrange(0, a)]
      randConst = lambda x, y: [randRaw(x) for e in xrange(0, y)]
      return array(randConst(x, y))
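
The snippets above target Python 2 (xrange) and numpy. For readers without that setup, here is a minimal Python 3 sketch of the same row operation using only the standard library. It assumes matrix elements should lie between 0 and 99, as the comment above states; the names rand_mat and subtract_row are illustrative and do not come from the original source.

```python
import random

def rand_mat(x, y, rng=random):
    """Return y rows of x random integers, each in 0..99 inclusive."""
    return [[rng.randrange(100) for _ in range(x)] for _ in range(y)]

def subtract_row(matrix, c, rand_row):
    """Subtract c * rand_row from every row of matrix, in place."""
    for row in matrix:
        for j, value in enumerate(rand_row):
            row[j] -= c * value
    return matrix

if __name__ == "__main__":
    random.seed(0)                 # fixed seed so repeated runs match
    n = 4
    matrix = rand_mat(n, n)
    rand_row = rand_mat(n, 1)[0]   # a single random row of length n
    subtract_row(matrix, 3, rand_row)
    print(matrix)
```

This pure-list version is only a readable reference; it is not meant to compete with numpy or the C variants on speed.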

Implementation with scipy.weave, without openMP


scipy.weave is the part of the scipy library that lets you use C/C++ code inside Python code.
It works as follows:
  from scipy import weave

  # C code
  codeC = """
  int i = 0;
  for (i = 0; i < N * M; i++) {
      /* matrix is a flat array here; element i belongs to column i % M */
      matrix[i] = matrix[i] - (c * randRow[i % M]);
  }
  """
  weave.inline(codeC, ['matrix', 'c', 'randRow', 'N', 'M'], compiler='gcc')

That is, the C code itself is stored as a multiline string, and the Python variables are passed to C as a list of strings containing their names. Note also that numpy arrays reach the C code as a flat vector rather than a matrix, which is why the code has one loop instead of two.
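
To see why a single loop over N * M elements is enough, here is a small Python 3 sketch (standard library only) of the index arithmetic. It assumes the array is stored row by row (C order), so flat index i corresponds to row i // M and column i % M; the helper names are illustrative.

```python
def subtract_row_flat(flat, c, rand_row, n, m):
    """Mirror of the C loop: one pass over the flattened N x M matrix."""
    for i in range(n * m):
        flat[i] = flat[i] - c * rand_row[i % m]
    return flat

def subtract_row_2d(matrix, c, rand_row):
    """Row-by-row version, as in the Python implementation."""
    return [[v - c * r for v, r in zip(row, rand_row)] for row in matrix]

matrix = [[1, 2, 3], [4, 5, 6]]            # N = 2 rows, M = 3 columns
rand_row = [10, 20, 30]
flat = [v for row in matrix for v in row]  # row-major flattening

flat_result = subtract_row_flat(flat, 1, rand_row, 2, 3)
rows_result = subtract_row_2d(matrix, 1, rand_row)

# Re-slice the flat result into rows and compare with the 2D version.
assert [flat_result[r * 3:(r + 1) * 3] for r in range(2)] == rows_result
```

If the array were not contiguous or not in C order, this mapping would no longer hold, so the flat loop relies on that layout.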

By the way, the generated C code can be found in /tmp/%user%/python2x_intermediate/compiler_x

Implementation with scipy.weave and openMP


Now we add openMP directives to the previous version and pass the missing parameters in the inline call, namely:
  # C and openMP code
  codeOpenMP = """
  int i = 0;
  omp_set_num_threads(2);
  #pragma omp parallel shared(matrix, randRow, c) private(i)
  {
      #pragma omp for
      for (i = 0; i < N * M; i++) {
          matrix[i] = matrix[i] - (c * randRow[i % M]);
      }
  }
  """
  ...
  weave.inline(codeOpenMP, ['matrix', 'c', 'randRow', 'N', 'M'],
               extra_compile_args=['-O3', '-fopenmp'],
               compiler='gcc',
               libraries=['gomp'],
               headers=['<omp.h>'])
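
As a rough illustration of what the "#pragma omp for" directive does with two threads, the Python 3 sketch below splits the flat index range into two contiguous chunks and hands each to a worker thread. This is an analogy only: the real chunking is chosen by the openMP runtime, the equal contiguous split is an assumption, and Python threads will not speed up this loop because of the global interpreter lock. The point is that each index is written by exactly one worker, which is why the loop needs no extra synchronization. All function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_bounds(total, workers):
    """Split range(total) into `workers` contiguous, nearly equal chunks."""
    base, extra = divmod(total, workers)
    bounds, start = [], 0
    for w in range(workers):
        stop = start + base + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def process_chunk(flat, c, rand_row, m, start, stop):
    """Body of the C loop, restricted to indices start..stop-1."""
    for i in range(start, stop):
        flat[i] -= c * rand_row[i % m]

def subtract_row_threads(flat, c, rand_row, m, workers=2):
    """Run the loop over the flat matrix with `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(process_chunk, flat, c, rand_row, m, a, b)
                   for a, b in chunk_bounds(len(flat), workers)]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return flat

flat = list(range(10))  # a 2 x 5 matrix, flattened
result = subtract_row_threads(flat, 2, [1, 1, 1, 1, 1], m=5)
```
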
Full source code with all implementations can be downloaded here.

Comparison of results


Running the source code above confirms that scipy.weave does increase speed on larger matrices; on the smallest one, 100x100, the pure Python version is still the fastest:
 Test on size: 100x100
	 Pure python: 0.0725984573364
	 Pure C: 0.303888320923
	 C plus OpenMP: 0.109100341797
	 Test - ok

 Test on size: 1000x1000
	 Pure python: 1.00839138031
	 Pure C: 0.506997108459
	 C plus OpenMP: 0.333213806152
	 Test - ok

 Test on size: 2000x2000
	 Pure python: 3.24151515961
	 Pure C: 2.10800170898
	 C plus OpenMP: 1.17690563202
	 Test - ok

 Test on size: 3000x3000
	 Pure python: 5.54490089417
	 Pure C: 4.61800098419
	 C plus OpenMP: 2.56960391998
	 Test - ok
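
The relative gain is easier to read as a ratio. The sketch below uses only the timings printed above and divides one time by another; a value above 1 means the variant in the denominator was faster. The function name speedups is illustrative.

```python
# Timings copied from the test output above: (pure Python, C, C plus OpenMP).
timings = {
    100: (0.0725984573364, 0.303888320923, 0.109100341797),
    1000: (1.00839138031, 0.506997108459, 0.333213806152),
    2000: (3.24151515961, 2.10800170898, 1.17690563202),
    3000: (5.54490089417, 4.61800098419, 2.56960391998),
}

def speedups(py, c, omp):
    """Return (Python/C, Python/OpenMP, C/OpenMP) time ratios."""
    return py / c, py / omp, c / omp

for size, (py, c, omp) in sorted(timings.items()):
    vs_c, vs_omp, omp_gain = speedups(py, c, omp)
    print("%dx%d: C %.2fx, C+OpenMP %.2fx, OpenMP over C %.2fx"
          % (size, size, vs_c, vs_omp, omp_gain))
```

On these figures the compiled versions only win from 1000x1000 upward, and the two-thread openMP variant is roughly 1.5 to 2.8 times faster than the single-threaded C code across all sizes.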


Source: https://habr.com/ru/post/135857/

