
Parallel Python: getting started

Disclaimer


The need was geographical: a friend had to transfer a piece of a map from one region of the Earth to another. Out of habit he did it in Delphi, but I wanted to try Python in action, a language in which I am no specialist.

Practice


Porting the algorithm itself turned out to be quite simple, but its speed left much to be desired.
First of all, Psyco went into action, speeding up processing about 6 times.
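For reference, enabling Psyco in its simplest, all-in mode is just two lines (a minimal sketch; Psyco only works on 32-bit Python 2.x builds):

 import psyco
 psyco.full()  # JIT-compile every function that gets called from here on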

It was not possible to get a better result without changing the algorithm, so the brute-force method came next: parallelizing the task.

A module called Parallel Python turned up. Hooking it up proved to be quite easy.

First import pp, and then (the first variant):
 ppservers = ()
 job_server = pp.Server(ppservers=ppservers)

 job_server.set_ncpus(2)
 print "Starting pp with", job_server.get_ncpus(), "workers"

 jobs = [job_server.submit(tighina_check, (), (find_geo_coords, compare, get_dist_bearing,), ("math",)) for i in range(3)]

 for job in jobs:
     job()

 job_server.print_stats()

The code largely speaks for itself: we use only the local server (in general the module can also parallelize across a network), ask for 2 worker processes, specify which function to run and which functions it depends on, import math into the workers and launch 3 jobs; at the end we print the statistics.
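To spell out the submit() arguments, here is the same call with each parameter commented (a sketch of the call above, nothing new):

 job = job_server.submit(
     tighina_check,                                  # function to execute in a worker
     (),                                             # its positional arguments
     (find_geo_coords, compare, get_dist_bearing,),  # functions it depends on
     ("math",),                                      # modules to import in the worker
 )
 result = job()  # calling the job object blocks until it finishes and returns the function's result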

The first snag: psyco got switched off inside the workers, which threw us right back to the starting position.
The solution was obvious: add the psyco import when creating a job
jobs = [job_server.submit(tighina_check, (), (find_geo_coords, compare, get_dist_bearing,), ("math", "psyco", )) for i in range(3)]
and call psyco.full() right inside tighina_check:
 def tighina_check():
     psyco.full()
     # and here is a lot of math


The second problem was quite unexpected.
The code in tighina_check was originally written around an import of the form "from math import sin, pow, cos, asin, sqrt, fabs, acos". But that did not work under pp, because pp builds the execution environment of a function only from the modules listed when the job is created. It seemed logical enough to rewrite all the sin calls as math.sin, and so on. Here came a mild surprise: the intensive, constant use of the math functions in this qualified form slowed things down 1.3-1.4 times.
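The overhead comes from the extra attribute lookup on every call; a rough way to see it for yourself (a hypothetical micro-benchmark, not from the original measurements):

 import timeit
 # a local name vs. an attribute lookup on the math module; exact numbers will vary
 print timeit.timeit("sin(0.5)", setup="from math import sin", number=10**6)
 print timeit.timeit("math.sin(0.5)", setup="import math", number=10**6)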

The solution was to bring the necessary functions into the built-in namespace manually at the start of each job:
 def tighina_check():
     psyco.full()
     math_func_imports = ['sin', 'cos', 'asin', 'acos', 'sqrt', 'pow', 'fabs']
     for func in math_func_imports:
         setattr(__builtins__, func, getattr(math, func))


The next thought was that it would be nice to speed up pp itself with psyco. To do that, pyworker.py from the package needs a small patch, adding at the beginning:
 import psyco
 psyco.full()


and replacing
  eval(__fobj)
with
  exec __fobj


With that patch there is no longer any need to import psyco when creating a job, nor, accordingly, to call psyco.full() inside the job.

All that remains is picking the right number of worker processes.
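One way to pick it is simply to measure. A rough sketch of such a sweep, assuming the same functions as above (pp.Server takes the worker count as its first argument, and destroy() shuts a server down):

 import time
 import pp

 for ncpus in (2, 4, 8, 16, 32):
     job_server = pp.Server(ncpus, ppservers=())
     start = time.time()
     jobs = [job_server.submit(tighina_check, (),
                               (find_geo_coords, compare, get_dist_bearing,),
                               ("math",))
             for i in range(100)]
     for job in jobs:
         job()
     print ncpus, "workers:", round(time.time() - start, 1), "seconds"
     job_server.destroy()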

What is the result?


100 jobs were launched.

Initial version (no parallelization, only psyco):
100 consecutive jobs: 257 seconds

2 processors (pp, psyco)
 Starting pp with 2 workers
 Job execution statistics:
  job count | % of all jobs | job time sum | time per job | job server
        100 |        100.00 |     389.8933 |     3.898933 | local
 Time elapsed since server creation 195.12789011


4 processors (pp, psyco)
 Starting pp with 4 workers
 Job execution statistics:
  job count | % of all jobs | job time sum | time per job | job server
        100 |        100.00 |     592.9463 |     5.929463 | local
 Time elapsed since server creation 148.77167201


I did not feel like testing further; it seemed that with 2 cores, each with hyper-threading, 4 workers would be the best option. But curiosity won out (and, as it turned out, not in vain):
8 processors (pp, psyco)
 Starting pp with 8 workers
 Job execution statistics:
  job count | % of all jobs | job time sum | time per job | job server
        100 |        100.00 |    1072.3920 |    10.723920 | local
 Time elapsed since server creation 137.681350946


16 processors (pp, psyco)

 Starting pp with 16 workers
 Job execution statistics:
  job count | % of all jobs | job time sum | time per job | job server
        100 |        100.00 |    2050.8158 |    20.508158 | local
 Time elapsed since server creation 133.345046043


32 processors (pp, psyco)

 Starting pp with 32 workers
 Job execution statistics:
  job count | % of all jobs | job time sum | time per job | job server
        100 |        100.00 |    4123.8550 |    41.238550 | local
 Time elapsed since server creation 136.022897005


So, at best, 133 seconds versus 257 for the original version: a speedup of 1.93x for our particular task from parallelization alone.

It should be noted that all 100 jobs are independent of one another and do not need to "communicate", which simplifies the task and improves the speedup.

Final code examples:
 ppservers = ()
 job_server = pp.Server(ppservers=ppservers)

 job_server.set_ncpus(16)
 print "Starting pp with", job_server.get_ncpus(), "workers"

 jobs = [job_server.submit(tighina_check, (), (find_geo_coords, compare, get_dist_bearing,), ("math",)) for i in range(100)]

 for job in jobs:
     job()

 job_server.print_stats()


 def tighina_check():
     math_func_imports = ['sin', 'cos', 'asin', 'acos', 'sqrt', 'pow', 'fabs']
     for func in math_func_imports:
         setattr(__builtins__, func, getattr(math, func))

     # and here is a lot of math

Source: https://habr.com/ru/post/61916/

