
Profiling and optimizing symbolic computations for a future server

Hi, Habr! Today I want to share a small experience of choosing tools for organizing calculations on a future server. Note right away that this post is not about the server itself, but about optimizing the symbolic calculations it will perform.

Task


There is a feature that lets users build formulas, often cumbersome ones of the general form below, which later have to be evaluated in response to other users' requests.

Typical formula. Used for profiling
8.348841409877572e-11 * x1_ * x2_ * x3_ * x4_ * x5_ * x6_ * x7_ - 3.480284409621004e-9 * x1_ * x2_ * x3_ * x4_ * x5_ * x6_ - 1.44049340858321e-9 * x1_ * x2_ * x3_ * x4_ * x5_ * x7_ + 6.004816835089577e-8 * x1_ * x2_ * x3_ * x4_ * x5_ - 2.674192940005371e-9 * x1_ * x2_ * x3_ * x4_ * x6_ * x7_ + 1.1147596343241695e-7 * x1_ * x2_ * x3_ * x4_ * x6_ + 4.614001865646533e-8 * x1_ * x2_ * x3_ * x4_ * x7_ - 1.92338517189701e-6 * x1_ * x2_ * x3_ * x4_ + ... [the middle of the polynomial is garbled in the source; it continues in the same pattern over products of x1_ ... x7_] ... + 8.450760663750168e-5 * x5_ * x6_ * x7_ - 0.003522770085887094 * x5_ * x6_ - 0.0014580782487184623 * x5_ * x7_ + 0.060781208246755536 * x5_ - 0.0027068382268636976 * x6_ * x7_ + 0.11283680975413288 * x6_ + 0.04670327439658878 * x7_ + 0.5527559695044361

The formula arrives as a string, must be saved on the server, and is then invoked by user requests. User requests are assumed to pass the parameters x1_, x2_, ... as a plain list of values. The task is to choose a way of organizing such calculations that minimizes execution time.

Feature 1


The formulas themselves take quite a long time to build (up to a couple of minutes for the formula above), so the time spent processing and storing an incoming formula string is not critical for this task (it will be shown below that these times differ by orders of magnitude).

Feature 2


It is assumed that the bulk of requests will be batched: a single request may carry several sets of x1_, x2_, ... values to be evaluated with the same formula.

Instruments


Programming language: Python 3.x. DBMS: Redis (NoSQL).

A few words about Redis. In my opinion, this task is a great fit for it: a user creates a formula; the formula is processed and sent to the store; later it is fetched from the store and evaluated whenever someone wants to use it; the values passed in a request are substituted into the formula and the result is returned. That's all. The only thing a user who wants to compute something needs to know is the number of unique variables in the formula. Redis has a built-in hash type, so why not use it?

An example of using Python + Redis
 import redis
 
 r = redis.StrictRedis(host='localhost', port=6379, db=0)  # connect to Redis
 r.hset('expr:1', 'expr', expr)    # store the expression under key 'expr:1'
 r.hset('expr:1', 'params', num)   # store the number of parameters under 'expr:1'
 r.hget('expr:1', 'expr')          # read the expression from 'expr:1'
 r.hget('expr:1', 'params')        # read the number of parameters from 'expr:1'


To work with the formulas themselves, we will use the wonderful Sympy library, which can translate a string formula into a symbolic expression and perform the necessary calculations (in general, the library opens up a huge mathematical functionality for working with symbolic expressions).
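To make the round trip concrete, here is a minimal sketch: a string is parsed into a symbolic expression, which can then be evaluated (a toy two-variable formula, not the article's):

```python
from sympy import sympify, symbols

expr = sympify("2*x1_*x2_ + 3*x1_")    # parse the string into a SymPy expression
x1, x2 = symbols("x1_ x2_")
value = expr.subs([(x1, 2), (x2, 5)])  # substitute concrete values
print(value)                           # -> 26
```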

Profiling and optimization


To measure the execution time of code sections, we will use the following class (borrowed from somewhere on the Internet):

 class Profiler(object):
     # simple context-manager timer
     def __init__(self, info=''):
         self.info = info
     def __enter__(self):
         self._startTime = time()
     def __exit__(self, type, value, traceback):
         print(self.info, "Elapsed time: {:.3f} sec".format(time() - self._startTime))

Let's go. For the purity of the experiment we set num_iter = 1000, the number of test iterations.

Let's test the profiler on reading the formula string from a file:

 with Profiler('read (' + str(num_iter) + '): cycle'):
     for i in range(num_iter):
         f = open('expr.txt')
         expr_txt = f.read()
         f.close()
 
 >>read (1000): cycle Elapsed time: 0.014 sec

The formula string is loaded. Now let's determine how many variables it contains and which ones they are (we need to know which value goes into which variable):

 with Profiler('find unique sorted symbols (' + str(num_iter) + '): cycle'):
     for i in range(num_iter):
         symbols_set = set()
         result = re.findall(r"x\d_", expr_txt)
         for match in result:
             symbols_set.add(match)
         symbols_set = sorted(symbols_set)
         symbols_list = symbols(symbols_set)
 
 >>find unique sorted symbols (1000): cycle Elapsed time: 0.156 sec

The resulting time is quite satisfactory. Now let's convert the formula string into a symbolic expression:

 with Profiler('sympify'):
     expr = sympify(expr_txt)
 
 >>sympify Elapsed time: 0.426 sec

In this form, it can already be used for calculations. Let's try:

 with Profiler('subs cycle (' + str(num_iter) + '): cycle'):
     for i in range(num_iter):
         expr_copy = copy.copy(expr)
         for x in symbols_list:
             expr_copy = expr_copy.subs(x, 1)
 
 >>subs cycle (1000): cycle Elapsed time: 0.245 sec

There is a subtlety here: sympy does not seem to be able (?) to substitute values for all variables of a symbolic expression in one go, so a loop over the variables is needed. After execution, expr_copy holds a real number.
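For what it's worth, subs also accepts a dict (or iterable of pairs), which can replace the per-variable loop; a toy sketch with a hypothetical three-variable expression (not the article's formula):

```python
from sympy import sympify, symbols

x1, x2, x3 = symbols("x1_ x2_ x3_")
expr = sympify("x1_*x2_ + x3_")

# per-variable loop, as in the article
looped = expr
for s, v in [(x1, 1), (x2, 2), (x3, 3)]:
    looped = looped.subs(s, v)

# single call with a dict of all substitutions
at_once = expr.subs({x1: 1, x2: 2, x3: 3})
print(looped, at_once)  # -> 5 5
```

For purely numeric substitutions like these, both forms give the same result.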

In sympy, a symbolic expression can be converted into a lambda function backed by the numpy module, which in theory should speed up the calculations. Let's convert it:

 with Profiler('lambdify'):
     func = lambdify(tuple(symbols_list), expr, 'numpy')  # returns a numpy-ready function
 
 >>lambdify Elapsed time: 0.114 sec

That did not take too long, which is encouraging. Now let's check how fast the calculations are:

 with Profiler('subs cycle (' + str(num_iter) + '): lambdify'):
     for i in range(num_iter):
         func(*[1 for i in range(len(symbols_set))])
 
 >>subs cycle (1000): lambdify Elapsed time: 0.026 sec

Now that's more like it! Almost an order of magnitude faster. Especially appealing given the batched requests (feature 2). Just in case, let's check that the values match:

 print('exp1 == exp2:', round(expr_copy, 12) == round(func(*[1 for i in range(len(symbols_set))]), 12))
 
 >>exp1 == exp2: True

Conclusion 1


Storing the formula as a string is impractical: converting it before each calculation takes too long. It makes sense to store either the symbolic expression or the lambda function.

Let's try to deal with storage. The symbolic expression is an instance of a sympy class; the lambda function is also an object (I did not dig into the details). We will try to serialize them with the built-in pickle, as well as cloudpickle and dill:

 with Profiler('pickle_dumps cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         pickle_dump = pickle.dumps(expr)
 with Profiler('pickle_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         pickle.loads(pickle_dump)
 print()
 with Profiler('cloudpickle_dumps cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         cloudpickle_dump = cloudpickle.dumps(expr)
 with Profiler('cloudpickle_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         cloudpickle.loads(cloudpickle_dump)
 print()
 with Profiler('dill_dumps cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         dill_dump = dill.dumps(expr)
 with Profiler('dill_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         dill.loads(dill_dump)
 
 >>pickle_dumps cycle (1000): sympifyed expr Elapsed time: 0.430 sec
 >>pickle_loads cycle (1000): sympifyed expr Elapsed time: 2.320 sec
 >>
 >>cloudpickle_dumps cycle (1000): sympifyed expr Elapsed time: 7.584 sec
 >>cloudpickle_loads cycle (1000): sympifyed expr Elapsed time: 2.314 sec
 >>
 >>dill_dumps cycle (1000): sympifyed expr Elapsed time: 8.259 sec
 >>dill_loads cycle (1000): sympifyed expr Elapsed time: 2.806 sec

Note how quickly pickle serializes symbolic expressions compared to its peers. Deserialization times differ, but not as significantly. Now let's test serialization/deserialization combined with storing to and loading from Redis. Note that pickle failed to serialize/deserialize the lambda function.
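The pickle failure is easy to reproduce with any anonymous function; a minimal sketch, with a plain lambda standing in for the lambdify-generated function:

```python
import pickle

f = lambda x: 2 * x  # stand-in for a lambdify-generated function

try:
    pickle.dumps(f)
    picklable = True
except (pickle.PicklingError, AttributeError):
    picklable = False

print(picklable)  # -> False
# The stdlib pickle serializes functions by reference (module + name), which
# fails for lambdas and dynamically generated functions; cloudpickle and dill
# serialize the function body itself, which is why they succeed here.
```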

 with Profiler('redis_set cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         r.set('expr', pickle_dump)
 with Profiler('redis_get cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         r.get('expr')
 print()
 with Profiler('pickle_dumps + redis_set cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         r.set('expr', pickle.dumps(expr))
 with Profiler('redis_get + pickle_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         pickle.loads(r.get('expr'))
 print()
 with Profiler('cloudpickle_dumps + redis_set cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         r.set('expr', cloudpickle.dumps(expr))
 with Profiler('redis_get + cloudpickle_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         cloudpickle.loads(r.get('expr'))
 print()
 with Profiler('dill_dumps + redis_set cycle (' + str(num_iter) + '): lambdifyed expr'):
     for i in range(num_iter):
         r.set('expr', dill.dumps(expr))
 with Profiler('redis_get + dill_loads cycle (' + str(num_iter) + '): lambdifyed expr'):
     for i in range(num_iter):
         dill.loads(r.get('expr'))
 
 >>redis_set cycle (1000): sympifyed expr Elapsed time: 0.066 sec
 >>redis_get cycle (1000): sympifyed expr Elapsed time: 0.051 sec
 >>
 >>pickle_dumps + redis_set cycle (1000): sympifyed expr Elapsed time: 0.524 sec
 >>redis_get + pickle_loads cycle (1000): sympifyed expr Elapsed time: 2.437 sec
 >>
 >>cloudpickle_dumps + redis_set cycle (1000): sympifyed expr Elapsed time: 7.659 sec
 >>redis_get + cloudpickle_loads cycle (1000): sympifyed expr Elapsed time: 2.492 sec
 >>
 >>dill_dumps + redis_set cycle (1000): lambdifyed expr Elapsed time: 8.333 sec
 >>redis_get + dill_loads cycle (1000): lambdifyed expr Elapsed time: 2.932 sec

cloudpickle and dill both coped with serializing/deserializing the lambda functions (in the example above, however, cloudpickle was working with the symbolic expression).

Conclusion 2


Redis shows good single-threaded read/write performance over 1000 values. To make the final choice, we need to profile the complete chains of actions, from receiving the formula string to returning the computed value to the user:

 print('\nFINAL performance test:')
 with Profiler('sympify + pickle_dumps_sympifyed_expr + redis_set cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         expr = sympify(expr_txt)
         r.set('expr', pickle.dumps(expr))
 with Profiler('redis_get + pickle_loads_sympifyed_expr + subs cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         loaded_expr = pickle.loads(r.get('expr'))
         expr_copy = copy.copy(loaded_expr)
         for x in symbols_list:
             expr_copy = expr_copy.subs(x, 1)
 with Profiler('sympify + lambdify + dill_dumps_lambdifyed_expr + redis_set cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         expr = sympify(expr_txt)
         func = lambdify(tuple(symbols_list), expr, 'numpy')
         r.set('expr', dill.dumps(expr))
 with Profiler('redis_get + dill_loads_lambdifyed_expr + subs cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         loaded_expr = dill.loads(r.get('expr'))
         func(*[1 for i in range(len(symbols_set))])
 with Profiler('sympify + cloudpickle_dumps_sympifyed_expr + redis_set cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         expr = sympify(expr_txt)
         r.set('expr', cloudpickle.dumps(expr))
 with Profiler('redis_get + cloudpickle_loads_sympifyed_expr + subs cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         loaded_expr = cloudpickle.loads(r.get('expr'))
         expr_copy = copy.copy(loaded_expr)
         for x in symbols_list:
             expr_copy = expr_copy.subs(x, 1)
 with Profiler('sympify + lambdify + cloudpickle_dumps_lambdifyed_expr + redis_set cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         expr = sympify(expr_txt)
         func = lambdify(tuple(symbols_list), expr, 'numpy')
         r.set('expr', cloudpickle.dumps(expr))
 with Profiler('redis_get + cloudpickle_loads_lambdifyed_expr + subs cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         loaded_expr = cloudpickle.loads(r.get('expr'))
         func(*[1 for i in range(len(symbols_set))])
 
 >>FINAL performance test:
 >>sympify + pickle_dumps_sympifyed_expr + redis_set cycle (1000): Elapsed time: 15.075 sec
 >>redis_get + pickle_loads_sympifyed_expr + subs cycle (1000): Elapsed time: 2.929 sec
 >>sympify + lambdify + dill_dumps_lambdifyed_expr + redis_set cycle (1000): Elapsed time: 87.707 sec
 >>redis_get + dill_loads_lambdifyed_expr + subs cycle (1000): Elapsed time: 2.356 sec
 >>sympify + cloudpickle_dumps_sympifyed_expr + redis_set cycle (1000): Elapsed time: 23.633 sec
 >>redis_get + cloudpickle_loads_sympifyed_expr + subs cycle (1000): Elapsed time: 3.059 sec
 >>sympify + lambdify + cloudpickle_dumps_lambdifyed_expr + redis_set cycle (1000): Elapsed time: 86.739 sec
 >>redis_get + cloudpickle_loads_lambdifyed_expr + subs cycle (1000): Elapsed time: 1.721 sec

Conclusion 3


Creating the lambda function and serializing it with cloudpickle was, of course, the slowest path, BUT, recalling that processing and storage time are non-critical (feature 1)... well done, cloudpickle! Within a single thread it managed to fetch from the database, deserialize and evaluate 1000 times in 1.7 seconds. That is quite good, given the complexity of the original formula string.

Let's estimate performance for batched queries. We will vary the number of parameter sets per request by orders of magnitude, hoping to improve the result:

 print('\nTEST performance for complex requests:')
 for x in [1, 10, 100, 1000]:
     with Profiler('redis_get + cloudpickle_loads_lambdifyed_expr + ' + str(x) + '*subs cycle (' + str(round(num_iter/x)) + '): '):
         for i in range(round(num_iter/x)):
             loaded_expr = cloudpickle.loads(r.get('expr'))
             for j in range(x):
                 func(*[1 for i in range(len(symbols_set))])
 
 >>TEST performance for complex requests:
 >>redis_get + cloudpickle_loads_lambdifyed_expr + 1*subs cycle (1000): Elapsed time: 1.768 sec
 >>redis_get + cloudpickle_loads_lambdifyed_expr + 10*subs cycle (100): Elapsed time: 0.204 sec
 >>redis_get + cloudpickle_loads_lambdifyed_expr + 100*subs cycle (10): Elapsed time: 0.046 sec
 >>redis_get + cloudpickle_loads_lambdifyed_expr + 1000*subs cycle (1): Elapsed time: 0.028 sec

The result looks quite viable. The calculations were performed on a virtual machine with the following characteristics: Ubuntu 16.04.2 LTS, Intel Core i7-4720HQ CPU @ 2.60GHz (1 core allocated), DDR3-1600 (1 GB allocated).
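One reason the numpy-backed function pays off for batched requests is that it can take whole arrays: a single call evaluates every parameter set at once instead of looping in Python. A toy sketch with a hypothetical two-variable formula (not the article's):

```python
import numpy as np
from sympy import sympify, symbols
from sympy.utilities.lambdify import lambdify

x1, x2 = symbols("x1_ x2_")
func = lambdify((x1, x2), sympify("x1_*x2_ + x1_"), "numpy")

# three parameter sets evaluated in one vectorized call
batch_x1 = np.array([1.0, 2.0, 3.0])
batch_x2 = np.array([10.0, 20.0, 30.0])
print(func(batch_x1, batch_x2))  # -> [11. 42. 93.]
```

So for batched requests, passing columns of parameter values as arrays could avoid even the inner Python loop used in the test above.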

Conclusion


Thank you for reading! I will be glad to receive constructive criticism and interesting comments.

For profiling and optimizing the required computations, ideas and approaches were borrowed from here (a rather "weak" formula in the example, but a good set of tests) and here (information on serializing lambda functions).

Full text of tests performed, including library imports
 import redis
 import pickle
 import dill
 import cloudpickle
 import re
 import copy
 from time import time
 from sympy.utilities.lambdify import lambdify
 from sympy import sympify, symbols
 
 class Profiler(object):
     # simple context-manager timer
     def __init__(self, info=''):
         self.info = info
     def __enter__(self):
         self._startTime = time()
     def __exit__(self, type, value, traceback):
         print(self.info, "Elapsed time: {:.3f} sec".format(time() - self._startTime))
 
 num_iter = 1000
 dill.settings['recurse'] = True
 r = redis.StrictRedis(host='localhost', port=6379, db=0)
 
 with Profiler('read (' + str(num_iter) + '): cycle'):
     for i in range(num_iter):
         f = open('expr.txt')
         expr_txt = f.read()
         f.close()
 with Profiler('find unique sorted symbols (' + str(num_iter) + '): cycle'):
     for i in range(num_iter):
         symbols_set = set()
         result = re.findall(r"x\d_", expr_txt)
         for match in result:
             symbols_set.add(match)
         symbols_set = sorted(symbols_set)
         symbols_list = symbols(symbols_set)
 print()
 with Profiler('sympify'):
     expr = sympify(expr_txt)
 with Profiler('lambdify'):
     func = lambdify(tuple(symbols_list), expr, 'numpy')  # returns a numpy-ready function
 print()
 with Profiler('subs cycle (' + str(num_iter) + '): cycle'):
     for i in range(num_iter):
         expr_copy = copy.copy(expr)
         for x in symbols_list:
             expr_copy = expr_copy.subs(x, 1)
 with Profiler('subs cycle (' + str(num_iter) + '): lambdify'):
     for i in range(num_iter):
         func(*[1 for i in range(len(symbols_set))])
 print()
 print('exp1 == exp2:', round(expr_copy, 12) == round(func(*[1 for i in range(len(symbols_set))]), 12))
 print()
 with Profiler('pickle_dumps cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         pickle_dump = pickle.dumps(expr)
 with Profiler('pickle_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         pickle.loads(pickle_dump)
 print()
 with Profiler('cloudpickle_dumps cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         cloudpickle_dump = cloudpickle.dumps(expr)
 with Profiler('cloudpickle_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         cloudpickle.loads(cloudpickle_dump)
 print()
 with Profiler('dill_dumps cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         dill_dump = dill.dumps(expr)
 with Profiler('dill_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         dill.loads(dill_dump)
 print()
 # next, test serialization/deserialization combined with writing to / reading from redis
 with Profiler('redis_set cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         r.set('expr', pickle_dump)
 with Profiler('redis_get cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         r.get('expr')
 print()
 with Profiler('pickle_dumps + redis_set cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         r.set('expr', pickle.dumps(expr))
 with Profiler('redis_get + pickle_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         pickle.loads(r.get('expr'))
 print()
 with Profiler('cloudpickle_dumps + redis_set cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         r.set('expr', cloudpickle.dumps(expr))
 with Profiler('redis_get + cloudpickle_loads cycle (' + str(num_iter) + '): sympifyed expr'):
     for i in range(num_iter):
         cloudpickle.loads(r.get('expr'))
 print()
 with Profiler('dill_dumps + redis_set cycle (' + str(num_iter) + '): lambdifyed expr'):
     for i in range(num_iter):
         r.set('expr', dill.dumps(expr))
 with Profiler('redis_get + dill_loads cycle (' + str(num_iter) + '): lambdifyed expr'):
     for i in range(num_iter):
         dill.loads(r.get('expr'))
 
 print('\nFINAL performance test:')
 with Profiler('sympify + pickle_dumps_sympifyed_expr + redis_set cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         expr = sympify(expr_txt)
         r.set('expr', pickle.dumps(expr))
 with Profiler('redis_get + pickle_loads_sympifyed_expr + subs cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         loaded_expr = pickle.loads(r.get('expr'))
         expr_copy = copy.copy(loaded_expr)
         for x in symbols_list:
             expr_copy = expr_copy.subs(x, 1)
 with Profiler('sympify + lambdify + dill_dumps_lambdifyed_expr + redis_set cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         expr = sympify(expr_txt)
         func = lambdify(tuple(symbols_list), expr, 'numpy')
         r.set('expr', dill.dumps(expr))
 with Profiler('redis_get + dill_loads_lambdifyed_expr + subs cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         loaded_expr = dill.loads(r.get('expr'))
         func(*[1 for i in range(len(symbols_set))])
 with Profiler('sympify + cloudpickle_dumps_sympifyed_expr + redis_set cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         expr = sympify(expr_txt)
         r.set('expr', cloudpickle.dumps(expr))
 with Profiler('redis_get + cloudpickle_loads_sympifyed_expr + subs cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         loaded_expr = cloudpickle.loads(r.get('expr'))
         expr_copy = copy.copy(loaded_expr)
         for x in symbols_list:
             expr_copy = expr_copy.subs(x, 1)
 with Profiler('sympify + lambdify + cloudpickle_dumps_lambdifyed_expr + redis_set cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         expr = sympify(expr_txt)
         func = lambdify(tuple(symbols_list), expr, 'numpy')
         r.set('expr', cloudpickle.dumps(expr))
 with Profiler('redis_get + cloudpickle_loads_lambdifyed_expr + subs cycle (' + str(num_iter) + '): '):
     for i in range(num_iter):
         loaded_expr = cloudpickle.loads(r.get('expr'))
         func(*[1 for i in range(len(symbols_set))])
 
 print('\nTEST performance for complex requests:')
 for x in [1, 10, 100, 1000]:
     with Profiler('redis_get + cloudpickle_loads_lambdifyed_expr + ' + str(x) + '*subs cycle (' + str(round(num_iter/x)) + '): '):
         for i in range(round(num_iter/x)):
             loaded_expr = cloudpickle.loads(r.get('expr'))
             for j in range(x):
                 func(*[1 for i in range(len(symbols_set))])
 # r.set('expr', func)
 
 >>read (1000): cycle Elapsed time: 0.014 sec
 >>find unique sorted symbols (1000): cycle Elapsed time: 0.156 sec
 >>
 >>sympify Elapsed time: 0.426 sec
 >>lambdify Elapsed time: 0.114 sec
 >>
 >>subs cycle (1000): cycle Elapsed time: 0.245 sec
 >>subs cycle (1000): lambdify Elapsed time: 0.026 sec
 >>
 >>exp1 == exp2: True
 >>
 >>pickle_dumps cycle (1000): sympifyed expr Elapsed time: 0.430 sec
 >>pickle_loads cycle (1000): sympifyed expr Elapsed time: 2.320 sec
 >>
 >>cloudpickle_dumps cycle (1000): sympifyed expr Elapsed time: 7.584 sec
 >>cloudpickle_loads cycle (1000): sympifyed expr Elapsed time: 2.314 sec
 >>
 >>dill_dumps cycle (1000): sympifyed expr Elapsed time: 8.259 sec
 >>dill_loads cycle (1000): sympifyed expr Elapsed time: 2.806 sec
 >>
 >>redis_set cycle (1000): sympifyed expr Elapsed time: 0.066 sec
 >>redis_get cycle (1000): sympifyed expr Elapsed time: 0.051 sec
 >>
 >>pickle_dumps + redis_set cycle (1000): sympifyed expr Elapsed time: 0.524 sec
 >>redis_get + pickle_loads cycle (1000): sympifyed expr Elapsed time: 2.437 sec
 >>
 >>cloudpickle_dumps + redis_set cycle (1000): sympifyed expr Elapsed time: 7.659 sec
 >>redis_get + cloudpickle_loads cycle (1000): sympifyed expr Elapsed time: 2.492 sec
 >>
 >>dill_dumps + redis_set cycle (1000): lambdifyed expr Elapsed time: 8.333 sec
 >>redis_get + dill_loads cycle (1000): lambdifyed expr Elapsed time: 2.932 sec
 >>
 >>FINAL performance test:
 >>sympify + pickle_dumps_sympifyed_expr + redis_set cycle (1000): Elapsed time: 15.075 sec
 >>redis_get + pickle_loads_sympifyed_expr + subs cycle (1000): Elapsed time: 2.929 sec
 >>sympify + lambdify + dill_dumps_lambdifyed_expr + redis_set cycle (1000): Elapsed time: 87.707 sec
 >>redis_get + dill_loads_lambdifyed_expr + subs cycle (1000): Elapsed time: 2.356 sec
 >>sympify + cloudpickle_dumps_sympifyed_expr + redis_set cycle (1000): Elapsed time: 23.633 sec
 >>redis_get + cloudpickle_loads_sympifyed_expr + subs cycle (1000): Elapsed time: 3.059 sec
 >>sympify + lambdify + cloudpickle_dumps_lambdifyed_expr + redis_set cycle (1000): Elapsed time: 86.739 sec
 >>redis_get + cloudpickle_loads_lambdifyed_expr + subs cycle (1000): Elapsed time: 1.721 sec
 >>
 >>TEST performance for complex requests:
 >>redis_get + cloudpickle_loads_lambdifyed_expr + 1*subs cycle (1000): Elapsed time: 1.768 sec
 >>redis_get + cloudpickle_loads_lambdifyed_expr + 10*subs cycle (100): Elapsed time: 0.204 sec
 >>redis_get + cloudpickle_loads_lambdifyed_expr + 100*subs cycle (10): Elapsed time: 0.046 sec
 >>redis_get + cloudpickle_loads_lambdifyed_expr + 1000*subs cycle (1): Elapsed time: 0.028 sec


To use the code, you need the redis, dill, cloudpickle, numpy and sympy packages installed, and a Redis server running on localhost:6379.

Source: https://habr.com/ru/post/328170/

