Hello! Today we are sharing one more translation, prepared ahead of the launch of the course “Web Developer in Python”. Let's go!
Recently, I was very surprised when I discovered that
>>> pow(3,89)
runs slower than
>>> 3**89
I tried to come up with a plausible explanation, but could not. I measured the execution time of these two expressions with the timeit module in Python 3:
$ python3 -m timeit 'pow(3,89)'
500000 loops, best of 5: 688 nsec per loop
$ python3 -m timeit '3**89'
500000 loops, best of 5: 519 nsec per loop
The difference is small, roughly 0.17 microseconds, but it gave me no rest. When I can't explain something in programming, I start losing sleep.
I found the answer on the Python IRC channel on Freenode. The reason pow is slightly slower is that CPython performs an extra step: it has to load the name pow from the namespace. Evaluating 3 ** 89 needs no such lookup at all. This also means that the time difference should stay more or less constant as the input values grow.
The hypothesis was confirmed:
$ python3 -m timeit 'pow(3,9999)'
5000 loops, best of 5: 58.5 usec per loop
$ python3 -m timeit '3**9999'
5000 loops, best of 5: 57.3 usec per loop
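The same comparison can be reproduced from inside Python rather than from the shell. The following is only a minimal sketch: the helper name time_expr and the loop counts are my own choices, not part of the original measurements, and the absolute numbers will differ from machine to machine and between Python versions.

```python
import timeit

def time_expr(expr, number=100_000, repeat=5):
    """Return the best per-evaluation time, in nanoseconds, for a source string.

    Hypothetical helper: `number` and `repeat` are arbitrary illustrative values.
    """
    best_total = min(timeit.repeat(expr, number=number, repeat=repeat))
    return best_total / number * 1e9

# Time the two expressions from the article.
pow_ns = time_expr("pow(3, 89)")
op_ns = time_expr("3 ** 89")

print(f"pow(3, 89): {pow_ns:.0f} ns per evaluation")
print(f"3 ** 89:    {op_ns:.0f} ns per evaluation")
print(f"difference: {pow_ns - op_ns:.0f} ns")
```

Taking the minimum over several repeats mirrors the "best of 5" figure that the timeit command-line interface reports.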
While looking for the answer, I also learned about the dis module. It lets you disassemble Python bytecode and examine it. This was an extremely exciting discovery, since I have recently been studying reverse engineering of binary files, and the module came in very handy.
I disassembled the bytecode of the expressions above and got the following:
>>> import dis
>>> dis.dis('pow(3,89)')
#  1           0 LOAD_NAME                0 (pow)
#              2 LOAD_CONST               0 (3)
#              4 LOAD_CONST               1 (89)
#              6 CALL_FUNCTION            2
#              8 RETURN_VALUE
>>> dis.dis('3**64')
#  1           0 LOAD_CONST               0 (3433683820292512484657849089281)
#              2 RETURN_VALUE
>>> dis.dis('3**65')
#  1           0 LOAD_CONST               0 (3)
#              2 LOAD_CONST               1 (65)
#              4 BINARY_POWER
#              6 RETURN_VALUE
To learn how to read the output of dis.dis correctly, see this answer on Stack Overflow.
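For a quick programmatic look at the same instructions, dis.get_instructions can be used instead of reading the printed listing. This is only a sketch: the helper name opnames is my own, and the exact opcode names depend on the CPython version (newer releases, for example, no longer emit CALL_FUNCTION or BINARY_POWER under those names), so the check below looks only at LOAD_NAME.

```python
import dis

def opnames(source):
    """Return the opcode names CPython generates for a source string.

    Hypothetical helper for illustration; dis.get_instructions compiles
    the string itself.
    """
    return [ins.opname for ins in dis.get_instructions(source)]

pow_ops = opnames("pow(3, 89)")
power_ops = opnames("3 ** 89")

print(pow_ops)
print(power_ops)

# The call has to look up the name `pow`; the operator form does not.
print("LOAD_NAME in pow(3, 89):", "LOAD_NAME" in pow_ops)
print("LOAD_NAME in 3 ** 89:   ", "LOAD_NAME" in power_ops)
```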
OK, back to the code. The disassembly of pow makes sense. The bytecode loads pow from the namespace, pushes the constants 3 and 89 onto the stack, and finally calls the pow function. But why do the next two disassemblies differ from each other? After all, the only thing we changed is the exponent, from 64 to 65!
This question introduced me to another new concept, called "constant folding". It means that when we have a constant expression, Python computes its value at compile time, so that no time is spent on it when the program runs: Python simply uses the precomputed value. Take a look at this:
def one_plus_one():
    return 1+1

# --vs--

def one_plus_one():
    return 2
Python compiles the first function into the second and uses it when running the code. Not bad, huh?
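One way to check whether an expression was folded is to see whether any binary-operation instruction survives compilation. The sketch below assumes that such instructions have names starting with BINARY_ (BINARY_ADD and BINARY_POWER in older CPython, BINARY_OP in newer releases); the helper name is_folded is my own, and the result for any particular expression may vary between Python versions.

```python
import dis

def is_folded(source):
    """Return True if no BINARY_* instruction remains after compilation.

    Hypothetical helper: it assumes arithmetic shows up as an opcode whose
    name starts with "BINARY_", which holds for the CPython versions I know of.
    """
    return not any(
        ins.opname.startswith("BINARY_")
        for ins in dis.get_instructions(source)
    )

# Constant expressions, plus one with a variable that cannot be folded.
for expr in ("1 + 1", "3 ** 64", "3 ** 65", "x ** 65"):
    print(f"{expr!r:12} folded: {is_folded(expr)}")
```

An expression involving a name, such as x ** 65, can never be folded, because its value is not known at compile time.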
So why does constant folding work for 3 ** 64, but not for 3 ** 65? Well, I do not know. It is probably related to some limit on the size of the powers that are precomputed and kept in memory. I could be wrong. My next step is to dig through the Python source code in my spare time and try to understand what is going on. I am still looking for the answer, so if you have any ideas, share them in the comments.
I hope this post inspires you to look for answers to your own questions. You never know where the answers will lead you. In the end, you may learn something completely new, as happened to me. I hope the flame of curiosity still burns in you!
Have you noticed similar things? We are waiting for your comments!
Source: https://habr.com/ru/post/460143/