Experience in porting a project to Python 3

I want to share my experience of porting a project from Python 2.7 to Python 3.5: unusual pitfalls and other interesting nuances.

A little bit about the project:

Browser game: site + game logic (hierarchical finite state machines + a bunch of rules);
Age: 4 years (started in 2012);
64k LOC of logic + 57k LOC of tests;
2400 commits.

The port was done with the 2to3 utility, followed by fixing the tests. It is hard to say how long it took: the project is a hobby that I work on in my spare time.

2to3

2to3 converts Python 2 source code to Python 3 by applying a set of heuristics, called fixers (the list can be customized). In general, the utility causes no problems, but if you have a large and/or complex project, it is better to read through the list of fixers before running it.
After the sources have been processed, I highly recommend reviewing the changes, since the performance of the resulting code is not a priority for the converter.

It is also possible that some of your own names will clash with methods that were removed or changed. For example, 2to3 rewrote code that called the has_key method of my own class (has_key is a method of the Python 2 dictionary that was removed in Python 3).
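To illustrate the clash: 2to3's has_key fixer rewrites obj.has_key(key) into key in obj without knowing the type of obj. Below is a minimal sketch of one way to keep such a class working after conversion, assuming a hypothetical Inventory class (the name and its methods are made up for illustration and are not from the project):

```python
class Inventory(object):
    """Hypothetical container with its own has_key method."""

    def __init__(self):
        self._items = {}

    def add(self, key, amount=1):
        self._items[key] = self._items.get(key, 0) + amount

    def has_key(self, key):
        # Our own method; it only happens to share a name with dict.has_key.
        return key in self._items

    def __contains__(self, key):
        # After 2to3, calls like inventory.has_key(k) become "k in inventory",
        # so the class needs __contains__ for the converted code to work.
        return self.has_key(key)


inventory = Inventory()
inventory.add('sword')
```

The alternative is to exclude that fixer when running 2to3 and fix genuine dictionary calls by hand.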

Price of progress

So, here is what you can stumble over once you start moving toward Python 3. I will start with the most interesting one.

Banker's rounding

“WHAAAAAT!!?” O_O

That was roughly my reaction when, while fixing yet another test, I saw the following in the console:

    >>> round(1.5)
    2
    >>> round(2.5)
    2

Banker's rounding is rounding half values to the nearest even number. These are the new rounding rules, which replace the “school” rule of rounding half values up.

The point of banker's rounding is that, when working with large amounts of data and complex calculations, it reduces the accumulation of rounding error. The usual “school” rounding, by contrast, always moves half values toward the larger number and so introduces a systematic bias.

For most, this change is not critical, but it can lead to a completely unexpected change in the behavior of the program. In my case, for example, the location of the roads on the game map has changed.

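Note that this applies at any precision. With fractional digits, the result also depends on how the float is actually stored in binary, so the values can look surprising: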

    >>> round(1.65, 1)
    1.6
    >>> round(1.55, 1)
    1.6
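If the old “school” behavior is needed, one option is the standard decimal module. This is a minimal sketch, not the approach used in the project; round_half_up is a hypothetical helper name, and going through str(value) is an assumption that the short decimal representation of the float is the number you mean:

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, ndigits=0):
    """Round half values away from zero, like round() did in Python 2."""
    # Decimal(1).scaleb(-ndigits) is 10 ** -ndigits, e.g. Decimal('0.1').
    quantum = Decimal(1).scaleb(-ndigits)
    # str(value) gives the short decimal form, e.g. '1.65' rather than
    # the binary approximation 1.6499999999999999...
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


school_two_and_half = round_half_up(2.5)     # Decimal('3')
school_one_sixty_five = round_half_up(1.65, 1)  # Decimal('1.7')
builtin_two_and_half = round(2.5)            # 2 in Python 3
```

The helper returns a Decimal; wrap the result in float() or int() if the calling code expects a built-in number.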

Integer division became fractional

If you relied on integer arithmetic with the int type (when 1/4 == 0), then get ready for a lengthy code review, because now 1/4 == 0.25, and 2to3 cannot automatically replace / with // (the integer division operator) because it has no information about the types of variables.

Guido van Rossum explained the reason for this change in detail.
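A short sketch of the difference between the two operators in Python 3. The function split_evenly is a hypothetical example, not code from the project. In Python 2 code, the same semantics can be enabled per module with from __future__ import division.

```python
def split_evenly(total, parts):
    """Return (size of each part, leftover) using integer arithmetic only."""
    return total // parts, total % parts


true_quotient = 1 / 4       # true division: always returns a float
floor_quotient = 1 // 4     # floor division: stays an int for int operands
negative_floor = -1 // 4    # floors toward negative infinity, not toward zero
truncated = int(-1 / 4)     # truncates toward zero, so it differs from //
```

Note that replacing / with int(a / b) is not equivalent to // for negative operands, which is one more reason the replacement has to be done by hand.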

New semantics of map

The behavior of the map function has changed when iterating over several sequences.

In Python 2, if one sequence is shorter than the others, it is padded with None objects.
In Python 3, if one sequence is shorter than the others, the iteration stops.

Python 2:

    >>> map(lambda x, y: (x, y), [1, 2], [1])
    [(1, 1), (2, None)]

Python 3:

    >>> list(map(lambda x, y: (x, y), [1, 2], [1]))
    [(1, 1)]
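If some code depends on the old padding behavior, it can be reproduced in Python 3 with itertools.zip_longest. A minimal sketch; map_padded is a hypothetical helper name:

```python
from itertools import zip_longest


def map_padded(func, *iterables):
    """Apply func like Python 2's map: shorter inputs are padded with None."""
    return [func(*args) for args in zip_longest(*iterables)]


python3_result = list(map(lambda x, y: (x, y), [1, 2], [1]))
python2_like_result = map_padded(lambda x, y: (x, y), [1, 2], [1])
```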

Class attributes cannot be used inside generator expressions and list comprehensions in class bodies

The code below works in Python 2, but raises NameError: name 'x' is not defined in Python 3:

    class A(object):
        x = 5
        y = [x for i in range(1)]

This is due to changes in the scoping of generator expressions, list comprehensions and classes. There is a detailed analysis on Stack Overflow.

But the following code will work:

    def make_y(x):
        return [x for i in range(1)]

    class A(object):
        x = 5
        y = make_y(x)
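One more detail that may help when fixing such code: the outermost iterable of a comprehension is still evaluated in the class scope, so only names used elsewhere in the comprehension are affected. A small sketch with hypothetical classes Works and Fails:

```python
class Works(object):
    x = 3
    # range(x) is the outermost iterable, so it is evaluated in the class
    # body, where x is visible.
    y = [i for i in range(x)]


try:
    class Fails(object):
        x = 5
        # Here x is used inside the comprehension's own scope,
        # which cannot see class attributes.
        y = [x for i in range(1)]
except NameError as exc:
    error_message = str(exc)
else:
    error_message = None
```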

New and removed methods from standard classes

If you relied on the presence or absence of methods with specific names, unexpected problems may arise. For example, in one place where some black magic was going on, I distinguished strings from lists by the presence of the __iter__ method. In Python 2, strings do not have it; in Python 3 they do, and the code broke.
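A more robust check is an explicit type test instead of probing for a method. A minimal sketch for Python 3; is_string and as_list are hypothetical helper names:

```python
def is_string(value):
    """Explicit type check instead of looking for __iter__."""
    return isinstance(value, str)


def as_list(value):
    """Accept either a single string or an iterable of strings."""
    if is_string(value):
        return [value]
    return list(value)


# In Python 3 both strings and lists have __iter__, so this check
# can no longer tell them apart.
string_has_iter = hasattr('abc', '__iter__')
list_has_iter = hasattr(['abc'], '__iter__')
```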

Semantics of operations has become stricter

Some operations that worked by default in Python 2 stopped working in Python 3. In particular, comparing objects that do not define explicit comparison methods is now prohibited.

Expression object() < object() :

In Python 2, it returns True or False (depending on the identity of the objects).
In Python 3, it raises TypeError: unorderable types: object() < object().
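If objects need to be ordered, the ordering now has to be stated explicitly. Below is a sketch of two common options, using a hypothetical Hero class (not from the project): defining comparison methods with functools.total_ordering, or passing a key to sorted.

```python
import functools


@functools.total_ordering
class Hero(object):
    """Hypothetical class ordered by level."""

    def __init__(self, name, level):
        self.name = name
        self.level = level

    def __eq__(self, other):
        if not isinstance(other, Hero):
            return NotImplemented
        return self.level == other.level

    def __lt__(self, other):
        if not isinstance(other, Hero):
            return NotImplemented
        return self.level < other.level


heroes = [Hero('veteran', 3), Hero('novice', 1)]

# Option 1: rely on the comparison methods defined above.
names_by_comparison = [hero.name for hero in sorted(heroes)]

# Option 2: no comparison methods needed, the key states the order.
names_by_key = [hero.name for hero in sorted(heroes, key=lambda h: h.level)]

# Plain objects are no longer orderable in Python 3.
try:
    object() < object()
except TypeError:
    plain_objects_orderable = False
else:
    plain_objects_orderable = True
```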

Standard Classes Implementation Changes

I think there are many such changes, but the one I ran into is a change in dictionary behavior. The following code produces different results in Python 2 and Python 3:

    D = {'a': 1, 'b': 2, 'c': 3}
    print(list(D.values()))

In Python 2, it always prints [1, 3, 2] (or at least the same sequence for a specific Python build on a particular machine).

In Python 3, the order of the elements differs from run to run, because hash randomization for strings is enabled by default. Accordingly, the results of any code relying on this “feature” will differ too.

Of course, I did not specifically rely on a fixed sequence of elements in the dictionary, but as it turned out, I did it implicitly.
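Where a stable order matters (tests, map generation and the like), one simple fix is to impose the order explicitly rather than take whatever the dictionary yields. A minimal sketch, assuming that sorting by key is an acceptable order for the code in question:

```python
D = {'a': 1, 'b': 2, 'c': 3}

# Iterate over the keys in sorted order so that the result does not
# depend on the dictionary's internal layout.
values_in_key_order = [D[key] for key in sorted(D)]
items_in_key_order = sorted(D.items())
```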

Memory and CPU Usage

Unfortunately, due to the combination of porting, moving to a new server and refactoring, it was not possible to make specific measurements.

Conclusions

My main conclusion is that Python has become more idiomatic:

undefined behavior has become truly undefined;
the recommended programming style has become even more recommended;
bad practices have become harder to follow;
good practices have become easier to follow.

It has become easier to detect semantic errors in the code that previously could have stayed hidden for years.

The second conclusion: if your code relies heavily on mathematical operations, it is better to start implementing them the correct, Python 3 way right away, even if you plan to put off the move until 2020.

Write the code in Python 2 using __future__ and there will be no problems with the move.

Source: https://habr.com/ru/post/318384/
