Delete your dead code

The post “Delete Code” by Ned Batchelder recently appeared on HN, although it was originally written in 2002. Here I want to echo several of Ned's thoughts and take a more decisive position than he did: delete code as soon as you notice it is no longer needed, no questions asked. I will also offer some tips from the trenches on how to identify candidates for dead code.

What is dead may never die!

This is not just a “very smart” and timely pop-culture reference. Dead code, that is, code that is never executed in your program, is a real obstacle to maintaining your code base. How many times have you been unable to add what seemed like a simple feature or improvement, just because you were stumped by the complexity of the code that had to work alongside it? How much nicer would your life be if adding a new feature or fixing a bug were as easy as you thought it would be when you planned the work?

Every time you want to make a change, you need to consider how it interacts with all the existing features, hacks, known bugs and limitations of the surrounding code. This is easier with less code around the thing you want to add, because there are fewer things to consider and fewer ways for something to go wrong. Dead code is especially harmful because it looks as though you must account for its interaction with your change, but since it is dead, it is just a red herring. It cannot benefit you, since it is never executed.
The fact that dead code may never die is a real threat to your ability to work with the code base. In the worst case, if code that is never called is never deleted, your application will grow forever. Before you know it, the few thousand lines that actually do something may be surrounded by orders of magnitude more code that does nothing useful.

It has to go

Ned (Batchelder, not Stark) was a bit more delicate and tactful than I am being in this article:

Let's say you have an excellent class that has a bunch of methods. One fine day you discover that some particular method is no longer called. Will you leave it or delete it?

In this case there is no simple answer to this question, because it depends on the class and on the method. The answer depends on whether you assume that this method may be needed again in the future.

I will say this: scorch the earth and leave no dead code alive. The best code is the code you do not have.

If you're not as daring as I am, remember that version control systems will cover you in case you ever need this code again.

However, I have never felt the need to bring back something I had deleted, at least not in the sense of restoring, line for line, a piece of code I had previously removed.

Of course, I am not talking about things like reverting bad commits: we are all human, and I make as many mistakes as anyone. What I mean is that I have never deleted code that had shipped to production and then come back weeks or months later thinking, “Well, that code I wrote a year or more ago was pretty good, so I'll just bring it back.” A code base lives and evolves over time, so old code probably does not fit the new ideas, techniques, frameworks and styles in use today. I might go back to an old version for a refresher, especially on some subtle point, but I have never restored code wholesale.

So do yourself and your team a favor and delete the dead code as soon as you notice it.

How did we get here?

Ned's post explains in detail how and why dead code appears: perhaps the person making a change does not think the code should disappear forever, and so comments it out or puts it behind conditional compilation. Perhaps the person making the change does not know enough to realize that the code is actually dead (more on this later).

I'll add another hypothesis to the list: we might all just be lazy. After all, it is easier not to do something (that is, to leave the code as it is) than to do something (delete it).

Laziness is, after all, one of the three great virtues of a programmer. But the laziness Larry Wall talks about is of a different kind: “the quality that makes you go to great effort to reduce overall energy expenditure.” From this point of view, deleting dead code is Laziness with a capital L: doing something that is easy now to save yourself from having to do something hard later. We should all try to cultivate this kind of Laziness. I like to think of it as “disciplined laziness,” a daily habit.

How do we get out of here?

I spend most of my time programming in Python, for which, unfortunately, IDEs usually cannot analyze a complete code base and automatically find code that is never called. But by combining discipline with some run-time tooling, we can attack the problem from two sides.

In simple cases, a good feel for the code can help you identify and remove dead code while you are making changes. Imagine that you are working on a particular function and notice that one branch of an if/else can never be executed. I call this “dead code in the small” ^* ; it is fairly easy to spot and remove, though it does take a little more effort than you might otherwise have spent.
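As a hypothetical illustration of dead code in the small (the function names and the stock example are invented for this sketch, not taken from any real code base), the final branch of the first function can never run for integer input, because the two earlier conditions already cover every case:

```python
def describe_stock(count: int) -> str:
    """Hypothetical example: the last branch is dead code in the small."""
    if count > 0:
        return "in stock"
    elif count <= 0:
        return "out of stock"
    else:
        # Unreachable for ints: one of the two conditions above always holds.
        return "unknown"


def describe_stock_clean(count: int) -> str:
    """Same behaviour for ints, with the dead branch removed."""
    if count > 0:
        return "in stock"
    return "out of stock"
```

The second version behaves identically for every integer, and the next reader no longer has to wonder when "unknown" might be returned.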

Until you develop the habit of noticing this during your normal work, you can add one more step to your pre-commit routine: check for dead code near your changes. A good moment is just before you send the code to your colleagues (you do code review, right?), so that they do not have to repeat the exercise while reading your changes.

Another kind of dead code appears when you remove the last use of a class or function without realizing that it was the last one. This is “dead code in the large” ^* , and it is harder to detect in the course of normal programming unless you are lucky enough to have an eidetic memory or know the code base like the back of your hand.

This is where run-time tooling can help. At Magnetic, we use Ned's (yes, the same Ned) coverage.py package to help us make decisions about dead code. Coverage is normally used during testing to make sure your test cases properly exercise the code under test, but we also use it while our code runs “as usual” to understand what is used and what is not:

import coverage

cov = coverage.Coverage(
    data_file="/path/to/my_program.coverage",
    auto_data=True,
    cover_pylib=False,
    branch=True,
    source=["/path/to/my/program"],
)
cov.start()
# ...
cov.stop()
cov.save()

Here a Coverage object is created with several options that make the report more useful. First, we tell it where to save its data (we will use that later to produce a convenient HTML report of what is and is not used), and we ask it to open and append to that file automatically with auto_data=True. Next, we ask it not to measure the standard library or installed packages: that is not our code, so we can expect that much of it is unused by us. It is not dead code that we have to maintain, so we can safely ignore it. We ask it to compute branch coverage (whether both the true and the false outcome of each if statement are hit). And finally, we tell it where our sources live, so that it can connect what it knows about what was and was not used back to the source code when producing a report.

After starting our program, we can create an HTML report:

 $ COVERAGE_FILE=/path/to/my_program.coverage coverage html -d /path/to/output

Which looks something like this:

(A full HTML report example can be found in the coverage.py documentation.)

The lines highlighted in red were not executed during the program run. These lines (and possibly whole methods) are candidates for deletion as dead code.

I will leave you with three warnings about using this approach to find and delete dead code:

Be careful when interpreting coverage output: the fact that a line or function was not executed during one run of the program does not mean it is dead or unreachable in general. You still have to inspect the code to determine whether it really is dead in your application.
Measuring coverage means your program has to do more work, so it will run slower in this mode. I would not recommend running it in production, but a staging environment or targeted scenarios should be fine. As always, if performance matters, measure the impact of coverage before you rely on it.
Finally, do not trust code coverage reports from test runs. Some code may be dead and still be run by the tests, and some code may be alive without being run by any test!
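The first warning can be made concrete with a small standard-library sketch. It uses sys.settrace directly rather than coverage.py, purely to keep the example self-contained, and shipping_cost and executed_offsets are hypothetical names invented for the illustration. It records which lines of a function run during a single call:

```python
import sys


def shipping_cost(weight_kg, express):
    # Hypothetical function: both branches are live code.
    if express:
        return 10.0 + 2.0 * weight_kg
    return 5.0 + 1.0 * weight_kg


def executed_offsets(func, *args):
    """Return line offsets (relative to the def line) executed in one call."""
    code = func.__code__
    seen = set()

    def tracer(frame, event, arg):
        # Only follow frames that belong to the function under observation.
        if frame.f_code is not code:
            return None
        if event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)
    return seen


standard_run = executed_offsets(shipping_cost, 2.0, False)
express_run = executed_offsets(shipping_cost, 2.0, True)
# Lines that a report based only on the standard run would show as unexecuted.
never_seen_in_standard_run = express_run - standard_run
```

A report built from the standard run alone would mark the express branch as never executed, even though it is clearly reachable. Data from a single run, or from a narrow set of scenarios, only nominates candidates; reading the code decides whether they are really dead.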

Parting thoughts

Dear reader, I owe you an apology. I omitted an important part of Ned's post when I quoted him earlier. He says:

In this case there is no simple answer to this question, because it depends on the class and on the method. [...] A coarse answer could be: if the class is part of the framework, then leave it; if it is part of the application, then remove it.

If you are writing a library or framework rather than an application, the question of dead code becomes harder in one way and easier in another. In essence, you can never remove a part of the public API (except in backward-incompatible major versions). In effect, your entire public API is live code, even if you do not use it yourself. But behind the public interface, dead code can still appear, and it should be removed.

Delete your dead code!

^* - references to the notions of “programming in the small” and “programming in the large” (translator's note)

Source: https://habr.com/ru/post/283560/
