
Measuring and defining code quality is an eternal topic in the programming world. I think no expert with experience of large, long-lived projects doubts the need to keep the code in good shape. But there is not always enough time to work out which characteristics matter in a particular project. This article will not describe how to write and format code or whether spaces are needed around parentheses. Today I will try to highlight the most important aspects to pay attention to and what they can affect; what the acceptable limits are and how to enforce them is up to you.
First of all, we need to decide which metrics we will use to judge the quality of the code, and why we need them. In programming we are lucky: in most cases, to define a metric it is enough to pick the characteristic that matters to us:
- compliance with the rules;
- code complexity;
- duplicates;
- commenting;
- test coverage.
Now let us consider each of them.
Compliance with the rules
This covers situations where the code compiles and, in most cases, does its job, and does it correctly. What makes this characteristic interesting is largely that the company must first have rules for writing code. You can take the easy route and adopt someone else's work (Java Code Conventions, GCC Coding Conventions, Zend Coding Standard), or you can put in the effort and supplement them with your own rules that best suit the specifics of your company.
But why do we need rules for writing code if the code does its job? To answer that, let us single out several types of rules:
- syntactic rules are among the most useless rules (but only at first glance), since they do not affect the execution of the program in any way. They cover the naming style of variables (camelCase, underscores), constants (uppercase) and methods, the placement of curly braces, and whether braces are needed when a block contains only one line of code. The list goes on. When a programmer writes code, he reads it easily because he knows his own style. But once he is given code that uses Hungarian notation and braces on a new line, he will have to spend extra attention on getting used to the new style. The situation is especially amusing when several completely different styles are used in one project or even one module.
- code maintainability rules signal that the code is too complex and will be difficult to maintain. For example, the complexity index (more about it below) of a method or class is too high, a method has too many lines of code, or the code contains duplicates or "magic numbers". I think the essence is clear: they all point us to trouble spots that will be difficult to maintain. But we must not forget that we decide for ourselves which complexity index is too high and which is acceptable.
- code cleanup and optimization rules are the simplest in the sense that hardly anyone will argue that expressions which are not used anywhere are really necessary. This includes redundant imports, and variables and methods that are no longer used but for some reason have been left behind as a legacy.
The metric here is obvious: compliance with the rules should tend toward 100%, that is, the fewer rule violations, the better.
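As an illustration of a cleanup rule, here is a minimal sketch that looks for imports that are never referenced. It assumes the code under inspection is Python, and the function name `unused_imports` is my own invention for this example; real linters handle many more cases (re-exports, `__all__`, names used only in strings).

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in the source."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import os.path" binds the name "os"
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports cannot be checked this way
                    imported.add(alias.asname or alias.name)
    # Every plain identifier that appears anywhere in the module
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


if __name__ == "__main__":
    sample = "import os\nimport sys\nprint(sys.argv)\n"
    print(unused_imports(sample))
```

Each name reported here would count as one rule violation in the compliance metric.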
Cyclomatic code complexity
This characteristic directly determines how hard the code is to maintain. Choosing a metric here is more difficult than for the previous characteristic. Put simply, it depends on the number of branching statements and loops, including nested ones. Those interested in a more detailed description can read about it on the wiki. The lower the index, the better, and the easier it will be to change the structure of the code in the future. Complexity should be measured for a method, a class and a file. The value of this metric should be capped at some limit. For example, the cyclomatic complexity of a method should not exceed 10; otherwise the method needs to be simplified or split up.
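To make the index more tangible, here is a rough sketch of how it can be approximated for Python code: start from 1 and add one for every decision point. The set of node types counted below and the names `cyclomatic_complexity` and `complexity_by_function` are assumptions of this example; dedicated tools count a few more constructs and may give slightly different numbers.

```python
import ast

# Constructs treated as decision points in this simplified count
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate complexity: 1 plus the number of decision points."""
    complexity = 1
    for node in ast.walk(func):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths
            complexity += len(node.values) - 1
    return complexity


def complexity_by_function(source: str) -> dict[str, int]:
    """Map each function name in the source to its approximate complexity."""
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


if __name__ == "__main__":
    sample = (
        "def classify(n):\n"
        "    if n < 0 and n != -1:\n"
        "        return 'negative'\n"
        "    for i in range(n):\n"
        "        if i % 2 == 0:\n"
        "            continue\n"
        "    return 'done'\n"
    )
    limit = 10  # the example limit mentioned above
    for name, value in complexity_by_function(sample).items():
        print(name, value, "OK" if value <= limit else "too complex")
```

Note that in this sketch a nested function also contributes to the complexity of the function that contains it.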
Duplicates
An important characteristic that reflects how easily it will be possible to make changes to the code in the future (or the present). The metric can be expressed as a percentage: the ratio of duplicated lines to all lines of code. The fewer duplicates, the easier it will be to live with this code.
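The ratio described above can be sketched as follows. This is a deliberately naive version: it treats a run of `window` consecutive non-blank lines as a duplicate if the same run appears more than once, comparing lines after stripping whitespace. The function name `duplicate_ratio` and the default window of 3 lines are assumptions of this example; real duplicate detectors usually compare tokens rather than raw lines.

```python
from collections import defaultdict


def duplicate_ratio(source: str, window: int = 3) -> float:
    """Share of non-blank lines that belong to a repeated run of `window` lines."""
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if len(lines) < window:
        return 0.0
    # Remember where every run of `window` consecutive lines starts
    positions = defaultdict(list)
    for start in range(len(lines) - window + 1):
        positions[tuple(lines[start:start + window])].append(start)
    duplicated = set()
    for starts in positions.values():
        if len(starts) > 1:  # the same run occurs in several places
            for start in starts:
                duplicated.update(range(start, start + window))
    return len(duplicated) / len(lines)


if __name__ == "__main__":
    sample = "a = 1\nb = 2\nc = a + b\nprint(c)\na = 1\nb = 2\nc = a + b\nprint(a)\n"
    print(f"{duplicate_ratio(sample):.0%} of lines are duplicated")
```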
Commenting
One of the most flame-war-prone and painful topics among programmers: "To comment or not to comment?" Everyone knows the dialogue from Steve McConnell's book; it has already been published on Habr. From it we can conclude that this characteristic must be approached individually, based on the specifics of the company and the products it works with: commenting is not so necessary for small projects, but for large projects well-developed rules will greatly facilitate maintenance. For commenting, we can distinguish two important metrics:
- ratio of comments to all code: this metric shows how detailed the comments are and how useful they may be. Of course, it cannot tell you whether the code contains comments like "loop" before a loop; that has to be caught during code review.
- commenting of public methods: the ratio of commented public methods to their total number. Since public methods are used outside the class or package, it is best to comment on what the method should do and what it can affect. The number of public methods without comments should tend to zero.
As I have already written, the question of commenting is best decided based on the needs of the company, but life is easier with commented code.
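Both commenting metrics are easy to approximate. The sketch below assumes Python code, counts only `#` comments for the first metric, and treats a docstring as the "comment" of a public method for the second; names not starting with an underscore are considered public. The function names are made up for this example.

```python
import ast
import io
import tokenize


def comment_ratio(source: str) -> float:
    """Ratio of lines carrying a # comment to all non-blank lines."""
    comment_lines = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comment_lines.add(tok.start[0])
    total = sum(1 for line in source.splitlines() if line.strip())
    return len(comment_lines) / total if total else 0.0


def documented_public_ratio(source: str) -> float:
    """Ratio of public functions and methods that have a docstring."""
    tree = ast.parse(source)
    public = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and not node.name.startswith("_")
    ]
    if not public:
        return 1.0  # nothing to document
    documented = sum(1 for node in public if ast.get_docstring(node))
    return documented / len(public)


if __name__ == "__main__":
    sample = (
        "# geometry helpers\n"
        "def area(w, h):\n"
        '    """Return the area of a rectangle."""\n'
        "    return w * h  # simple product\n"
        "\n"
        "def perimeter(w, h):\n"
        "    return 2 * (w + h)\n"
    )
    print(comment_ratio(sample), documented_public_ratio(sample))
```

Neither number says anything about whether the comments are useful; that still has to be judged in review.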
Test coverage
There is no need to describe the need for and role of automated tests in a project here, because that is a topic for a separate article. But this is a very important characteristic of code quality. The coverage level is calculated as the ratio of the number of covered code elements to the number of all existing ones. Depending on what is taken as a code element, the following types of coverage are usually distinguished:
- file coverage: here you need to define what it means for a file to be covered by tests. Often a file is considered covered if a test entered it and executed at least one line of its code. That is why this metric is used extremely rarely, but it still has a right to exist.
- class coverage: the same as file coverage, only for classes :). Also rarely used.
- method coverage: the same way of calculating the metric. Method coverage may be used more widely, though: if your project has a rule to cover each method with at least one test, this metric lets you quickly find code that does not comply with the rule.
- line coverage is one of the most commonly used coverage metrics. The same way of calculating, only a line is taken as the element.
- branch coverage: the same, with a branch taken as the element. A good value for this metric takes the most effort to achieve. This metric shows how conscientiously the programmer approached test coverage.
- total coverage is a coverage metric in which not one kind of element but several are taken into account. The most common choice is the total coverage of lines and branches.
The higher the test coverage of the code, the lower the risk of breaking part of the system without noticing.
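The arithmetic behind these metrics is the same everywhere: covered elements divided by all elements. A small sketch follows; the `CoverageCounts` class and the way `total_coverage` pools lines and branches into one ratio are assumptions of this example, and a particular coverage tool may combine them differently.

```python
from dataclasses import dataclass


@dataclass
class CoverageCounts:
    """Covered and total number of elements of one kind (lines, branches, ...)."""
    covered: int
    total: int

    @property
    def ratio(self) -> float:
        # With no elements at all there is nothing left uncovered
        return self.covered / self.total if self.total else 1.0


def total_coverage(*parts: CoverageCounts) -> float:
    """Pool several element kinds into one ratio (one possible definition)."""
    covered = sum(part.covered for part in parts)
    total = sum(part.total for part in parts)
    return covered / total if total else 1.0


if __name__ == "__main__":
    lines = CoverageCounts(covered=80, total=100)
    branches = CoverageCounts(covered=10, total=20)
    print(f"lines: {lines.ratio:.0%}, branches: {branches.ratio:.0%}, "
          f"total: {total_coverage(lines, branches):.0%}")
```

In this made-up example the line coverage looks decent while the branch coverage is only half, which is exactly the gap the total metric is meant to expose.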
Instead of a conclusion
The list presented here is not complete, but it may be quite enough to keep the code in good shape. All these characteristics are part of static code analysis, and it is good practice to automate this process. I hope someone finds the article useful.
