If you consider consistency an important part of quality code, this article is for you.
Offers:
First, let's define what we call "consistency".
The more a program's source code looks as if it were written by one person, the more consistent it is.
It is worth noting that even a single person's preferences often change over time; still, the consistency problem is most acute in large projects with many developers.
Sometimes "uniformity" or "homogeneity" is used instead of the word "consistency". In this article, I will occasionally use such contextual synonyms to avoid repetition.
There are different levels of consistency; for example, we can distinguish three of the most obvious:
The further down the list, the harder it is to maintain consistency. At the same time, a lack of consistency within a single source file looks the most jarring.
You can also go down from the file level to the level of a function or a single statement, but for our purposes that is too much detail. Why will become clear toward the end of the article.
Go does not have many operations that can be spelled in more than one way (a purely syntactic difference), but there is still room for disagreement. Some developers prefer option A, while others prefer B.
Both options are valid and have their supporters. Using either form is permissible and not an error, but using more than one form can harm the consistency of the code.
What do you think, which of these two ways to create a slice of length 100 is used by most Go programmers?
// (A)
new([100]T)[:]
// (B)
(&[100]T{})[:]
Neither. I have never seen either of them in real code.
What is actually used in this case is make([]T, 100).
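As a quick check, the sketch below confirms that all three expressions produce a slice of length 100. The element type int and the helper name lengths are chosen only for illustration.

```go
package main

import "fmt"

// lengths builds a 100-element slice in each of the three ways
// discussed above and returns the resulting lengths.
// The element type int and the function name are illustrative only.
func lengths() (a, b, c int) {
	sa := new([100]int)[:] // (A) slice a newly allocated array
	sb := (&[100]int{})[:] // (B) slice the address of an array literal
	sc := make([]int, 100) // the form normally seen in real code
	return len(sa), len(sb), len(sc)
}

func main() {
	fmt.Println(lengths())
}
```

All three are valid Go; make simply states the intent most directly.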
Importing a single package can be done in two ways:
// (A) without parentheses
import "github.com/go-lintpack/lintpack"
// (B) with parentheses
import (
	"github.com/go-lintpack/lintpack"
)
Neither gofmt nor goimports converts one form into the other, so your project most likely contains both.
Since Go has both the built-in new function and other ways to get a pointer to a new object, you will encounter both new(T) and &T{}.
// (A) new
new(T)
new([]T)
// (B) address of a composite literal
&T{}
&[]T{}
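A small sketch of how the two spellings compare at run time. For a struct type both yield a pointer to the zero value. For a slice type they are close but not identical: new([]T) points to a nil slice, while &[]T{} points to an empty non-nil one. The point type and the helper names are hypothetical and exist only for this example.

```go
package main

import "fmt"

// point is a hypothetical struct used only for this illustration.
type point struct{ x, y int }

// sameZeroValue reports whether new(point) and &point{} point to equal values.
func sameZeroValue() bool {
	a := new(point)
	b := &point{}
	return *a == *b
}

// sliceNilness reports whether the slice behind each pointer is nil:
// first for new([]int), then for &[]int{}.
func sliceNilness() (newIsNil, literalIsNil bool) {
	a := new([]int)
	b := &[]int{}
	return *a == nil, *b == nil
}

func main() {
	fmt.Println(sameZeroValue()) // true
	fmt.Println(sliceNilness())  // true false
}
```

In most code the nil versus empty distinction does not matter (len, range and append treat both the same), which is why the choice is usually a matter of style.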
There are at least two popular ways to create an empty slice (not to be confused with a nil slice):
// (A) make
make([]T, 0)
// (B) literal
[]T{}
Treating empty slices and empty maps as separate cases may seem illogical, but not everyone who prefers []T{} also uses map[K]V{} instead of make(map[K]V). So the distinction is at least not redundant.
// (A) make
make(map[K]V)
// (B) literal
map[K]V{}
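To illustrate why both spellings are interchangeable, the following sketch shows that either form yields a non-nil map that accepts writes, unlike a nil map. The key and value types and the helper name usable are assumptions made for the example.

```go
package main

import "fmt"

// usable inserts a key into m and returns the resulting size.
// Calling it with a nil map would panic, so it is only called
// with maps created by form (A) or (B).
func usable(m map[string]int) int {
	m["k"] = 1
	return len(m)
}

func main() {
	var nilMap map[string]int // nil map: reads work, writes panic
	fmt.Println(nilMap == nil, len(nilMap))

	fmt.Println(usable(make(map[string]int))) // (A) make
	fmt.Println(usable(map[string]int{}))     // (B) literal
}
```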
// (A) a-f, lower case
0xff
// (B) A-F, upper case
0xFF
Writing something like 0xFf, in mixed case, is no longer a matter of consistency. A static analyzer (linter) should catch it. Which one? Try gocritic, for example.
In mathematics (and in some programming languages, but not in Go) a range can be written as low < x < high. Go source code cannot express the constraint this way, and there are at least two popular ways to write the range check:
// (A) x on the left of both comparisons
x > low && x < high
// (B) x in the middle, as in the mathematical notation
low < x && x < high
Did you know that Go has a binary operator &^? It is called and-not (bit clear), and it performs the same operation as & applied to the result of ^ on the right (second) operand.
// (A) the single &^ operator
x &^ y
// (B) & combined with unary ^
x & ^y
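The equivalence of the two forms can be checked directly. In the sketch below the function names clearA and clearB and the sample values are made up for the example.

```go
package main

import "fmt"

// clearA clears the bits of mask in x using the and-not operator.
func clearA(x, mask uint8) uint8 { return x &^ mask }

// clearB does the same with & and unary ^ as two separate operators.
func clearB(x, mask uint8) uint8 { return x & ^mask }

func main() {
	x, mask := uint8(0xF6), uint8(0x0F)
	// Both calls clear the low four bits of x.
	fmt.Printf("%#x %#x\n", clearA(x, mask), clearB(x, mask))
}
```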
I ran a survey to check whether preferences differ at all. After all, if the choice of &^ were unanimous, this would belong in a linter check rather than in a consistency choice. To my surprise, both forms had supporters.
There are many ways to write a floating-point literal, but one of the most common details that breaks consistency even within a single function is how the integer and fractional parts are written (abbreviated or in full).
// (A) explicit integer and fractional parts
0.0
1.0
// (B) implicit (abbreviated) form
.0
1.
Unfortunately, there are no established conventions for label naming. All that remains is to choose one of the possible ways and stick to it.
// (A) all upper case
LABEL_NAME:
goto PLEASE
// (B) upper camel case
LabelName:
goto TryTo
// (C) lower camel case
labelName:
goto beConsistent
snake_case is also possible, but I have not seen such labels anywhere except in Go assembly. This option is most likely best avoided.
// (A) type to the left of "="
var x int32 = 10
const y float32 = 1.6
// (B) type to the right of "="
var x = int32(10)
const y = float32(1.6)
Simple calls that fit on one line raise no questions about parentheses. When, for whatever reason, a function or method call spreads over several lines, several degrees of freedom appear; for example, you need to decide where to put the parenthesis that closes the argument list.
// (A) closing parenthesis on the same line as the last argument
multiLineCall(
	a,
	b,
	c)
// (B) closing parenthesis on its own line
multiLineCall(
	a,
	b,
	c,
)
// (A) "not equal to 0"
len(xs) != 0
// (B) "greater than 0"
len(xs) > 0
// (C) "at least 1"
len(xs) >= 1
For strings, s != "" or s == "" is usually used (source).
There are two reasonable choices: put default as the first or the last label. Anything else, such as "somewhere in the middle", is a job for a linter. The defaultCaseOrder check from gocritic will point you toward a more idiomatic variant, and go-consistent will suggest whichever of the two options makes the code more uniform.
// (A) default first
switch {
default:
	return "?"
case x > 10:
	return "more than 10"
}
// (B) default last
switch {
case x > 10:
	return "more than 10"
default:
	return "?"
}
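The position of default does not change behavior, because Go evaluates the case clauses first and falls back to default only when none matches. A minimal sketch, with function names invented for the example:

```go
package main

import "fmt"

// defaultFirst places the default clause before the case clauses.
func defaultFirst(x int) string {
	switch {
	default:
		return "?"
	case x > 10:
		return "more than 10"
	}
}

// defaultLast places the default clause after the case clauses.
func defaultLast(x int) string {
	switch {
	case x > 10:
		return "more than 10"
	default:
		return "?"
	}
}

func main() {
	for _, x := range []int{5, 42} {
		fmt.Println(x, defaultFirst(x), defaultLast(x))
	}
}
```

Since both functions always agree, the choice really is purely stylistic.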
Above is a list of equivalent operations.
How do you decide which one to use? The simplest answer: the one used most frequently in the part of the project under consideration (in the special case, in the whole project).
go-consistent analyzes the specified files and packages, counts how often each alternative is used, and suggests replacing the less frequent forms with the one that is idiomatic within the analyzed part of the project, that is, the one with the highest frequency.
At the moment every occurrence has a weight of one. As a result, a single file sometimes dictates the style of the whole package simply because it uses the operation more often. This is especially noticeable for rare operations, such as creating an empty map.
Whether this is the optimal strategy is not yet clear. This part of the algorithm is easy to change, or users could be allowed to choose one of several strategies.
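The frequency-based choice described above can be sketched in a few lines. This is not the go-consistent implementation, only an illustration of the idea under stated assumptions: the suggest function, its input format (one string per observed occurrence) and the tie-breaking rule (the alternative seen first wins) are all invented for the example.

```go
package main

import "fmt"

// suggest returns the alternative that occurs most often in usages.
// Every occurrence has weight one. On a tie, the alternative that
// appears first in usages wins (an arbitrary choice for this sketch).
// An empty input yields an empty string.
func suggest(usages []string) string {
	counts := make(map[string]int)
	for _, u := range usages {
		counts[u]++
	}
	best := ""
	for _, u := range usages {
		if counts[u] > counts[best] {
			best = u
		}
	}
	return best
}

func main() {
	// Hypothetical occurrences collected from a package.
	usages := []string{"make(map[K]V)", "map[K]V{}", "map[K]V{}"}
	fmt.Println(suggest(usages))
}
```

With equal weights, a single file containing many occurrences can outvote the rest of the package, which is the effect described above.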
If $(go env GOPATH)/bin is in the system PATH, then the following command will install go-consistent:
go get -v github.com/Quasilyte/go-consistent
go-consistent --help
Returning to the levels of consistency, here is how to check each of them:
To check consistency within a single file, run go-consistent on this file.
go-consistent is designed so that it can produce an answer even for huge repositories, where loading all packages into memory at once is difficult (at least on a personal machine without a huge amount of RAM).
Another important feature is zero configuration. Running go-consistent without any flags or configuration files works in 99% of cases.
Warning: the first run on a project may produce a large number of warnings. This does not mean the code is badly written; consistency at such a micro level is hard to control, and it is not worth the effort if the control is done purely by hand.
Inconsistent naming of function parameters or local variables can also reduce the consistency of the code.
For most Go programmers it is obvious that erro is a worse name for an error than err. But what about s versus str?
Checking the consistency of variable names cannot be solved with the go-consistent approach. It is hard to do without a manifest of local conventions.
go-namecheck defines the format of this manifest and makes it possible to validate code against it, making it easier to follow the naming standards defined in the project.
For example, you can specify that function parameters of type string should be named s instead of str.
This rule is expressed in the following way:
{"string": {"param": {"str": "s"}}}
string is a regular expression that matches the type of interest.
param is the scope in which the replacement rules apply. There may be several.
"str": "s" specifies a replacement from str to s. There may be several.
Instead of a 1-to-1 replacement, you can use a regular expression that matches more than one identifier. Here, for example, is a rule that requires replacing the re prefix of variables of type *regexp.Regexp with the RE suffix. In other words, instead of reFile the rule requires fileRE.
{
  "regexp\\.Regexp": {
    "local+global": {"^re[A-Z]\\w*$": "use RE suffix instead of re prefix"}
  }
}
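To see which identifiers such a pattern would flag, here is a small sketch using the standard regexp package. It reads the character class as [A-Z] (an upper-case letter after the re prefix) and uses the pattern after JSON unescaping, so \\w becomes \w. The sample identifiers and the helper name flagged are made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// rePrefix matches identifiers that start with "re" followed by an
// upper-case letter, such as reFile.
var rePrefix = regexp.MustCompile(`^re[A-Z]\w*$`)

// flagged reports whether name matches the pattern.
func flagged(name string) bool {
	return rePrefix.MatchString(name)
}

func main() {
	for _, name := range []string{"reFile", "fileRE", "result", "re"} {
		fmt.Printf("%-7s %v\n", name, flagged(name))
	}
}
```

Only reFile is flagged; result is left alone because the letter after re is lower case.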
Types are matched ignoring pointers. Any level of indirection is stripped, so there is no need to define separate rules for a type and for pointers to it.
A file that describes both rules would look like this:
{
  "string": {
    "param": {
      "str": "s",
      "strval": "s"
    }
  },
  "regexp\\.Regexp": {
    "local+global": {"^re[A-Z]\\w*$": "use RE suffix instead of re prefix"}
  }
}
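The manifest is ordinary JSON with three levels of nesting (type pattern, scope, identifier pattern), so it can be read with the standard encoding/json package. The sketch below is not how go-namecheck itself loads the file; the manifest type and the load helper are hypothetical. Note that encoding/json rejects trailing commas, so the file has to be strictly valid JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// manifest mirrors the JSON layout: type pattern -> scope ->
// identifier pattern -> replacement or message.
// This Go type is an assumption made for the example.
type manifest map[string]map[string]map[string]string

// rules is the combined manifest, written without trailing commas.
const rules = `{
  "string": {"param": {"str": "s", "strval": "s"}},
  "regexp\\.Regexp": {
    "local+global": {"^re[A-Z]\\w*$": "use RE suffix instead of re prefix"}
  }
}`

// load decodes a manifest from its JSON text.
func load(data string) (manifest, error) {
	var m manifest
	err := json.Unmarshal([]byte(data), &m)
	return m, err
}

func main() {
	m, err := load(rules)
	if err != nil {
		fmt.Println("invalid manifest:", err)
		return
	}
	// Look up the replacement for a string parameter named str.
	fmt.Println(m["string"]["param"]["str"])
}
```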
The idea is that a project starts with an empty file. Then, at some point during code review, someone asks to rename a variable or a struct field. A natural response is to ask that this previously informal requirement be recorded as a checkable rule in the naming conventions file. Next time, the problem can be found automatically.
go-namecheck is installed and used in the same way as go-consistent, except that you do not need to run the check on the entire set of packages and files to get a correct result.
None of the details discussed above is critical on its own, but together they affect overall consistency. We looked at code uniformity at the micro level, which does not depend on the architecture or other specifics of the application, because these aspects are the easiest to validate with almost no false positives.
If you liked the descriptions of go-consistent or go-namecheck above, try running them on your projects. Feedback is a truly valuable gift to me.
Important: if you have an idea or an addition, please let me know!
There are several ways:
Warning: adding go-consistent and/or go-namecheck to CI may be too radical a step. Running them once a month and then fixing all the inconsistencies may be a better solution.
Source: https://habr.com/ru/post/429354/