Controlling code consistency in Go

If you consider consistency an important part of quality code, this article is for you.

It covers:

Different ways of doing the same thing in Go (equivalent operations)
Less obvious factors that affect the uniformity of your code
Ways to increase the consistency of your project

Consistency is our everything

First, let's define what we call "consistency".

The more a program's source code looks as if it were written by one person, the more consistent it is.

Even a single person's preferences can change over time, but the problem of consistency becomes truly acute in large projects with many developers.

Sometimes the words "uniformity" or "homogeneity" are used instead of "consistency". In this article, I will occasionally use such contextual synonyms to avoid repetition.

There are different levels of consistency. Three of the most obvious are:

Consistency within a single source file
Consistency at the package (or library) level
Consistency at the level of the entire project (if it is controlled by one vendor)

The further down the list, the harder consistency is to maintain. At the same time, a lack of consistency within a single source file looks the most repulsive.

You can also go down from the file to the function level or a single statement, but this, in our case, is already too much detail. Toward the end of the article it will become clear why.

Equivalent Go operations

Go does not have many operations that are identical in meaning but spelled differently (a purely syntactic difference), yet there is still room for disagreement. Some developers like option A, while others prefer B. Both options are valid and have their supporters. Using either form is permissible and not an error, but using more than one form can harm the consistency of the code.

What do you think, which of these two ways to create a slice of length 100 is used by most Go programmers?

 // (A)
 new([100]T)[:]
 // (B)
 (&[100]T{})[:]

Answer

Neither option is preferred. I have never seen either of them in real code.

What is actually used in this case is make([]T, 100).
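As a quick sanity check, the sketch below shows that all three spellings produce a slice of length and capacity 100, so the choice between them is purely stylistic. The element type int and the function names are picked here for illustration only.

```go
package main

import "fmt"

// Three spellings that all yield a []int of length 100.

func viaNew() []int { return new([100]int)[:] }

func viaLiteral() []int { return (&[100]int{})[:] }

func viaMake() []int { return make([]int, 100) }

func main() {
	for _, s := range [][]int{viaNew(), viaLiteral(), viaMake()} {
		fmt.Println(len(s), cap(s)) // 100 100 for each form
	}
}
```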

Single import

Importing a single package can be done in two ways:

 // (A) without parentheses
 import "github.com/go-lintpack/lintpack"
 // (B) with parentheses
 import (
     "github.com/go-lintpack/lintpack"
 )

Neither gofmt nor goimports converts one form into the other. Most likely, your project contains both.

Allocating a pointer to a zero value

Since Go has both the built-in new function and alternative ways of obtaining a pointer to a new object, you will come across both new(T) and &T{}.

 // (A) new
 new(T)
 new([]T)
 // (B) address of a composite literal
 &T{}
 &[]T{}
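A small sketch of what the two forms have in common, and one caveat. For a struct type both forms give a pointer to the zero value. For slices they are not strictly identical: new([]T) points at a nil slice, while &[]T{} points at an empty, non-nil one. The point type and the function names are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// point is a hypothetical type used only for this illustration.
type point struct{ X, Y int }

func viaNew() *point { return new(point) }

func viaLiteral() *point { return &point{} }

func main() {
	a, b := viaNew(), viaLiteral()
	fmt.Println(*a == *b) // true: both point to the zero value
	fmt.Println(a == b)   // false: two distinct allocations

	// Caveat for slices: the pointed-to values differ in nil-ness.
	s1, s2 := new([]int), &[]int{}
	fmt.Println(*s1 == nil, *s2 == nil) // true false
}
```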

Creating an empty slice

To create an empty slice (not to be confused with a nil slice), there are at least two popular methods:

 // (A) make
 make([]T, 0)
 // (B) slice literal
 []T{}
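To make the "empty versus nil" distinction concrete, here is a minimal sketch. Both spellings give a non-nil slice of length 0, whereas a declared but uninitialized slice is nil. The element type int and the function names are assumptions made for the example.

```go
package main

import "fmt"

func viaMake() []int { return make([]int, 0) }

func viaLiteral() []int { return []int{} }

func main() {
	var nilSlice []int
	fmt.Println(viaMake() == nil, len(viaMake()))       // false 0
	fmt.Println(viaLiteral() == nil, len(viaLiteral())) // false 0
	fmt.Println(nilSlice == nil, len(nilSlice))         // true 0
}
```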

Creating an empty hash table

You may find it odd that empty slices and empty maps are treated separately, but not everyone who prefers []T{} also uses map[K]V{} instead of make(map[K]V). The distinction is therefore not redundant.

 // (A) make
 make(map[K]V)
 // (B) map literal
 map[K]V{}

Hex Literals

 // (A) a-f, lower case
 0xff
 // (B) A-F, upper case
 0xFF

A literal such as 0xFf, with mixed case, is no longer a question of consistency. It should be caught by a static analyzer (linter). How? Try gocritic, for example.

Range check

In mathematics (and in some programming languages, but not in Go) you can describe a range as low < x < high. Go source code cannot express this constraint in that form. There are at least two popular ways to write a range check:

 // (A) aligned to the left
 x > low && x < high
 // (B) aligned to the center
 low < x && x < high

The and-not operator

Did you know that Go has a binary operator &^? It is called and-not, and it performs the same operation as & applied to the result of unary ^ on the right (second) operand.

 // (A) the &^ operator (no space)
 x &^ y
 // (B) & combined with ^ (with a space)
 x & ^y
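The equivalence is easy to check on a concrete value. The sketch below uses uint8 operands and two hypothetical helper functions, one per spelling.

```go
package main

import "fmt"

// andNot uses the dedicated and-not operator.
func andNot(x, y uint8) uint8 { return x &^ y }

// andOfNot spells the same operation as & applied to ^y.
func andOfNot(x, y uint8) uint8 { return x & ^y }

func main() {
	x, y := uint8(0b1111_0000), uint8(0b1010_1010)
	fmt.Printf("%08b\n", andNot(x, y))   // 01010000
	fmt.Printf("%08b\n", andOfNot(x, y)) // 01010000
}
```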

I ran a survey to make sure that preferences really do differ here. After all, if the choice in favor of &^ were unanimous, this would be a linter check rather than a question of code consistency. To my surprise, both forms found supporters.

Floating-point literals

There are many ways to write a floating-point literal, but one of the most common things that breaks consistency even within a single function is how the integer and fractional parts are written (abbreviated or in full).

 // (A) explicit integer and fractional parts
 0.0
 1.0
 // (B) implicit (omitted) zero parts
 .0
 1.

LABEL or label?

Unfortunately, there are no established conventions for naming labels. All that remains is to pick one of the possible styles and stick to it.

 // (A) all upper case
 LABEL_NAME:
     goto PLEASE
 // (B) upper camel case
 LabelName:
     goto TryTo
 // (C) lower camel case
 labelName:
     goto beConsistent

snake_case is also possible, but I have not seen such labels anywhere except in Go assembler. Most likely, this style is best avoided.

Type specification for an untyped numeric literal

 // (A) type to the left of "="
 var x int32 = 10
 const y float32 = 1.6
 // (B) type to the right of "="
 var x = int32(10)
 const y = float32(1.6)
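Both spellings give the declared name the same type and value, which the following sketch prints. The identifiers a, b, c and d are renamed from the snippet above only so that both forms can live in one file.

```go
package main

import "fmt"

// (A) type to the left of "="
var a int32 = 10

const b float32 = 1.6

// (B) type to the right of "="
var c = int32(10)

const d = float32(1.6)

func main() {
	fmt.Printf("%T %T\n", a, c) // int32 int32
	fmt.Printf("%T %T\n", b, d) // float32 float32
	fmt.Println(a == c, b == d) // true true
}
```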

Placement of the closing parenthesis in a multi-line call

Simple calls that fit on one line cause no trouble with parentheses. When, for one reason or another, a function or method call spreads over several lines, several degrees of freedom appear. For example, you need to decide where to put the closing parenthesis of the argument list.

 // (A) closing parenthesis on the same line as the last argument
 multiLineCall(
     a,
     b,
     c)
 // (B) closing parenthesis on its own line
 multiLineCall(
     a,
     b,
     c,
 )

Check for non-zero length

 // (A) "not equal to 0"
 len(xs) != 0
 // (B) "greater than 0"
 len(xs) > 0
 // (C) "at least 1"
 len(xs) >= 1

For strings, s != "" or s == "" is usually used.

Position of the default label in switch

There are two reasonable choices: put default first or put it last. Anything else, such as "somewhere in the middle", is a job for a linter. The defaultCaseOrder check from gocritic will help find the more idiomatic variant, while go-consistent will suggest one of the two options that makes the code more uniform.

 // (A) default first
 switch {
 default:
     return "?"
 case x > 10:
     return "more than 10"
 }
 // (B) default last
 switch {
 case x > 10:
     return "more than 10"
 default:
     return "?"
 }
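The position of default does not change behavior, since Go evaluates the case expressions first and falls back to default only when none of them match. A minimal sketch, with hypothetical function names wrapping the two variants:

```go
package main

import "fmt"

// (A) default first
func describeDefaultFirst(x int) string {
	switch {
	default:
		return "?"
	case x > 10:
		return "more than 10"
	}
}

// (B) default last
func describeDefaultLast(x int) string {
	switch {
	case x > 10:
		return "more than 10"
	default:
		return "?"
	}
}

func main() {
	for _, x := range []int{5, 42} {
		fmt.Println(describeDefaultFirst(x) == describeDefaultLast(x)) // true
	}
}
```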

go-consistent

Above, we went through a list of equivalent operations.

How do you decide which one to use? The simplest answer: the one that is used more frequently in the part of the project under consideration (or, as a special case, in the whole project).

The go-consistent program analyzes the specified files and packages and counts the uses of each alternative. It then suggests replacing the less frequent forms with the ones that are idiomatic within the analyzed part of the project, that is, those with the highest frequency.
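The counting idea can be sketched with the standard go/ast package. This is not how go-consistent itself is implemented; it is an illustrative approximation that counts only two forms, new(T) and &T{}, in a hypothetical sample source, and the function name countZeroPtrForms is made up for the example. Ties are resolved in favor of new(T) arbitrarily.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// countZeroPtrForms counts new(T) calls and &T{} literals in src.
// It only looks at syntax, so a shadowed "new" would be miscounted.
func countZeroPtrForms(src string) (newCalls, addrLits int, err error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return 0, 0, err
	}
	ast.Inspect(f, func(n ast.Node) bool {
		switch n := n.(type) {
		case *ast.CallExpr:
			if id, ok := n.Fun.(*ast.Ident); ok && id.Name == "new" && len(n.Args) == 1 {
				newCalls++
			}
		case *ast.UnaryExpr:
			if lit, ok := n.X.(*ast.CompositeLit); ok && n.Op == token.AND && len(lit.Elts) == 0 {
				addrLits++
			}
		}
		return true
	})
	return newCalls, addrLits, nil
}

// sample is a hypothetical file with two new(T) uses and one &T{}.
const sample = `package p

type T struct{}

func f() {
	_ = new(T)
	_ = new(T)
	_ = &T{}
}
`

func main() {
	newCalls, addrLits, err := countZeroPtrForms(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(newCalls, addrLits) // 2 1
	if newCalls >= addrLits {
		fmt.Println("suggest new(T) instead of &T{}")
	} else {
		fmt.Println("suggest &T{} instead of new(T)")
	}
}
```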

Straightforward counting

At the moment, every occurrence has a weight of one. Sometimes this means that a single file dictates the style of the entire package simply because the operation is used more often in that file. This is especially noticeable for rare operations, such as creating an empty map.

Whether this is the optimal strategy is not yet clear. This part of the algorithm would be easy to change, or users could be allowed to choose among several strategies.
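The effect of weight-one counting can be shown with made-up numbers. In the sketch below, three files each use make(map[K]V) once, while a single larger file uses map[K]V{} five times, and the larger file wins. The file names, counts and the formCounts type are all hypothetical.

```go
package main

import "fmt"

// formCounts holds how often each spelling occurs in one file.
type formCounts struct{ makeMap, literalMap int }

// majority sums per-file counts, giving every occurrence a weight of one.
func majority(files map[string]formCounts) string {
	var total formCounts
	for _, c := range files {
		total.makeMap += c.makeMap
		total.literalMap += c.literalMap
	}
	if total.literalMap > total.makeMap {
		return "map[K]V{}"
	}
	return "make(map[K]V)"
}

func main() {
	// Hypothetical package: three files prefer make, one large file does not.
	files := map[string]formCounts{
		"a.go":   {makeMap: 1},
		"b.go":   {makeMap: 1},
		"c.go":   {makeMap: 1},
		"big.go": {literalMap: 5},
	}
	fmt.Println(majority(files)) // map[K]V{}
}
```

A per-file vote would pick make(map[K]V) here (three files against one), which is one of the alternative strategies a tool could offer.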

If $(go env GOPATH)/bin is in the system PATH, the following commands will install go-consistent and print its help:

 go get -v github.com/Quasilyte/go-consistent
 go-consistent --help # print usage information

Returning to the levels of consistency, here is how to check each of them:

To check consistency within a single file, run go-consistent on that file.
To check consistency within a package, run it with a single package argument (or with all the files of that package listed).
To check global consistency, pass all packages as arguments.

go-consistent is designed in such a way that it can provide an answer even for huge repositories, where it is quite difficult to load all the packages into memory at the same time (at least on a personal machine without a huge amount of RAM).

Another important feature is the zero configuration. Running go-consistent without any flags or configuration files is what works in 99% of cases.

Warning: the first run on a project may produce a large number of warnings. This does not mean that the code is poorly written. It is simply rather difficult to control consistency at such a micro level, and it is not worth the effort if it is done exclusively by hand.

go-namecheck

Inconsistent naming of function parameters or local variables can also reduce the consistency of the code.

For most Go programmers, it is obvious that erro is a worse name for an error than err. But what about s versus str?

The task of checking the consistency of variable names cannot be solved with the go-consistent approach. It is hard to do without a manifest of local conventions.

go-namecheck defines the format of such a manifest and makes it possible to validate code against it, which makes it easier to follow the naming standards defined in the project.

For example, you can specify that for parameters of functions of the type string , you should use the identifier s instead of str .

This rule is expressed in the following way:

 {"string": {"param": {"str": "s"}}}

string is a regular expression that matches the type of interest.
param is the scope in which the replacement rules apply. Several scopes may be specified.
The pair "str": "s" indicates a replacement from str to s. There may be several such pairs.

Instead of a one-to-one replacement, you can use a regular expression that matches more than one identifier. Here, for example, is a rule that requires replacing the re prefix with the RE suffix for variables of type *regexp.Regexp. In other words, instead of reFile, the rule requires fileRE.

 {
   "regexp\\.Regexp": {
     "local+global": {"^re[A-Z]\\w*$": "use RE suffix instead of re prefix"}
   }
 }

All types are considered ignoring pointers. Any level of indirection will be removed, so there is no need to define separate rules for pointers to the type and the type itself.

A file that describes both rules would look like this:

 {
   "string": {
     "param": {
       "str": "s",
       "strval": "s"
     }
   },
   "regexp\\.Regexp": {
     "local+global": {"^re[A-Z]\\w*$": "use RE suffix instead of re prefix"}
   }
 }
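To see how such a file could be interpreted, here is a rough sketch that loads the two rules and applies them to a few identifiers. It is not the go-namecheck implementation. The rules type, the loadRules and check functions, splitting scopes on "+", matching the type pattern anywhere in the type name and anchoring the name pattern to the whole identifier are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// rules maps: type pattern -> scope list -> name pattern -> message.
type rules map[string]map[string]map[string]string

const manifest = `{
  "string": {"param": {"str": "s", "strval": "s"}},
  "regexp\\.Regexp": {
    "local+global": {"^re[A-Z]\\w*$": "use RE suffix instead of re prefix"}
  }
}`

// loadRules decodes a JSON manifest into rules.
func loadRules(data string) (rules, error) {
	var r rules
	err := json.Unmarshal([]byte(data), &r)
	return r, err
}

// check returns the messages of all rules that match the given type,
// scope and identifier. Leading pointer stars are stripped from the type.
func check(r rules, typ, scope, name string) []string {
	var msgs []string
	typ = strings.TrimLeft(typ, "*")
	for typePat, scopes := range r {
		if !regexp.MustCompile(typePat).MatchString(typ) {
			continue
		}
		for scopeList, names := range scopes {
			// Assumption: "local+global" means the rule applies to either scope.
			if !strings.Contains("+"+scopeList+"+", "+"+scope+"+") {
				continue
			}
			for namePat, msg := range names {
				// Assumption: the pattern must match the whole identifier.
				if regexp.MustCompile("^(?:" + namePat + ")$").MatchString(name) {
					msgs = append(msgs, msg)
				}
			}
		}
	}
	return msgs
}

func main() {
	r, err := loadRules(manifest)
	if err != nil {
		panic(err)
	}
	fmt.Println(check(r, "string", "param", "str"))            // [s]
	fmt.Println(check(r, "*regexp.Regexp", "local", "reFile")) // [use RE suffix instead of re prefix]
	fmt.Println(check(r, "*regexp.Regexp", "local", "fileRE")) // []
}
```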

It is assumed that a project starts with an empty file. Then, at some point during code review, someone asks to rename a variable or a struct field. A natural response is to ask that this previously informal requirement be recorded as a checkable rule in the naming conventions file. Next time, the problem can be found automatically.

go-namecheck is installed and used in the same way as go-consistent, except that you do not need to run the check on the entire set of packages and files to get a correct result.

Conclusion

The features discussed above are not critical individually, but together they affect overall consistency. We looked at code uniformity at the micro level, which does not depend on the architecture or other features of the application, because these aspects are the easiest to validate with an almost zero rate of false positives.

If you liked the descriptions of go-consistent or go-namecheck above, try running them on your projects. Feedback is a truly valuable gift for me.

Important: if you have any ideas or additions, please share them!
There are several ways to do so:

Write in the comments to this article.
Create an issue in go-consistent.
Implement your own utility and let the world know about it.

Warning: adding go-consistent and/or go-namecheck to CI may be too radical a step. Running them once a month and then fixing all the inconsistencies may be a better solution.

Source: https://habr.com/ru/post/429354/
