Go without global variables

This is a translation of an article by Dave Cheney, written in response to Peter Bourgon's earlier post "A theory of modern Go". It is a thought experiment: what would Go look like with no variables in the global scope at all? Some passages are a mouthful, but the post is quite interesting.

Let's do a thought experiment: what would Go look like if we got rid of variables at package scope? What would the implications be, and what can we learn about the design of Go programs from this experiment?

I am only talking about removing var; the other five top-level declarations remain permitted in our experiment, since they are effectively constant at compile time. And, of course, you can still declare variables inside functions and any other blocks.

Why are global variables in packages bad?

But first, let's answer the question: why are global variables in packages bad? Leaving aside the obvious problem of globally visible mutable state in a heavily concurrent language, package-level variables are essentially singletons, used to implicitly pass state between loosely related or unrelated things. They create tight coupling and make the code that depends on them difficult to test.

As Peter Bourgon recently wrote:

tl;dr: magic is bad; global state is magic → global variables in packages are bad; the init() function is not needed.

Getting rid of global variables in practice

To test this idea, I studied in detail the most popular Go code base, the standard library, to see how it uses global variables in packages, and tried to evaluate the effect of our experiment.

Errors

One of the most common uses of public package-level var declarations is errors: io.EOF, sql.ErrNoRows, crypto/x509.ErrUnsupportedAlgorithm, and so on. Without these variables, we could not compare errors against predefined values. But can we replace them with something?

I wrote earlier that, when inspecting errors, you should try to assert on behavior rather than on type. Where that is not possible, declaring errors as constants rules out modification of the error value and preserves its semantics.

The remaining error variables are private and simply give a symbolic name to an error message. These variables are not exported, so they cannot be used for comparison from outside the package. Declaring them at the top level of the package, rather than at the place where the error occurs, makes it impossible to add any extra context to the error. Instead, I recommend using something like pkg/errors to capture a stack trace into the error at the point where it occurs.
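A minimal sketch of both ideas, under stated assumptions: the Error type, ErrNotFound, the temporary interface and the lookup function are hypothetical names chosen for illustration, not identifiers from the standard library.

```go
package main

import "fmt"

// Error is a hypothetical string-based error type. Because its underlying
// type is string, values of it can be declared with const.
type Error string

// Error implements the built-in error interface.
func (e Error) Error() string { return string(e) }

// ErrNotFound is a constant, so no other package can reassign it.
const ErrNotFound = Error("not found")

// temporary describes a behavior rather than a concrete error type.
type temporary interface {
	Temporary() bool
}

// IsTemporary reports whether err declares itself to be temporary.
func IsTemporary(err error) bool {
	t, ok := err.(temporary)
	return ok && t.Temporary()
}

// lookup is a stand-in for an operation that can fail.
func lookup(key string) error {
	if key == "" {
		return ErrNotFound
	}
	return nil
}

func main() {
	err := lookup("")
	fmt.Println(err == ErrNotFound) // comparison by value still works: true
	fmt.Println(IsTemporary(err))   // Error has no Temporary method: false
}
```

Callers can still write err == ErrNotFound, but an assignment such as ErrNotFound = nil no longer compiles, because ErrNotFound is a constant.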

Registration

The registration pattern is used in several packages of the standard library, such as net/http, database/sql, flag, and to some extent log. It usually consists of a package-level variable of type map or a struct that is modified by some public function: the classic singleton.

Without the ability to create such a package-level placeholder that can be modified from outside, the image, database/sql and crypto packages could not register decoders, database drivers and cryptographic schemes. But this is exactly the magic that Peter talks about in his article: importing a package so that it implicitly changes the global state of another package really is spooky action at a distance.

Registration also encourages duplicated business logic. For example, net/http/pprof registers itself, as a side effect, with net/http.DefaultServeMux. This is a potential security problem, because other code can no longer use the default multiplexer without exposing the pprof endpoints, and registering the pprof handlers with a different multiplexer is not so trivial.
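For readers who have not met the pattern, here is a rough sketch of its usual shape. Decoder, Register, Registered and the plain type are hypothetical names; the real standard library packages differ in detail.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Decoder is a hypothetical plug-in interface.
type Decoder interface {
	Decode(data []byte) (string, error)
}

// Package-level state: this is the singleton under discussion.
var (
	mu       sync.Mutex
	decoders = make(map[string]Decoder)
)

// Register mutates the package-level map. Any package in the program can
// call it, usually from an init function triggered by a blank import.
func Register(name string, d Decoder) {
	mu.Lock()
	defer mu.Unlock()
	decoders[name] = d
}

// Registered returns the names of all registered decoders, sorted.
func Registered() []string {
	mu.Lock()
	defer mu.Unlock()
	names := make([]string, 0, len(decoders))
	for name := range decoders {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// plain is a trivial decoder used only for this illustration.
type plain struct{}

func (plain) Decode(data []byte) (string, error) { return string(data), nil }

// In real code this init would live in a different package and run only
// because that package was imported.
func init() { Register("plain", plain{}) }

func main() {
	fmt.Println(Registered())
}
```

Nothing in main mentions the plain decoder, yet it is available at run time. That implicit coupling is what the experiment would forbid.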

If there were no global variables in packages, a package such as net/http/pprof could provide a function that registers its URL paths on a supplied http.ServeMux, instead of depending on an implicit change to the global state of another package.

Removing the ability to use the registration pattern could also help solve the problem of multiple copies of the same package, which, when imported, all try to register themselves during startup.

Interface Satisfaction Check

There is an idiom for checking that a type implements an interface:

 var _ SomeInterface = new(SomeType)

It occurs at least 19 times in the standard library. In my opinion, such assertions are really tests. There is no need to compile them, only to throw them away again, every time the package is built. They should be moved to the corresponding _test.go files. But if we prohibit global variables in packages, the prohibition also applies to tests, so how can we keep this check?

One solution is to move the variable declaration from package scope into function scope; it will still fail to compile if SomeType ever stops satisfying the SomeInterface interface:

 func TestSomeTypeImplementsSomeInterface(t *testing.T) {
     // won't compile if SomeType does not implement SomeInterface
     var _ SomeInterface = new(SomeType)
 }

But since this is really just a test, we can rewrite the idiom as a regular test:

 func TestSomeTypeImplementsSomeInterface(t *testing.T) {
     var i interface{} = new(SomeType)
     if _, ok := i.(SomeInterface); !ok {
         // %T prints the dynamic type of i
         t.Fatalf("expected %T to implement SomeInterface", i)
     }
 }

As a side note, because the Go specification says that assignment to the blank identifier (_) requires full evaluation of the expression on the right-hand side, there are probably a few suspicious initializations hidden in those package-level declarations.

But it's not that simple

So far everything has gone smoothly and the experiment of getting rid of global variables looks like a success, but there are several places in the standard library where things are not so simple.

Real singletons

Although I believe the singleton pattern is, on the whole, overused, especially in its registration form, every program will always contain some real singletons. A good example is os.Stdout and friends.

 package os

 var (
     Stdin  = NewFile(uintptr(syscall.Stdin), "/dev/stdin")
     Stdout = NewFile(uintptr(syscall.Stdout), "/dev/stdout")
     Stderr = NewFile(uintptr(syscall.Stderr), "/dev/stderr")
 )

There are several problems with this declaration. First, Stdin, Stdout and Stderr are variables of type *os.File, not io.Reader or io.Writer interfaces. This makes replacing them with alternatives rather problematic. But the very idea of replacing them is exactly the kind of magic our experiment is trying to get rid of.

As the earlier example with constant errors showed, we can keep the singleton nature of the standard IO file descriptors, so that packages like log and fmt can use them directly, without declaring them as mutable global variables. Something like this:

 package main

 import (
     "fmt"
     "syscall"
 )

 // readfd is a file descriptor that can be read from.
 type readfd int

 func (r readfd) Read(buf []byte) (int, error) {
     return syscall.Read(int(r), buf)
 }

 // writefd is a file descriptor that can be written to.
 type writefd int

 func (w writefd) Write(buf []byte) (int, error) {
     return syscall.Write(int(w), buf)
 }

 const (
     Stdin  = readfd(0)
     Stdout = writefd(1)
     Stderr = writefd(2)
 )

 func main() {
     fmt.Fprintf(Stdout, "Hello world")
 }

Caches

The second most popular use of unexported global variables in packages is caches. They come in two forms: real caches, made from maps (see the registration pattern above) or sync.Pool, and quasi-constant variables that amortize the cost of a compilation.

An example of the latter is the crypto/ecdsa package, which has a zr type whose Read() method zeroes any buffer passed to it. The package keeps a single instance of zr around because it is embedded in other structs as an io.Reader, potentially escaping to the heap each time it is instantiated.

 package ecdsa

 type zr struct {
     io.Reader
 }

 // Read replaces the contents of dst with zeros.
 func (z *zr) Read(dst []byte) (n int, err error) {
     for i := range dst {
         dst[i] = 0
     }
     return len(dst), nil
 }

 var zeroReader = &zr{}

But zr does not need to embed an io.Reader, because it is an io.Reader itself. We can therefore remove the unused zr.Reader field, making zr an empty struct. In my tests, this modified type can be created directly where it is used without any loss in performance:

 csprng := cipher.StreamReader{
     R: zr{},
     S: cipher.NewCTR(block, []byte(aesIV)),
 }

It may be worth revisiting some of these caching decisions, since inlining and escape analysis have improved greatly since the standard library was first written.

Tables

The last major use of private global variables in packages is tables, for example in the unicode, crypto/* and math packages. These tables usually encode constant data as arrays of integers or, more rarely, as simple structs or maps.

Replacing these global variables with constants would require a change to the language, something like the one described here. So, provided there is no way to modify these tables while the program is running, they are probably a reasonable exception to this proposal.

A bridge too far

Although this post was only a thought experiment, it is already clear that prohibiting all global variables in packages is too draconian to be workable in a real language. Working around the ban could be very impractical in terms of performance, and it would be like hanging a "kick me" sign on your back and inviting every Go hater to have fun.

At the same time, I think there are some very specific recommendations that can be drawn from this thought experiment without going to extremes and changing the language specification:

First, public var declarations are best avoided. This is not a controversial position, and it is certainly not unique to Go. The singleton pattern is discouraged, and a bare public variable that can be changed at any time by anyone who knows its name should automatically be a red flag.
Second, where a public variable is declared, pay close attention to its type and keep it as simple as possible. Do not, by default, take a type that is meant to be used on a per-instance basis and assign it to a variable in the package's global scope.

Private declarations of global variables are more nuanced, but some patterns can be observed:

Private variables with public setters (Set() functions), which I call "registries", have in effect the same impact as their public counterparts. Instead of registering dependencies in the global scope, pass them in at creation time using constructor functions, literals, configuration structs, or option functions.
Caches of the []byte kind can often be declared as constants without loss of performance. Do not forget that the compiler is good at optimizing conversions such as string([]byte) when the result does not escape the function.
Private variables that hold tables, as in the unicode package, are an unavoidable consequence of Go having no constant array type. As long as they are private and offer no way to modify them, they can be treated as constants for the purposes of this discussion.

To summarize: think twice, and then a third time, before adding a global variable to a package if its value can change while the program is running. It may be a sign that you have introduced magical global state.

Source: https://habr.com/ru/post/331048/
