
Experience of using Go in Yandex production

I want to share our experience of using Go in Yandex production systems. In general, we are rather conservative about which languages to use for real systems, which only adds to the value of the experience we gained this time.

We started developing in Go last summer, when the Go framework for the Cocaine cloud platform appeared. Before that, the Browser's server API applications were written mainly in C++ and Python. At that time the server API was just moving to the cloud platform, and for the most part we were still deciding which technologies to use for it in the future. The API does the following: it receives data, processes it, sends it to an internal Yandex service, processes the result, and returns it to the Browser. It is a set of simple applications.


For us, C++ was obvious overkill, and development took a lot of time. Another big problem was that the C++ framework for Cocaine offered no way to work asynchronously other than callbacks. We made a lot of calls to various services, so the code soon turned into one big tangle of callbacks. It was very hard to scale and debug.

Python had somewhat different problems. First, it was very slow, even with PyPy. Second, dynamic typing could let errors slip through, so tests had to be written even where they could otherwise have been skipped. Still, developing such applications in Python was quite simple: faster and generally easier than in C++. There were no callbacks; the framework supported generators, so asynchronous code could be written as if it were synchronous.

And then we decided to try Go, having read up on it beforehand. Go is a compiled, multithreaded programming language with strong static typing (and duck typing for interfaces) and a garbage collector. It was developed at Google. Initial development began in September 2007, with Robert Griesemer, Rob Pike and Ken Thompson directly involved in its design. The language was officially announced in November 2009.

There were three of us: me, Vyacheslav Bakhmutov and Anton Tyurin. We assumed that Go would work better for us. Looking ahead, I will say that our expectations were confirmed. Let us now look in more detail at what exactly got better and why.

Development speed


In Go, small programs can be written faster than in C++ and about as fast as in Python. The language feels about the same to work with, too.

Go has a good standard library with almost everything you need. You rarely have to use external libraries, and when you do, in most cases you can simply run go get github.com/library/lib and the library will be installed.

Go makes it very easy to write asynchronous code: it looks just like synchronous code, but the Go runtime executes it asynchronously. There are no callbacks, of course. All of this is built on goroutines. Rob Pike describes goroutines as "something like threads, but more lightweight", similar to what other languages often call "green threads" or "fibers".

There are plenty of good IDEs and editor plugins for Go. I personally use the plugin for IntelliJ IDEA, and others use Sublime. A colleague who works on Go itself got along quite well with vim when I watched him work.
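To make the goroutine model concrete, here is a minimal sketch, not our production code. The functions lookupRegion and detectLanguage are hypothetical stand-ins for calls to remote services, and time.Sleep is assumed to be a fair imitation of waiting on the network. Each request is handled by straight-line code in its own goroutine, and the requests overlap in time.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// lookupRegion and detectLanguage are hypothetical stand-ins for calls to
// remote services; time.Sleep simulates the network round trip.
func lookupRegion(id int) string {
	time.Sleep(20 * time.Millisecond)
	return fmt.Sprintf("region-%d", id)
}

func detectLanguage(id int) string {
	time.Sleep(20 * time.Millisecond)
	return fmt.Sprintf("lang-%d", id)
}

// handle reads like ordinary synchronous code. While it waits, only its own
// goroutine is parked; the runtime keeps running the other goroutines.
func handle(id int) string {
	region := lookupRegion(id)
	lang := detectLanguage(id)
	return fmt.Sprintf("request %d: %s, %s", id, region, lang)
}

// serveAll runs one goroutine per request and waits for all of them.
func serveAll(n int) []string {
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = handle(i) // each goroutine writes only its own slot
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	start := time.Now()
	for _, line := range serveAll(100) {
		fmt.Println(line)
	}
	// The requests overlap, so the total is far below 100 * 40ms.
	fmt.Println("elapsed:", time.Since(start))
}
```

Note that handle contains no callbacks and no explicit scheduling; the concurrency lives entirely in the go statement inside serveAll.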



This is typical Go code: almost every method called here actually does asynchronous work. region.GetRegionByLbs goes asynchronously to the geobase squirrel and Lbs, findLanguage goes to langdetect, and the third method, the one with the long name, goes through urlfetcher to the Yandex string. As you can see, the code looks synchronous, which makes it convenient to write and debug.

Convenient testing and error detection


We try hard to cover our code with tests; sometimes we even write the tests before the code, although that is still quite rare. Here Go shows its very best side: testing and code coverage work out of the box. Coverage can be rendered as HTML that highlights the places in the code not covered by tests, and you can also get overall coverage figures per module. A third-party library with broader functionality can be used on top of the standard testing infrastructure, and that is what we do.



Tests are usually run with a command such as go test suggest/... which tests all the packages located under suggest.

Profiling also works out of the box: you can view the function call graph and execution times through the built-in web service. The graph looks the same as in Google Performance Tools.
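As an illustration of the standard testing package, here is a small table-driven test. It is only a sketch: normalizeQuery is a hypothetical helper invented for this example, and the function and its test are shown in one listing for brevity. In a real package the Test function would sit in a file whose name ends in _test.go, where go test picks it up. Coverage can then be collected with go test -coverprofile=cover.out and viewed as HTML with go tool cover -html=cover.out.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// normalizeQuery is a hypothetical helper: it lower-cases a query and
// collapses runs of whitespace into single spaces.
func normalizeQuery(q string) string {
	return strings.ToLower(strings.Join(strings.Fields(q), " "))
}

// TestNormalizeQuery is a table-driven test. In a real package it would live
// in a file whose name ends in _test.go, where "go test" finds and runs it.
func TestNormalizeQuery(t *testing.T) {
	cases := []struct{ in, want string }{
		{"Yandex", "yandex"},
		{"  Yandex   Browser ", "yandex browser"},
		{"", ""},
	}
	for _, c := range cases {
		if got := normalizeQuery(c.in); got != c.want {
			t.Errorf("normalizeQuery(%q) = %q, want %q", c.in, got, c.want)
		}
	}
}

func main() {
	fmt.Println(normalizeQuery("  Yandex   Browser "))
}
```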
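One common way to expose profiles over HTTP is the standard net/http/pprof package; whether this is exactly the setup described above is an assumption. In the sketch below, the port, the helper patternFor and the workload busyWork are hypothetical and exist only for illustration. While such a process is running, go tool pprof http://localhost:6060/debug/pprof/profile fetches a CPU profile from it.

```go
package main

import (
	"log"
	"net/http"
	"net/http/pprof"
)

// debugMux returns a mux that exposes the standard profiling endpoints.
func debugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/", pprof.Index) // also serves heap, goroutine, ...
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
	return mux
}

// patternFor reports which registered pattern would serve the given path,
// or "" if none matches.
func patternFor(mux *http.ServeMux, path string) string {
	req, err := http.NewRequest("GET", path, nil)
	if err != nil {
		return ""
	}
	_, pattern := mux.Handler(req)
	return pattern
}

// busyWork is a stand-in for the application's real CPU-bound work.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

func main() {
	mux := debugMux()
	log.Println("heap profile served by pattern:", patternFor(mux, "/debug/pprof/heap"))

	// Serve the profiling endpoints on a local-only port in the background.
	go func() {
		log.Println(http.ListenAndServe("localhost:6060", mux))
	}()

	// A real service would keep running here; this sketch does some work and exits.
	log.Println("result:", busyWork(100000000))
}
```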



There is a built-in thread sanitizer (the race detector). One of the people who build the sanitizers at Google also works on Go. Stack traces for errors are available without any rocket science.

There are no memory corruption errors in Go, which helps a lot, since not everyone manages to be careful all the time. We have an application, suggest-data, which loads 300 megabytes of data at startup. When it was written in C++, it sometimes crashed, which caused our admin some discomfort. The first crashes were caused by the Cocaine framework. That was fixed, but the crashes continued, and the second time we never fully understood the reasons; quite possibly we had written something wrong ourselves. The problem was complicated by the fact that the stack was often corrupted, so finding the cause of a crash was possible but hard. In the end we decided not to bother and rewrote the application in Go (it was only 200 lines of code). The crashes disappeared immediately. After the switch to Go, memory no longer gets corrupted, and if a nil pointer is dereferenced, we see it as a stack trace in the logs.

This is what the log looks like:
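A short sketch of what the race detector checks, using a hypothetical counter type made up for this example. As written, the program is free of data races; if the mutex is removed from inc, running it with go run -race (or go test -race) reports the unsynchronized access to c.n.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical shared counter protected by a mutex.
type counter struct {
	mu sync.Mutex
	n  int
}

func (c *counter) inc() {
	c.mu.Lock()
	c.n++ // without the lock, this line is a data race that -race reports
	c.mu.Unlock()
}

func (c *counter) value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countConcurrently increments one counter from several goroutines.
func countConcurrently(workers, perWorker int) int {
	var c counter
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				c.inc()
			}
		}()
	}
	wg.Wait()
	return c.value()
}

func main() {
	fmt.Println(countConcurrently(8, 1000)) // prints 8000
}
```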

 Wrong format of region (lr)
 /home/lamerman/work/omnibox/.../inside.go:109 (*Inside).getRegionData
 /home/lamerman/work/omnibox/.../inside.go:194 (*Inside).Call
 /home/lamerman/work/omnibox/.../main/main.go:57 *Inside.Call·fm
 /usr/local/go/src/pkg/net/http/server.go:1221 HandlerFunc.ServeHTTP
 /home/lamerman/work/go/src/github.com/.../httpreq.go:124 func·006
 /home/lamerman/work/go/src/github.com/.../worker.go:219 func·015
 /usr/local/go/src/pkg/runtime/proc.c:1394 goexit
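The sketch below shows, under stated assumptions, how a nil pointer dereference turns into a panic that can be caught and logged together with a stack trace. It is not the logging code that produced the output above; regionData, regionName and safeCall are hypothetical names, and the only library calls used are recover and runtime/debug.Stack.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// regionData is a hypothetical record used only for this illustration.
type regionData struct {
	name string
}

// regionName dereferences r without a nil check, so passing nil panics
// instead of silently reading or corrupting memory.
func regionName(r *regionData) string {
	return r.name
}

// safeCall runs f and, if f panics, returns the panic value together with
// the stack trace of the goroutine. It returns "" when f completes normally.
func safeCall(f func()) (report string) {
	defer func() {
		if r := recover(); r != nil {
			report = fmt.Sprintf("panic: %v\n%s", r, debug.Stack())
		}
	}()
	f()
	return ""
}

func main() {
	report := safeCall(func() {
		fmt.Println(regionName(nil)) // nil pointer dereference
	})
	fmt.Println(report)
}
```

The printed report contains the panic message followed by file names and line numbers, which is usually enough to find the offending call.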

Performance


To avoid wasting money on servers, the language has to be fast enough. Here is a comparison of Go with C++ and Python on standard benchmarks. The results against Python:



As you can see, Go is on average ten times faster. The same comparison against C++:



On average, Go is two to three times slower. Given how young the language is, there is reason to expect that it can still be sped up considerably.

I would also like to share our own observations about speed. Cocaine has a service for fetching content by URL, called urlfetcher. For certain reasons we currently use our own version of it, and we have two implementations: pyurlfetcher and gofetcher. As you can easily guess, they differ in the language they are written in, and they implement the same interface. Let's load-test them. For 10,000 calls, gofetcher used 2.52 seconds of CPU time, while pyurlfetcher used 19.5 seconds; in fairness, under PyPy it runs about twice as fast, that is, in about 10 seconds. So Go turns out to be roughly 4 times faster than Python under PyPy (10 / 2.52 ≈ 4) and roughly 8 times faster than CPython (19.5 / 2.52 ≈ 7.7). In other words, if you use CPython, you need to build 8 times as many data centers.

A comparison with C++ can also be made on one of our applications, suggest. The application powers the smart line in the Browser, taking data from the Yandex suggest service and from the wizards; in other words, it mostly does JSON processing and calls to various network services.

For 1,000 requests to suggest, the Go version uses 1.10 seconds of CPU time and the C++ version 0.57 seconds, so on this application Go is about two times slower than C++ (1.10 / 0.57 ≈ 1.9). The application itself is about 6 thousand lines of code.

Memory


Here is a snapshot of the memory usage of our handlers on a production server:

  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
 7855 cocaine  20  0  262m 9620 3744 S    1  0.0 249:35.82 barnavig
 8590 cocaine  20  0  324m  11m 3604 S    1  0.0  87:05.82 umaproto

As you can see, average memory consumption is low: about 10 megabytes per handler, provided there are no memory leaks or caches. Memory leaks are easy to debug with the built-in tooling.

Let's compare two identical applications, one in C++ and one in Go. Both store a lot of data, 300 megabytes each. As you can see, right after startup their memory consumption is almost identical:

   PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 14742 cocaine  20  0 1071m 388m 3376 S    0  0.3 26:04.85 suggestdata     (suggest data, Go)
  2734 cocaine  20  0  825m 345m 3388 S    0  0.5 23:47.80 suggest-data-pr (suggest data, C++)

Memory can also be profiled if you suspect a leak; more information about profiling can be found here.

Miscellaneous
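For a quick look at heap usage from inside a program, the standard runtime package provides ReadMemStats. The sketch below is only an illustration and not the profiling tool mentioned above; heapAlloc and heapGrowth are hypothetical helper names. It measures how much the live heap grows after a large allocation.

```go
package main

import (
	"fmt"
	"runtime"
)

// heapAlloc forces a garbage collection and returns the number of bytes
// held by live heap objects.
func heapAlloc() uint64 {
	runtime.GC()
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapAlloc
}

// heapGrowth allocates n bytes and reports how much the live heap grew.
func heapGrowth(n int) uint64 {
	before := heapAlloc()
	buf := make([]byte, n)
	after := heapAlloc()
	runtime.KeepAlive(buf) // keep buf reachable until after the measurement
	if after < before {
		return 0
	}
	return after - before
}

func main() {
	fmt.Printf("live heap at start: %d KiB\n", heapAlloc()/1024)
	fmt.Printf("growth after allocating 64 MiB: %d MiB\n", heapGrowth(64<<20)>>20)
}
```

If the reported number keeps growing between measurements of a long-running service while the workload stays the same, that is a hint to take a proper heap profile.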


Applications can be debugged with gdb, with all the nice things it offers. We had no problems debugging our programs under gdb; everything appears to work.

One more thing: every Go application is built, together with all the libraries it uses, into one large binary (static linking). This is usually convenient for deployment: there is no need to think about dependencies, and the same binary can be pushed to both lucid and precise regardless of which packages are installed there. The only dependency I have is libc. This approach has its drawbacks too.

Conclusion

I think Go can be useful first of all to those who write in Python but are not satisfied with the speed of their applications. Writing in Go can be as easy as writing in Python, while saving a lot of machine resources. For people who write in C++, Go can be useful where simple applications are needed. Personally, my productivity increased greatly after the switch.

Go is, of course, not perfect, and it is possible that I simply have not fully understood it yet. What I miss most is generics. They are planned for some point in the future, but their prospects are not yet entirely clear. Rob Pike himself said about this: "There are generics in Go; they are called interfaces." Indeed, some generic code can be written using interfaces, but in some cases that is not enough. The lack of generics is partly offset by reflection. Still, the FAQ and Pike assure us that generics will come.

Our Go applications have been running on Cocaine in production for about a year, and nothing fatal has ever happened. Go works, and in my opinion it works well.

Go is actively developed: there are many conferences dedicated to the language, and new versions that improve performance are released regularly. Go is used inside Google, Facebook, Docker, disqus.com ( http://blog.disqus.com/post/51155103801/trying-out-this-go-thing ) and many other large companies. The full list can be found here.
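To show what "generic code using interfaces" looks like in practice, here is a minimal sketch built on sort.Interface from the standard library. The suggestion type and the topTexts helper are hypothetical. The sorting algorithm is written once against the interface, and each type supplies three small methods.

```go
package main

import (
	"fmt"
	"sort"
)

// suggestion is a hypothetical record: a completion text with a weight.
type suggestion struct {
	text   string
	weight float64
}

// byWeight implements sort.Interface, ordering by descending weight.
type byWeight []suggestion

func (s byWeight) Len() int           { return len(s) }
func (s byWeight) Less(i, j int) bool { return s[i].weight > s[j].weight }
func (s byWeight) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// topTexts returns the texts of the n heaviest suggestions without
// modifying the input slice.
func topTexts(items []suggestion, n int) []string {
	sorted := make([]suggestion, len(items))
	copy(sorted, items)
	sort.Sort(byWeight(sorted)) // sort.Sort only knows about the interface
	if n < 0 {
		n = 0
	}
	if n > len(sorted) {
		n = len(sorted)
	}
	out := make([]string, 0, n)
	for _, s := range sorted[:n] {
		out = append(out, s.text)
	}
	return out
}

func main() {
	items := []suggestion{{"go", 0.2}, {"golang", 0.9}, {"gopher", 0.5}}
	fmt.Println(topTexts(items, 2)) // prints [golang gopher]
}
```

The limitation is visible here as well: every new element type needs its own Len, Less and Swap, which is exactly the kind of repetition that generics would remove.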

Source: https://habr.com/ru/post/237985/

