With the advent of Docker, our life as a monitoring service has become a little more complicated. As I wrote earlier, one of the features of our service is service autodetection: the agent itself finds the services running on the server, reads their configs, and starts collecting metrics.
But at some point Docker started showing up in our clients' production environments, and our autodetection stopped working. A process launched through Docker sits in its own namespaces (mnt, net, user, pid), which considerably complicates working with the files and the network inside the container from the outside.
Under the cut, I'll tell you how we solved this problem, what options we tried, and what worked in the end.
Our task can be divided into 2 parts:
- working with files inside the container: reading service configs and logs;
- working with the network inside the container: connecting to the discovered services to collect metrics.
The first hypothesis was very simple: figure out where the container's fs lives on the host disk, rewrite the paths, and read from there. Unfortunately, this only works with AUFS, which is almost never encountered in production.
Then we naively tried to call setns on the mnt namespace directly from the agent code, but this did not work either. The thing is, setns into an mnt (or user) namespace is only allowed for a single-threaded process; the kernel refuses it if the process is multithreaded.
Our agent is written in golang, and by the time we want to call setns, the Go runtime has already created several threads for us. For the agent to shell out to special tools like nsenter, we would first have to ship them to the client's machine, which we really wanted to avoid.
There was an option to run something through docker exec -ti, but, firstly, this command is only available since version 1.3; secondly, there are other containerization systems besides Docker; and thirdly, there may not even be a cat inside the container.
Then I found an interesting hack for go that allows you to call setns in a constructor, before the Go runtime starts. As a result, the agent re-executes itself with special arguments and can read a file in the required ns, run a glob over the container's file system, and so on. But since setns has to be executed from C code, the handling of the launch arguments also had to be written in C. And at the moment an __attribute__((constructor)) function is called, argv/argc are not yet initialized, so the launch arguments have to be read manually from /proc/self/cmdline.
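To make the trick concrete, here is a minimal sketch of such a re-exec helper. The -mnt-ns=<path> flag and the argument layout are made up for illustration; the real agent's interface is not shown in the post:

package main

/*
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <string.h>
#include <unistd.h>

// Runs before the Go runtime creates any extra threads, so the process
// is still single-threaded and setns() on the mnt namespace is allowed.
__attribute__((constructor))
static void enter_mnt_ns(void) {
    // argv/argc are not initialized yet, so read /proc/self/cmdline by
    // hand; it contains the arguments separated by NUL bytes.
    char buf[4096];
    int fd = open("/proc/self/cmdline", O_RDONLY);
    if (fd < 0)
        return;
    ssize_t n = read(fd, buf, sizeof(buf) - 1);
    close(fd);
    if (n <= 0)
        return;
    buf[n] = '\0';

    char *p = buf;
    while (p < buf + n) {
        // hypothetical flag: -mnt-ns=/proc/<pid>/ns/mnt
        if (strncmp(p, "-mnt-ns=", 8) == 0) {
            int nsfd = open(p + 8, O_RDONLY);
            if (nsfd >= 0) {
                setns(nsfd, CLONE_NEWNS);
                close(nsfd);
            }
            return;
        }
        p += strlen(p) + 1;
    }
}
*/
import "C"

import (
    "fmt"
    "io/ioutil"
    "os"
)

func main() {
    // By the time main() runs, the process is already inside the container's
    // mnt namespace, so ordinary file reads see the container's file system.
    data, err := ioutil.ReadFile(os.Args[2]) // e.g. /etc/nginx/nginx.conf
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    os.Stdout.Write(data)
}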
When launched in this mode, the agent dumps the result of its work to stdout/stderr, and the parent agent reads it. Separately, we had to put a limit on the size of a readable file: we don't even try to read files larger than 200KB (often these are weighty nginx configs with geoip mappings), since that can noticeably load the disk on the client's server.
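The guard itself can be as simple as this sketch (readLimited and the surrounding package are illustrative names, not the actual agent code):

package agent

import (
    "fmt"
    "io"
    "io/ioutil"
    "os"
)

const maxFileSize = 200 * 1024 // 200KB, as described above

// readLimited refuses to read files larger than maxFileSize.
func readLimited(path string) ([]byte, error) {
    st, err := os.Stat(path)
    if err != nil {
        return nil, err
    }
    if st.Size() > maxFileSize {
        return nil, fmt.Errorf("%s: %d bytes, refusing to read more than %d", path, st.Size(), maxFileSize)
    }
    f, err := os.Open(path)
    if err != nil {
        return nil, err
    }
    defer f.Close()
    // LimitReader guards against the file growing between Stat and Read
    return ioutil.ReadAll(io.LimitReader(f, maxFileSize))
}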
This approach works well when you need to read a file once, but it doesn't work if you need, for example, to tail a log. On the other hand, logs are not usually written to the container's own fs: they are either sent to the container's stdout/stderr and pass through dockerd, or written to partitions mounted from the host fs.
We still don't handle the dockerd variant, but it's worth noting that it's rarely seen among our customers, apparently because dockerd starts loading the CPU heavily under a large stream of logs.
For the case of mounted log directories, we try to find the needed file on the host fs using information from docker inspect, and the plugin that wants to parse such a log receives the path to the file already outside the container.
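The idea can be sketched like this: take the Mounts section from the docker inspect output and find the mount whose Destination covers the path we are interested in (the function and variable names here are illustrative):

package agent

import (
    "encoding/json"
    "fmt"
    "os/exec"
    "strings"
)

type mount struct {
    Source      string `json:"Source"`      // path on the host
    Destination string `json:"Destination"` // path inside the container
}

// hostPath maps a path inside the container to the corresponding host path.
func hostPath(containerID, pathInContainer string) (string, error) {
    out, err := exec.Command("docker", "inspect", containerID).Output()
    if err != nil {
        return "", err
    }
    var info []struct {
        Mounts []mount `json:"Mounts"`
    }
    if err := json.Unmarshal(out, &info); err != nil {
        return "", err
    }
    if len(info) == 0 {
        return "", fmt.Errorf("container %s not found", containerID)
    }
    // pick the mount with the longest matching Destination prefix
    var best mount
    for _, m := range info[0].Mounts {
        if strings.HasPrefix(pathInContainer, m.Destination) && len(m.Destination) > len(best.Destination) {
            best = m
        }
    }
    if best.Destination == "" {
        return "", fmt.Errorf("no mount covers %s", pathInContainer)
    }
    return best.Source + strings.TrimPrefix(pathInContainer, best.Destination), nil
}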
The first idea for how to talk over the network to a service in a container was also naive: take the container's IP from docker inspect and work with it. It then turned out that the host may have no access to the container network at all (macvlan). Besides, there is lxc and so on.
We decided to move toward setns. The network namespace, unlike mnt and user, can be changed for one particular thread of the application. In golang, at first glance, it all looks quite simple:
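Something along these lines (a sketch using golang.org/x/sys/unix; the ns path is the standard /proc/<pid>/ns/net):

package agent

import (
    "net"
    "runtime"
    "time"

    "golang.org/x/sys/unix"
)

// naive version: pin the goroutine to its OS thread, switch the thread's
// network namespace, and dial as usual.
func dialInNs(nsPath, addr string) (net.Conn, error) {
    runtime.LockOSThread()
    defer runtime.UnlockOSThread()

    fd, err := unix.Open(nsPath, unix.O_RDONLY, 0) // e.g. /proc/1234/ns/net
    if err != nil {
        return nil, err
    }
    defer unix.Close(fd)

    if err := unix.Setns(fd, unix.CLONE_NEWNET); err != nil {
        return nil, err
    }
    // from here on, this thread sees the container's network
    return net.DialTimeout("tcp", addr, time.Second)
}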
But everything turned out to be harder. In fact, when you lock a thread, the runtime does not guarantee that everything this goroutine triggers will stay in that thread. There is a good description of exactly such a case in the post "Linux Namespaces And Go Don't Mix".
Initially, we were going to run the plugin that monitors a service in a container entirely inside a locked thread with setns, but it broke on the very first http client.
Since we have no way to influence the Go scheduler, we started looking for a way to keep in that thread only code that does not lead to spawning new threads.
We noticed that if you make the tcp connection right after setns, it works 100% of the time, and if you then exit the namespace and release the thread lock, the open connection keeps working (it's hard to say exactly why, but most likely because a socket, once created, stays bound to the network namespace it was created in).
Then the task reduced to slipping our own Dialer (the function responsible for the TCP connect) into all the libraries for the services we monitor. For redis it looks like this:
client := redis.NewClient(&redis.Options{
    Dialer: func() (net.Conn, error) {
        return utils.DialTimeoutNs("tcp", params.Address, params.NetNs, redisTimeout)
    },
    ReadTimeout:  redisTimeout,
    WriteTimeout: redisTimeout,
    Password:     params.Password,
})
For mysql, the driver lets you register a custom dial function:

mysql.RegisterDial("netns", func(addr string) (net.Conn, error) {
    return utils.DialTimeoutNs("tcp", addr, params.NetNs, connectTimeout)
})
db, err = sql.Open("mysql", fmt.Sprintf("netns(%s)/?timeout=%s&readTimeout=%s&writeTimeout=%s",
    params.Address, connectTimeout, readTimeout, writeTimeout))
With postgres it turned out to be harder: lib/pq does not let you pass a dialer through database/sql directly, so we registered our own wrapper driver:

func init() {
    sql.Register("postgres+netns", &drv{})
}

type drv struct{}

type nsDialer struct {
    netNs string
}

func (d nsDialer) Dial(ntw, addr string) (net.Conn, error) {
    return utils.DialTimeoutNs(ntw, addr, d.netNs, connectTimeout)
}

func (d nsDialer) DialTimeout(ntw, addr string, timeout time.Duration) (net.Conn, error) {
    return utils.DialTimeoutNs(ntw, addr, d.netNs, timeout)
}

func (d *drv) Open(name string) (driver.Conn, error) {
    // the DSN is prefixed with "<netns>|", see the usage below
    parts := strings.SplitN(name, "|", 2)
    netNs := ""
    if len(parts) == 2 {
        netNs = parts[0]
        name = parts[1]
    }
    return pq.DialOpen(nsDialer{netNs}, name)
}
Then we connect through our driver:
dsn := fmt.Sprintf("%s|postgres://%s:%s@%s/%s", p.NetNs, p.User, p.Password, p.Address, dbName)
db, err := sql.Open("postgres+netns", dsn)
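The post does not show the body of utils.DialTimeoutNs, but based on the description above it could look roughly like this: enter the target namespace on a locked thread, dial, then restore the host namespace before handing the thread back to the scheduler (a sketch, assuming golang.org/x/sys/unix):

package utils

import (
    "fmt"
    "net"
    "runtime"
    "time"

    "golang.org/x/sys/unix"
)

// DialTimeoutNs dials addr inside the network namespace referenced by the
// file netNs (e.g. "/proc/<pid>/ns/net").
func DialTimeoutNs(network, addr, netNs string, timeout time.Duration) (net.Conn, error) {
    if netNs == "" {
        return net.DialTimeout(network, addr, timeout)
    }

    runtime.LockOSThread()
    defer runtime.UnlockOSThread()

    // remember the current (host) namespace so we can switch back
    hostNs, err := unix.Open("/proc/thread-self/ns/net", unix.O_RDONLY, 0)
    if err != nil {
        return nil, fmt.Errorf("open host netns: %v", err)
    }
    defer unix.Close(hostNs)

    targetNs, err := unix.Open(netNs, unix.O_RDONLY, 0)
    if err != nil {
        return nil, fmt.Errorf("open %s: %v", netNs, err)
    }
    defer unix.Close(targetNs)

    if err := unix.Setns(targetNs, unix.CLONE_NEWNET); err != nil {
        return nil, fmt.Errorf("setns %s: %v", netNs, err)
    }

    // the socket created here stays bound to the container's namespace
    // even after the thread is moved back to the host namespace below
    conn, dialErr := net.DialTimeout(network, addr, timeout)

    // switch the thread back so it is safe to return it to the scheduler
    if err := unix.Setns(hostNs, unix.CLONE_NEWNET); err != nil {
        panic(fmt.Sprintf("failed to restore host netns: %v", err))
    }
    return conn, dialErr
}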
Looking back, we have not regretted choosing the setns option: for example, the same code recently worked perfectly for a client with lxc.
The only service we haven't covered yet is the jvm in a container, but that is a completely different story.
Source: https://habr.com/ru/post/337964/