
The modern back office of an IT company

In a recent discussion I listed the main systems that make the work of an IT company civilized. The list turned out to be quite extensive, and I decided to publish it as a separate article.

A similar setup can be seen in many companies; moreover, I have observed companies where some of these systems were missing for a long time, and then, under the pressure of otherwise unsolvable recurring problems, they began to appear spontaneously.

Everything below applies to companies / departments staffed by qualified people, that is, people who do not need "office basics for beginners" courses, just as they do not need group policies on workstations or a dedicated admin for rearranging desktop shortcuts and installing a favorite program. In other words, this is the back office of IT people, which differs significantly from the back office of other departments.
A brief spoiler of the contents: VCS, a source code repository, code review, a build server, CI, a task tracker, a wiki, a corporate blog, functional testing, a package repository, a configuration management system, backups, mail / jabber.

[Figure: a fragment of the infrastructure under discussion]



So, let's start with the simple things.

Workplaces: a computer with buttons (about 90-100 of them) for each workplace. An external / second / third monitor is also desirable. Laptops are typical, MacBooks especially common. Users have admin rights or sudo on their machines and pick their own set of convenient software: editor, debugger, email client, browser, terminal, etc.

The Internet. Offices usually have WiFi and BYOD (in other words, people freely bring in their own laptops, tablets, and phones that abuse the office WiFi). Often there is no wired network at all. Security on such a network is nominal, since all communications are encrypted anyway. A lot of Internet is needed, and not only for cat videos on YouTube during working hours, but also for the sudden urgent "download a DVD of raw data right now to compare package versions" (a real-life case, by the way). Given Stack Overflow and the rest of the IT resources, the Internet should be unlimited, unfiltered, and the faster the better.

That was the simple part. Now for the serious things.

The version control system (VCS) should be shared; "to each his own" does not work here. The de facto standard is git; mercurial (hg) is reasonably popular, bazaar (bzr) is exotic, and svn / cvs / rcs are from the last century. Plus, the Windows world is a world unto itself, with something of its own there.

The version control system works locally, so there should be a central source code repository that everyone (who is supposed to push) pushes to. Gitlab is very, very popular. There are proprietary solutions, and there is github for those too lazy to host their own. The central repository also solves the second important thing: pull requests, so that one person can see what another did before it gets into the main branch (the common repository). Note that pull requests make sense even if code review as such is not carried out. The principle is simple: one person writes, another merges.
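The "one writes, the other merges" principle can be sketched on a throwaway local repository (all names and paths below are made up; in real life the second half happens through a merge request in gitlab):

```shell
# Author's side: work goes into a feature branch, never straight into the main branch.
set -e
work=$(mktemp -d); cd "$work"
git init -q
git config user.email dev@example.com
git config user.name dev
main=$(git symbolic-ref --short HEAD)   # 'master' or 'main', depending on git version

echo "v1" > app.txt
git add app.txt
git commit -qm "initial version"

git checkout -q -b feature/fix          # the author writes here
echo "v2" > app.txt
git commit -qam "fix the bug"

# Reviewer's side: look at the diff, then merge into the main branch.
git checkout -q "$main"
git diff "$main"..feature/fix --stat    # what the reviewer would look at
git merge -q --no-ff feature/fix -m "merge after review"
git log --oneline | wc -l               # two commits plus the merge commit
```

The `--no-ff` merge keeps a merge commit in history, so it stays visible that the change went through a second pair of hands.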

If the code is complex, you need a system for code review. Code review implies that programmers (or system administrators, devops, all of those) look at each other's code, and that there is some formal procedure for accepting it, for example: "at least two people must look and approve, one of whom must be a senior / lead". Examples: gerrit, Crucible. If the complexity is borderline, you can try to keep the agreement voluntarily, discussing things in comments in gitlab. But since everything is voluntary and no robots are watching, it will occasionally misfire.

In its minimal form, managing a team of people is done with a task tracker: redmine, jira, mantis. Most often it doubles as a bug tracker. The main goal is to formalize the statement of a task, remove ambiguities, and find out who is at fault when something was done wrong ("you said so!", "no, you misunderstood!": after that you look at the text of the task and it becomes clear who missed what). With a task tracker, verbal instructions should be diligently stamped out, especially by managers / team leads. Need something changed? File a task. The amount of bureaucracy is minimal; the amount of chaos drops severalfold.

Having your own wiki can be considered practically obligatory: mediawiki, moin-moin, confluence, dokuwiki, anything where you can write articles visible to other team members and edit what others wrote. An ideal place to deposit "how to do this" texts, regulations, discussion outcomes, planned architectural decisions, and explanations of why things will be done this way and not otherwise. A well-structured wiki is good, but even an indiscriminate pile of texts is more useful than an oral tradition that fades away along with the departed employee who "knew all of this".

If the wiki supports blogging, good; if not, you must either agree on a blogging format within the wiki, or set up something internal for it. What to write in this blog? Spent 4 hours catching a strange bug in a config? Describe it in the blog: next time you will be the one searching for it, and reading is fast while redoing it yourself is slow. Started digging into a long, tangled problem that won't fit entirely in your head? Instead of using a text editor as a notebook, write to the wiki. Sometimes a colleague will read what is already written right while you are still debugging and suggest a shorter solution. And at some point the company blog becomes perhaps the most valuable resource in difficult situations (No. 2 after the Internet).

Writing code that works on the workstation where it was written is a rare luxury. More often (and increasingly so) the code is middleware, a layer between other large pieces of server code, and requires an extensive environment for productive work. "To debug, this application needs mysql with a copy of the production database, memcached, redis, and snmp access to the switch." Raising such an environment on a workstation is no fun at all. And there may be several projects, each with its own environment.

Thus we get the first difficult thing: stands for programmers. In real life this could be a microcontroller connected via usb, or a hadoop server farm. What matters is that a programmer has his own stand, at least resembling the production configuration, where he can check the results of his work as soon as it is written. Do not skimp here: each programmer should have his own stand. If that is too expensive, mocks must be raised; if mocks cannot be raised, the company has a problem: programmers write "on production", and if there are several programmers, they do it simultaneously. If programmers are not allowed onto production, they write blindly; goodbye, productivity.
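One contemporary way to make such a stand cheap and reproducible is a container composition. Below is a sketch for the example environment quoted above (image versions, paths, and the app name are assumptions, and the snmp access to a physical switch obviously cannot be containerized):

```yaml
# docker-compose.yml: a hypothetical per-developer stand for an app that
# needs mysql (with a copy of the database), memcached, and redis.
version: "3"
services:
  app:
    build: .                      # the developer's own code
    depends_on: [mysql, memcached, redis]
  mysql:
    image: mysql:5.7
    environment:
      MYSQL_ROOT_PASSWORD: devonly        # acceptable only on a throwaway stand
    volumes:
      - ./db-copy:/docker-entrypoint-initdb.d   # dump of the working database
  memcached:
    image: memcached:1.6
  redis:
    image: redis:6
```

With a description like this, "raise my own stand" becomes a single command per developer instead of a day of manual setup.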

Next comes the question of how code gets into production. Most often these are packages (deb / rpm), executable files (exe), or plain static content (html). Note that it makes sense to wrap even "boring static content" into a package. There are teams that deploy directly from git (a specific branch or tag, or even master, on the assumption that development happens in other branches).

Building packages can be very confusing and difficult, especially if the code is not written from scratch but depends on existing configurations and a pile of other packages. It makes sense to set up a package building system. For this, CI (continuous integration) is used, in the minimal configuration often with manual control (go to the interface and click "run" on the package-build task). The opensource standard is jenkins; of the best-known proprietary ones, teamcity. The minimal configuration simply takes the specified branch / tag / repository and builds the package, which can then be downloaded from the CI interface.
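At its core, the minimal CI job is just a script. A sketch of that core (a throwaway local repository stands in for the real one, and a versioned tarball stands in for a proper deb / rpm):

```shell
# Prepare a throwaway repository with a tagged release to build from.
set -e
work=$(mktemp -d); cd "$work"
git init -q repo; cd repo
git config user.email ci@example.com
git config user.name ci
echo 'print("hello")' > app.py
git add app.py
git commit -qm "release candidate"
git tag v1.0

# The part a real CI job would run for a requested tag:
TAG=v1.0
git archive --format=tar.gz --prefix="app-$TAG/" -o "../app-$TAG.tar.gz" "$TAG"
ls ../app-*.tar.gz    # the artifact the CI interface would offer for download
```

A real job would of course run `dpkg-buildpackage` or `rpmbuild` instead of `git archive`, but the shape is the same: given a branch / tag, produce a downloadable, versioned artifact.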

But everyone is used to aptitude install, and for that you need to set up your own mirror, or package repository. The same CI can upload packages to the repository. One click of the mouse, and the new code can be installed on every server where the repository is configured. Note that having repositories lets you quickly "roll out" an application onto a large number of servers in automatic or semi-automatic mode, and even have a split into experimental / testing / production / oldstable. It also makes it possible to repair damage very quickly, since package managers have all the tools needed to validate the integrity of installed files, and can re-download a package and restore modified files (a note for webmasters whose favorite php files get spoiled by all the evil WordPress backdoors). If a package requires packages that are not in the distribution, those should be packaged too. A production whose deployment depends on the uptime of pypi is a sad sight. Note that some dependencies may only matter at build time; in that case it makes sense to mirror the package index you use onto your own servers.
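The integrity validation mentioned above (what `dpkg -V` / `rpm -V` do for installed packages) boils down to comparing files against checksums recorded at install time. A self-contained sketch with made-up paths:

```shell
# "Install" a file and record its checksum, as a package manager would.
set -e
dir=$(mktemp -d)
mkdir -p "$dir/app"
echo "clean code" > "$dir/app/index.php"
( cd "$dir" && sha256sum app/index.php > manifest.sha256 )

# A backdoor modifies the installed file...
echo "eval(evil)" >> "$dir/app/index.php"

# ...and the recorded checksums reveal it.
if ( cd "$dir" && sha256sum --status -c manifest.sha256 ); then
  status=intact
else
  status=tampered
  echo "tampering detected: reinstall the package to restore the file"
fi
```

This is exactly why a modified file on a packaged production server is a reinstall away from being clean, while on an unpackaged one it is an archaeology project.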

CI can run tests at the same time. Unit tests are most often run at the package build stage (after compilation). But for functional (acceptance) tests you have to raise a test environment (another one, or even many different ones). After a successful build, installation of the package onto the stand begins, followed by a functionality check. If the company has the resources, a test is written for every strange bug so that it gets caught next time. In the minimal form, you need to check the basic happy paths, that is, "the customer can come, see the goods, put them in the basket, pay, and buy". You can also check every sad path (the client has no money, the kernel version does not match the module), but that takes far more resources. Even happy path tests alone greatly improve stability.
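A happy path check in its minimal form is just a sequence of steps that must all succeed. In the sketch below, the `service` function is a stub standing in for real curl calls against the test stand (the endpoints and responses are invented):

```shell
# Stub of the application under test; a real check would hit the stand over HTTP.
service() {
  case "$1" in
    /catalog)  echo "200 items listed" ;;
    /cart/add) echo "200 item added" ;;
    /checkout) echo "200 payment accepted" ;;
    *)         echo "404 not found" ;;
  esac
}

# The happy path: come, see the goods, put them in the basket, pay.
fail=0
for step in /catalog /cart/add /checkout; do
  out=$(service "$step")
  case "$out" in
    200*) echo "OK   $step" ;;
    *)    echo "FAIL $step ($out)"; fail=1 ;;
  esac
done
```

The CI job then gates the package on `$fail`: a non-zero result blocks the upload to the repository.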

If there are many configurations and tests, it makes sense to take it further and integrate the code review system with the tests. The best-known tool is zuul, which ties gerrit to jenkins. In this setup, a commit is offered for code review only after it has passed the tests: people's time is saved, not to mention that the lion's share of simple bugs is caught at the "fighting with gerrit" stage. An excellent example of how this works across hundreds of independent developers is the openstack project infrastructure.

If the tests are set up, code review is routine, and there is always a previous version to fall back on, then it makes sense to think about continuous deployment, that is, automatically rolling changes out to production as soon as the tests and code review have passed.

Finishing touches


All of this usually requires mail and jabber (which it would be good to link with the tracker and CI), and possibly mailing lists. It often makes sense to run your own vpn server so that people can work from anywhere without trouble with closed ports and so on.

Why "your own" when there is gmail? Well, because when letters from nagios start mysteriously failing to arrive, because Google does not like bulk messages to a group address, the fight with Google may take longer than running your own mail server.

And, of course, its own meta-infrastructure is attached to all of this:
  1. All of this needs to be configured. Properly, that means a configuration management system: chef, puppet, ansible, saltstack.
  2. All of this needs to be monitored. Monitoring: nagios, shinken, zabbix, icinga.
  3. All of this needs to be backed up. Yes, the repositories too, because reassembling 20-30 repositories by asking around "who has the latest version?" is a dubious pleasure. And the comments in merge requests are unrecoverable altogether.
  4. All of this requires domain names, and it is better for the domain or subdomain to be fully managed by the department, so that with every new A record you don't have to go and plead "well, add me another entry in DNS". Your own DNS opens up another important possibility: generating records automatically (for example, for the addresses of servers' ipmi interfaces).
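Point 4 in practice is a small script: generate the records from an inventory instead of editing the zone by hand. A sketch (hostnames and addresses are made up; a real zone would also need an SOA header and a reload of the DNS server):

```shell
# An inventory of servers and their ipmi addresses, normally kept in git
# or exported from the configuration management system.
cat > inventory.txt <<'EOF'
node01 10.0.10.1
node02 10.0.10.2
node03 10.0.10.3
EOF

# Turn each inventory line into an A record for the ipmi interface.
awk '{ printf "ipmi-%s IN A %s\n", $1, $2 }' inventory.txt > ipmi.zone
cat ipmi.zone
```

Add a server to the inventory, re-run the script, and ipmi-node04 resolves; nobody has to moan at anyone.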


This is the modest household of a good IT company. Note that these are only the working tools; there are no "user directories" here (and the question of authorization was not touched at all), no access control systems, time and attendance tracking, business planning, or other things needed not by developers / sysadmins but by stakeholders.

And how much does it cost?


I count in man-hours and multiply by salary. The numbers are very approximate (that is, pulled out of thin air):

Note that this is pure, concentrated time (of which a day does not contain that much). Besides, many of these things require involving all employees, that is, they cannot be delegated to a single dedicated person, which means the implementation time will be smeared across regular work.

Assuming roughly 30% of working time is allocated to one's own infrastructure (which is a lot), implementation from scratch will take from three months to a year of sustained enthusiasm; if there are pauses, the time grows (disproportionately to the idle time, because everything around keeps changing and much has to be redone after a pause). Again taking salaries from the same thin air, and accounting for vacations and sick leave, we get about 1-2 million rubles for the work alone, excluding the cost of hardware, electricity, and licenses (the figure includes "white" taxes and fees).

About the cost of maintenance. It depends heavily on the investment in the configuration management system, in backups, and on the thoroughness of the implementation. In a good setup it should be no more than a few hours per month, plus separate costs for adding new projects and configurations, modifying existing ones, and so on.

How many servers are needed? The answer depends heavily on the test configurations. If we assume a test configuration is a small LAMP server (1 GB of memory for everything), then with dense packing of virtual machines you can get by with 2-5 servers (~200-300 thousand rubles each) for everything, plus a separate backup server. Oh yes, and add to the list of work the upkeep of this pile of virtual machines and hypervisors.

All of this, of course, pales in comparison with the cost of a single workplace for a tower crane operator (6 million rubles) or a high-precision robotic drilling machine (for which I could not even find public prices as an example).

Do you really need it?


Is it possible to save the money and not do all this?

You can.

Moreover, nothing fatal will happen. However, some workflows will take longer, some will be tedious (emailing zip archives of changes to each other), some employees will be forced to be present in the office to get anything done (for example, to coordinate changes in the code), and the illness of a single employee can complicate life for everyone else (now, who was it who knew how to roll out changes correctly?). Either code quality will partially suffer, or the programmers' speed will drop. Some very cool things will simply be out of reach, though life is livable without them too. Some employees will be doing low-productivity work, and because of that they will not only fail to do what they are paid for, but will also be demotivated, because a monotonous, boring, error-prone process repeated at work again and again is an excellent reason to implement the things described above. Or to update one's resume, yes.

Like all capital investments in labor efficiency, all of these systems are optional. In the end, rockets were successfully launched into space, on the nth attempt, even without git, code review, and tests.

Conclusion


Companies usually take the path of introducing such systems gradually, and it would be a big mistake to begin a startup with a three-month build-out of the entire infrastructure without writing a single useful line of code.

They often stop at some stage. The stops are usually caused by staff reluctance, because the staff does not feel the pain the system is supposed to eliminate. If there really is no pain, great: you can save on the very advanced, complex things. If "there is no pain" only because a person has never tried doing things "differently", then the stop may be caused by nothing more than a lack of qualification.

Note that while ignorance of git can be considered the equivalent of illiteracy or slurred speech (that is, a programmer who cannot use git is either very "special" (1C, SAP) or not much of a programmer), "ignorance" of gerrit is quite forgivable and will require training. The higher the level of process integration, the greater the chance that a critical mass of employees simply will not want to learn that much. Unlike accountants, sysadmins and programmers learn fairly quickly, but if "quickly" does not happen, the resistance may be far greater than the outraged moaning over a change of 1C version. There is nothing worse than key technical staff outraged by the mandatory introduction of an unsuccessful technical solution.


Source: https://habr.com/ru/post/229561/

