
How we build DevOps in a team of 125 developers

Hello.

My name is Alexander Chernikov, and I am the head of development at the Digital Corporate Bank division of Sberbank and Sbertech.

Today I will tell you about DevOps in Sberbank Business Online (SBBOL), which we built in a fairly large team (125 developers) with a heavy review load (75 pull requests a day). The streamlined CI/CD process built around pull requests (hereinafter PR) is now an integral part of our work, and our pride.


So, here is what we did:

  1. Wrote a special plugin for Bitbucket, the "Mega Plugin"
  2. Taught it to interact with Jenkins
  3. Bolted on the so-called PrCheck (build, Selenium tests, linters)

Process in picture


Now in more detail, starting with how it all began. I will use one UI repository as the example, and where other repositories have interesting features of their own, I will mention them separately.

What's up with develop, dude?


There was a time when our develop branch kept breaking because of minor errors that were overlooked in code review. These were JS runtime errors, TypeScript compilation errors, global application style errors (when the layout of the whole page fell apart), and so on. The chat was full of questions familiar to everyone: "Is develop broken? Can I update?", "Who else can't build develop? Or is it just me?", and others. The team was about 10 people then, and plans to grow to 10-15 Scrum teams of 9-10 people each loomed on the horizon. Our super administrators and experts (reviewers) were sad.

We considered solving the stability problem with a git pre-push hook, but that approach did not work, for two reasons:

  1. there can be no stability where the human factor is involved;
  2. it was cruel to make every developer wait before each push for the complete project build, the linter run, and the other checks (there were no Selenium tests yet). On top of that, it meant a lot of idle time.

So we started thinking about a barrier at the PR stage.

Operation "Jenkins impossible"


In the first implementations, the job was started by this plugin. That is, every 15 minutes it (via cron) went through the PRs that had a new test this please comment, plus any new PRs, and simply ran the job. After the build, a shell script wrote a comment to the PR (success / failure). The experts looked at the code and gave their approvals ("apruvs"; do not judge the anglicism too harshly, I could not find a good synonym).
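A minimal sketch of that polling step in TypeScript. The data shapes, the function name, and the idea of remembering the time of the previous poll are assumptions made for illustration; the article does not show how the plugin actually selected PRs.

```typescript
// Hypothetical shapes; the real plugin's data model is not shown in the article.
interface PrComment {
  text: string;
  createdAt: number; // epoch milliseconds
}

interface PolledPr {
  id: number;
  createdAt: number; // epoch milliseconds
  comments: PrComment[];
}

const TRIGGER_PHRASE = "test this please";

// Return the PRs that were created, or received a trigger comment,
// after the previous poll. The caller would start a Jenkins job for each.
function selectPrsToBuild(prs: PolledPr[], lastPollAt: number): PolledPr[] {
  return prs.filter(pr => {
    const isNew = pr.createdAt > lastPollAt;
    const hasTrigger = pr.comments.some(
      c => c.createdAt > lastPollAt && c.text.trim().toLowerCase() === TRIGGER_PHRASE
    );
    return isNew || hasTrigger;
  });
}
```

With a 15-minute cron, a trigger comment can wait almost a full interval before anything happens, which is one reason polling gets uncomfortable as the number of PRs grows.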

BUT only the chosen few, those with god mode, could merge the code. Usually it was one person per repository, who had to make sure the build was successful, find "BUILD SUCCESS", count the approvals, and only then press the Merge button. And I can tell you from experience that this is morally very hard. The approvals seem to be there and the "checks" have passed, but you are the one who merges, so you are responsible too. None of us lasted more than three weeks. Duty rotations began :) From the process point of view it was still a bottleneck, because even after all the requirements were met, the team had to find someone to press the button. As a result, for anywhere from an hour to a whole day, and sometimes even weeks, teams could not find this "chosen one", the Main Developer huddled in a corner.

It is worth mentioning that Bitbucket has standard options for this: displaying the build status, and merge checks such as a minimum number of approvals, at least one successful build, all comments resolved (checkmarks), etc. But this was not enough for us.

That is when the idea of the plugin was born. We quickly sketched the minimum requirements and started development. A single person, MaratSadretdinov, who had never written anything for Bitbucket or worked with its API, gave us a working prototype within a month (he wrote it in his spare time). The top-priority requirement was to get a list of PRs that are ready to merge. The criteria we set:

  1. the build was successful;
  2. the right experts gave their approval (later we turned these into labels, described below).

How it worked:
A developer creates a PR >> writes the comment test this please >> our plugin blocks the Merge button, while the github plugin watches the comments and runs the job on Jenkins >> Jenkins reports success / failure >> the plugin enables the Merge button once all the criteria are met >> the "god mode" user merges the code, aging a little each time.

That's it. One or two weeks went into debugging, fixing bugs, and dealing with delays between Jenkins and Bitbucket (polling all PRs every second was too heavy, because the plugin scanned every comment). And hooray! The questions about a broken develop were gone.

But then it got even more interesting: development managers and greedy developers started asking for "fat", that is, extras.

  1. Let's drop the test this please phrase and start the build automatically.
  2. What if someone updates their PR? Then the previous build has to be killed and a new one started.
  3. And in what order should it run? Jenkins already has a queue, so will the new build go to the very end of it?
  4. Let's merge automatically! No pressing buttons by hand.
  5. Let's create labels that are attached to a PR according to different rules and start out yellow. As soon as they all turn green, the PR is merged.
  6. Let the plugin message the experts in Telegram when they get a new PR, or when they have not reacted for more than two days, etc.
  7. Do not run the assembly until all the hardware is assembled.
  8. If the build passed, but a long time ago (a week or more), i.e. the PR is "old", then the success label has to be reset.
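Two of these wishes, restarting the build when a PR is updated (item 2) and resetting a success that has gone stale (item 8), come down to one decision: does this PR need a fresh build? The sketch below is only an illustration of that decision; the type names, the function, and the exact one-week threshold are assumptions, not the plugin's actual code.

```typescript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

type BuildStatus = "pending" | "success" | "failure";

// Hypothetical record of the last build known for a PR.
interface BuildResult {
  status: BuildStatus;
  commit: string;     // commit the build ran against
  finishedAt: number; // epoch milliseconds
}

// Decide whether a PR needs a fresh build.
function needsRebuild(
  last: BuildResult | undefined,
  headCommit: string,
  now: number,
  maxAgeMs: number = WEEK_MS
): boolean {
  if (last === undefined) return true;          // never built
  if (last.commit !== headCommit) return true;  // PR was updated: old result is obsolete
  if (last.status === "success" && now - last.finishedAt >= maxAgeMs) {
    return true;                                // "old" PR: success has expired
  }
  return false;                                 // still building, still fresh, or failed on this commit
}
```

A failed build on an unchanged commit is deliberately not rebuilt here; in this sketch that case is left to a manual restart.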

They wanted a lot of things; let me tell you what we ended up with.

Labels


Let's start with the main label: expert.
There is a group of experts, first of all the most experienced developers on the project, who know the release dates, the dependencies and impact of the code, the common approaches and practices and, of course, how to write code better.

team
The label of your own team. It requires a minimum number of approvals from your team. The approver does not have to be a developer, or a JS developer if the PR is in the UI. The main goal is for the team to confirm that development is moving in the direction they planned. Sometimes these are more experienced developers who leave comments for their colleague before the experts do.

arch
We came up with this label for changes to the core, common components, and utilities. Each such PR usually affected a great many forms and pages and could seriously break functionality. The label was assigned by an ordinary regular expression over the project structure (folders src/Core, src/Common, etc.).
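The path-based rule for arch can be sketched as below. Only the folders src/Core and src/Common come from the text above; the pattern list, the function name, and the assumption that the plugin receives a flat list of changed paths are illustrative.

```typescript
// Hypothetical patterns: the article names only src/Core and src/Common as examples.
const ARCH_PATTERNS: RegExp[] = [/^src\/Core\//, /^src\/Common\//];

// A PR gets the arch label if any changed file lives under a "core" folder.
function needsArchLabel(changedPaths: string[]): boolean {
  return changedPaths.some(path => ARCH_PATTERNS.some(re => re.test(path)));
}
```

The patterns are anchored at the start of the path, so a file such as src/Pages/Core/Foo.ts would not trigger the label in this sketch.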

prCheck
The most important technical label, the guarantor of reliability. In the UI repository it covers tslint, lesslint, jsonlint, typescript, webpack, and for a while even e2e tests in Groovy. On the back end: compilation and API integration tests. Everything runs in Docker.

oracle, security, admin, devops
Highly specialized experts in their fields.

sonar
This is Sonar, on the back end. It scans the code and raises issues according to the rules adopted on the project.

predvnedrezh
A special label for the pre-release period ("predvnedrezh"), when the rollout is just around the corner (one or two weeks away) and nothing may be merged. And we do not merge anything. Almost. Well, you understand ;)

not_task, not_story, not_bug, not_test
Greetings from Jira. A PR is always linked to Jira issues, because a commit cannot be made without an issue number (I think many people set up such git hooks on the server). The plugin simply looks at the issue type in Jira. So a label like this says that the code changes something other than what you would expect.

css
It says that the PR contains style files, so a guru of styles, layout, and design is needed.

Your label could be here.

We also implemented the ability to add labels manually (for cases that rules cannot express); the PR author or an expert can add more experts (even more experts!).

Labels can also be made optional; then they do not block the Merge button.
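To make the label mechanics concrete, here is a small sketch of the merge gate: every label starts yellow, and the PR can be merged once all non-optional labels are green. The Label shape and the function names are assumptions for illustration, not the plugin's API.

```typescript
type LabelColor = "yellow" | "green";

// Hypothetical label state attached to a PR.
interface Label {
  name: string;
  color: LabelColor;
  optional: boolean; // optional labels never block the Merge button
}

// Names of the labels that still block the merge.
function blockingLabels(labels: Label[]): string[] {
  return labels
    .filter(label => !label.optional && label.color !== "green")
    .map(label => label.name);
}

// The PR is mergeable when nothing blocks it.
function canMerge(labels: Label[]): boolean {
  return blockingLabels(labels).length === 0;
}
```

Returning the list of blocking labels, not just a boolean, makes it easy to show the author exactly what a PR is still waiting for.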

Default Reviewers


Another interesting skill of the plugin is assigning experts automatically. Before the plugin, each developer manually added the experts, teammates, and others whose interests were affected, according to the corresponding label. At first the assignment was just Russian roulette with a small counter inside. If your repository requires at least two expert approvals, then three experts were picked out of the whole pool.

You may ask, why bother choosing two or three experts per PR? Why not assign the entire list to every PR, and whoever is free will take a look? But, as the saying goes, "shared means nobody's". Bitbucket has a bell with notifications, and with 50 PRs a day each expert (say there are 10 of them) would have to keep "driving" that counter down to zero. That was almost impossible (considering that experts also write code), and again the question of demoralizing the team came up. The plugin saved us all.

Then we added the ability to switch an expert off (vacation, illness) and introduced personal work schedules. Some teams and experts sit in other cities. While the plugin did not take this into account, a developer could create a PR at 5 a.m. MSK and get two Muscovites assigned who would not come in until 10, even though there were experts who could have looked at the PR right away.
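A sketch of how such an assignment could work. All names and fields here are hypothetical, and two simplifications are assumed: the "roulette with a counter" is replaced by a deterministic least-loaded-first order so the result is predictable, and work schedules are expressed as hours in UTC.

```typescript
// Hypothetical expert record.
interface Expert {
  name: string;
  enabled: boolean;      // false during vacation or illness
  workStartUtc: number;  // working hours in UTC, start inclusive
  workEndUtc: number;    // end exclusive
  assignedCount: number; // the "counter": PRs already assigned to this expert
}

function isWorking(expert: Expert, hourUtc: number): boolean {
  if (expert.workStartUtc <= expert.workEndUtc) {
    return hourUtc >= expert.workStartUtc && hourUtc < expert.workEndUtc;
  }
  // The working day crosses midnight in UTC.
  return hourUtc >= expert.workStartUtc || hourUtc < expert.workEndUtc;
}

// Pick requiredApprovals + 1 experts: those currently at work first,
// least loaded first; experts who are switched off are never picked.
function pickReviewers(experts: Expert[], requiredApprovals: number, hourUtc: number): Expert[] {
  const byLoad = (a: Expert, b: Expert): number =>
    a.assignedCount - b.assignedCount || a.name.localeCompare(b.name);
  const enabled = experts.filter(e => e.enabled);
  const atWork = enabled.filter(e => isWorking(e, hourUtc)).sort(byLoad);
  const away = enabled.filter(e => !isWorking(e, hourUtc)).sort(byLoad);
  return atWork.concat(away).slice(0, requiredApprovals + 1);
}
```

Experts who are off the clock are still used as a fallback, so a PR created at night gets its full set of reviewers even if only one expert is awake.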

Approval priorities, or how to deal with nepotism


In one of the back-end repositories, the team wanted a nicety such as approval priorities (or weights). What is that?

You have an expert (say, Mario) who is on a team (say, Nintendo). A colleague opens a PR, Mario approves it and turns two labels green at once: nintendo_team and expert. It seems fine, he is an expert after all and responsible for this code, but suspicions remain. In our repository we decided not to bother, but in the other one they wanted this. They gave the team label a higher priority, so experts from the author's team cannot turn the expert label green. Funnily enough, the problem of nepotism between teams (say, the two teams Nintendo and Dendy) has not been solved :) Or maybe there is no such problem...
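One way to read the priority rule: each approval is credited only to the highest-priority label the approver qualifies for. The sketch below illustrates that reading; the rule shape, the numeric priorities, and the function name are assumptions, and the real plugin may resolve priorities differently.

```typescript
// Hypothetical label rule: who may turn it green, and how it ranks.
interface LabelRule {
  name: string;
  priority: number;  // a higher number wins
  members: string[]; // users whose approval counts for this label
}

interface Credit {
  label: string;
  approvers: string[];
}

// Credit every approval to exactly one label: the highest-priority
// label among those the approver belongs to.
function creditApprovals(rules: LabelRule[], approvers: string[]): Credit[] {
  const credits: Credit[] = rules.map(rule => ({ label: rule.name, approvers: [] }));
  for (let u = 0; u < approvers.length; u++) {
    const user = approvers[u];
    let bestIndex = -1;
    for (let i = 0; i < rules.length; i++) {
      const isMember = rules[i].members.indexOf(user) !== -1;
      if (isMember && (bestIndex === -1 || rules[i].priority > rules[bestIndex].priority)) {
        bestIndex = i;
      }
    }
    if (bestIndex !== -1) credits[bestIndex].approvers.push(user);
  }
  return credits;
}
```

With nintendo_team ranked above expert, Mario's approval counts for his team only, and the expert label has to be turned green by an expert from outside the team.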

The club of dependent pull requests


We have, as I am sure others do, the need to merge dependent PRs into the UI and the back end together, for example when the API is edited on both sides at once, and to check the entire application against the new code. For this we extended the plugin with "Dependent PRs and their simultaneous merge". In your PR you put a link to the PR you need in the other repository, and the build is assembled from your branch and the branch of the dependent PR at the same time. This helps a lot when you change the API model on the client and the server simultaneously, for example:

 // was:    Dog: {age: number, id: string}
 // became: Dog: {age: string, guid: string}

Or when you edit a test in one repository that looks for a button in the browser by its text

 @FindBy("//button[text()=""]") class MyButton 

and in the UI code it is as before:

 <button></button> 

then again you have to edit two repositories.
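How the plugin finds the dependent PR is not described here. As an illustration only, the sketch below assumes the link is a regular Bitbucket Server pull request URL pasted into the PR description, and extracts the project, repository, and PR number from it. The host in the usage note and all names are hypothetical.

```typescript
// Reference to a PR in another repository.
interface PrRef {
  project: string;
  repo: string;
  id: number;
}

// Assumed link shape: .../projects/<KEY>/repos/<slug>/pull-requests/<number>
const PR_LINK = /\/projects\/([^\/\s]+)\/repos\/([^\/\s]+)\/pull-requests\/(\d+)/;

// Find the first dependent-PR link in a PR description, if there is one.
function findDependentPr(description: string): PrRef | undefined {
  const match = PR_LINK.exec(description);
  if (match === null) return undefined;
  return { project: match[1], repo: match[2], id: parseInt(match[3], 10) };
}
```

For a description containing https://bitbucket.example.com/projects/SBBOL/repos/back/pull-requests/42, this yields project SBBOL, repo back, and id 42; the build job could then check out the source branch of that PR next to the branch of the current one.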

Various command buttons


There are Restart buttons that rerun the checks if something suddenly went wrong.
There is a Recalculate (Recycle) button for when you need to drop all the labels and compute them again.
There is a "Too long review" button for those who like to complain. A letter is sent "to the proper place". Maybe soon we will add a "Change experts" button, a "Donate 50 rubles" button, or something else important and interesting.

That is probably everything I particularly wanted to share. Oh, I almost forgot. There is also an admin plugin.

Thank you for your attention, write questions. I hope our interesting experience will be useful to you in your work.

Source: https://habr.com/ru/post/354334/
