The process of developing and rolling out releases in Badoo. Automatic testing. Developer Environment

In July, together with the hosts of IT-Kompot, Badoo release engineers Vladislav Chernov and Oleg Oyamäe recorded the podcast episode “The Process of Developing and Rolling Out Releases in Badoo. Automatic Testing. Developer Environment.”
Since the previous podcast aroused interest among listeners and readers, we have turned this one into an article as well.

What was discussed:
The process of developing and rolling out releases at Badoo. Tools used.

Git workflow: each task in a separate branch;
Using JIRA, TeamCity and AIDA;
Forming a release and rolling out two releases per day. Problems and solutions (rollbacks, patches, etc.).

Automatic testing. A recipe for running a large number of tests quickly.

What we use;
How we run the tests;
Code coverage;
Launch: 18,000 tests in 3.5 minutes.

The development environment in a team building a complex distributed system.
Recommendations from the guys: useful books, articles, etc.

Anton Sergeev: Hello everyone, you are listening to episode 48 of the “IT-Kompot” podcast, and with you are its hosts, Anton Kopylov.
Anton Kopylov: And Anton Sergeev, hello.

Anton Sergeev: Our guests today are two gallant guys, release engineers from Badoo: Oleg Oyamäe and Vladislav Chernov. Hello, guys!

Vladislav Chernov: Hi.

Oleg Oyamäe: Hi.

Anton Sergeev: We are doing a series of podcasts with Badoo. Today we decided to talk in more detail about an area that is, if not Badoo's pride, then certainly a very important achievement and a clear success: the company rolls out new releases successfully using fully automated tools. The company has also set up its testing process in an interesting way that lets it run a large number of tests efficiently. Everyone who listened to the previous episode already knows about the 2 releases a day; Alexey Rybak talked about them.
Well, let's get started. Vlad and Oleg, please tell us briefly about yourselves and about what you do at Badoo.

Vladislav Chernov: Hello again. Oleg and I work on configuration management and release engineering at Badoo. As for me, I have been in release engineering for almost my entire career: I started as an ordinary release engineer and did a lot by hand, rolling out simple releases, then moved into automation and automated more and more of the process. Now I automate the entire development and testing business process at Badoo.

Anton Sergeev: I see, cool. And Oleg is your colleague and helps you with this, right?

Oleg Oyamäe: Actually, my story is a bit different. I was a developer and team leader at other companies, and at Badoo I decided to try a field that was new to me: release engineering. But I do more programming than automation. Let me tell you a little about how everything was arranged at Badoo from the beginning to the present.

Anton Sergeev: Yes, let's talk about the history of your development process and how you started doing all this, because you surely did not reach the current state right away. There were probably many obstacles and problems to solve. Tell us how it all began, why it turned out the way it is now, and how you got here.

Oleg Oyamäe: For quite a long time Badoo used SVN as its version control system. SVN was in use for most of Badoo's existence, and some parts still use it even now. (Note: in the distant past, of course, there was CVS.)

Anton Sergeev: Some listeners are probably getting worked up right now: what, Subversion?! But as far as I know, SVN is more typical of enterprise-type projects. You had a lot of experience with it going back to Mamba, so it simply happened historically, right?

Oleg Oyamäe: The transition to Git began two and a half years ago: at that time there was a boom and everyone started switching to it. But at Badoo it was a planned process, part of introducing testing in the company.

Anton Kopylov: Did you not consider Mercurial as an alternative to Git?

Vladislav Chernov: As far as I know, we did not consider Mercurial as an alternative, because at that time Git had more plug-ins, repository management software and so on. Plus, the Git community had already formed, while Mercurial's was just getting started. For me, for example, a particular plus is that Git can be used with any language, while with Mercurial it is more complicated. (A note from testing manager Ilya Ageev: we tried to introduce Mercurial many years ago, but it did not take hold and we went back to SVN.)

Anton Kopylov: Yeah, well.

Oleg Oyamäe: That's how it was. Accordingly, the process of introducing quality control began two and a half years ago. A department was created that is still growing and developing today. At that time a new workflow was developed in which each task was done in a separate branch, but everything was still assembled by hand; there was no automation to speak of. Deployment was, you could say, entirely manual. There was a utility written a very long time ago through which everything was deployed, and there was no monitoring. Over time a lot was written around this utility, dashboards and other things, and it became a kind of core. That's it in brief.

Anton Kopylov: And deployment and testing: were they combined, or were they separate processes?

Oleg Oyamäe: Previously these processes were not really connected. Now it is full-fledged continuous integration: tests run after every commit, and this directly affects deployment to production and to staging. The two processes are now interconnected and influence each other.

Anton Sergeev: By the way, guys, I wanted to ask your opinion. Many projects start out with only a version control system and, say, a few hooks around it. Usually nobody sets up powerful continuous integration and deployment tools right away. What do you think the transition point is? How big should a project get before it becomes clear that it cannot go on without proper continuous integration and tests?

Oleg Oyamäe: My personal opinion is that on any project you need to think about integrating these things from the very beginning, and even more so about writing tests. Tests probably come first. Especially if the project is written from scratch: in that situation, test coverage is worth thinking about at the earliest stages of development.

Anton Sergeev: So if you were building Badoo again now, you would set up such a system right away, and the very first commit would go through continuous integration and pass all the tests?

Oleg Oyamäe: Well, I would - yes.

Vladislav Chernov: I would put it a little differently. If we are talking about the trendy word "start-up", then clearly nobody will build such a complex system, because it is not known whether the start-up will take off, tests consume considerable resources, and the time to get the product to market is always limited. So automated testing is probably not something to insist on from the beginning of a project. But automatic builds, perhaps some versioning, and automatic deployment are worth talking and thinking about from the very start, because they take little of the project team's time and few resources compared with automated tests.

Anton Sergeev: I see. Let's talk in more detail about how you do 2 releases per day, about the several stages of testing, the large number of tests and how it all works. About rollbacks, patches, hotfixes.

Vladislav Chernov: Let me first go back to the history a little: 2 releases per day is true, and about 50-60 tasks go out per day. Everyone will probably want to know why it is exactly 2 releases. It is very simple. At first there was one release per week or one release every few days, then a release every day, and then two releases per day. Why two?
At some point we moved to a flow where each task is done in a separate branch. Accordingly, when we form a release we merge these tasks into a release branch, re-check each task, and also check the code as a whole (integration testing). With 10 to 30-40 tasks at most, that is fairly easy to do. With 100 tasks in a release it is much harder. That is why we deploy twice a day; we could deploy more often, but there is not much point. Why each task is in a separate branch is probably also clear.
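
(Note: the release formation described above can be sketched roughly as follows. This is only an illustration, not Badoo's actual tooling: the `Task` fields, the branch names, and the choice of `master` as the base are assumptions made for the example. The function only builds the git commands and does not run them.)

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Task:
    key: str      # hypothetical issue key, e.g. "SRV-1"
    branch: str   # hypothetical task branch name
    ready: bool   # True once the task has passed its own checks


def release_merge_commands(release_branch: str, tasks: List[Task]) -> List[List[str]]:
    """Build git commands that create a release branch and merge ready tasks into it.

    Each task is merged with --no-ff so that it stays a single merge commit
    in the release branch, which keeps the release history easy to read.
    """
    # Assumption: the release branch starts from master (production code).
    commands = [["git", "checkout", "-b", release_branch, "master"]]
    for task in tasks:
        if not task.ready:
            continue  # tasks that are not ready wait for the next release
        commands.append(
            ["git", "merge", "--no-ff", "-m", f"Merge {task.key}", task.branch]
        )
    return commands


if __name__ == "__main__":
    demo_tasks = [
        Task("SRV-1", "SRV-1_fix_login", ready=True),
        Task("SRV-2", "SRV-2_new_feature", ready=False),
    ]
    for command in release_merge_commands("release_demo", demo_tasks):
        print(" ".join(command))
```

With two releases a day and 50-60 tasks, each such release would carry a few dozen merges, which matches the "10 to 30-40 tasks" range mentioned above.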

We have several stages of testing, about five. Each task lives in a separate branch because that lets us check it in isolation and roll it back. In effect we have the production code plus this one task, and we check it at several stages. The first stage is standard: code review, which looks only at this task. Then the task is checked in the developer environment. The developer environment is our mini server with virtual machines, databases and so on, where you can check things that cannot be checked in production. After the task is tested there, we create a mini-staging for it, called a shot, which uses the production database, and we check the task again. Then the task goes into the release and is checked in the release branch: that it merged correctly and does not break other tasks. That is the fourth check of the task. Finally, there is optional post-production testing, when we test a task at the product manager's request, or when the task is very important and testers re-test it in production; but that is optional.

Anton Sergeev: And how does code review work? Is it done by dedicated people, or do developers review each other's code at random?

Vladislav Chernov: It works differently in each department. When a task reaches review, a developer or a development team is selected. In some groups the review is done by the team leaders responsible for a component. In others, tasks are simply rotated within the group. It depends on the size of the department, on people's experience, and on tradition.

Anton Sergeev: Guys, let me ask some questions our listeners sent in. Stanislav, for example, sent us a lot of questions, and I have split them into several groups. He asks who makes the final decision to roll out a release: a soulless machine or some analogue of Sergey Didyk.

Oleg Oyamäe: No, we don't have Sergey Didyk; he is at another company (note: the company is Begun). We release twice a day at fixed times, in the morning and in the evening. Accordingly, we collect tasks into the release until that time, then the automerge stops and we start testing these tasks. Once the tasks have been tested and the release time comes, the release engineer makes the decision and we go to production.

Anton Sergeev: What do you roll out with? And how do you roll back if a rollout goes wrong? How long does a rollback take, if it comes to that?

Vladislav Chernov: We have our own deployment tool, written by our developer Yuri Nasretdinov. If you want to learn more about it, Yuri has given several talks and written an article on it. The utility rolls out to a thousand servers very quickly, literally in 2-3 minutes. This is our deployment system, and we are constantly improving it.
(Note: a rollback is the same as a rollout, only with a different version number. So we can roll out and roll back in a few minutes. Rolling back is even faster, because the previous version remains on the servers: you only need to switch a link, which takes a few ssh commands to switch it and flush the cache.)
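
(Note: a minimal sketch of the "switch a link" idea from the note above, assuming each version is unpacked into its own directory and a symbolic link named `current` points at the active one. The directory layout, the link name and the function name are assumptions for illustration; Badoo's real tool is not public. The sketch relies on POSIX rename semantics and leaves out the ssh and cache-flush steps.)

```python
import os
import tempfile


def switch_version(root: str, version: str, link_name: str = "current") -> str:
    """Point the `current` symlink at root/version and return the resolved path.

    Rollout and rollback are the same operation: only the version differs.
    """
    target = os.path.join(root, version)
    if not os.path.isdir(target):
        raise FileNotFoundError(f"version directory not found: {target}")

    link = os.path.join(root, link_name)
    tmp_link = link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)  # clean up a leftover from an interrupted switch

    # Create the new link under a temporary name, then rename it over the
    # old one. On POSIX the rename is atomic, so readers always see either
    # the old version or the new one, never a missing link.
    os.symlink(target, tmp_link)
    os.replace(tmp_link, link)
    return os.path.realpath(link)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "v41"))
        os.mkdir(os.path.join(root, "v42"))
        print("rolled out:", switch_version(root, "v42"))
        print("rolled back:", switch_version(root, "v41"))
```

Because the previous version's directory is still on disk, the rollback in this sketch copies nothing and only renames a link.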

Anton Sergeev: Is it available anywhere on GitHub? Can ordinary people try it?

Vladislav Chernov: No, it is not, and as for why, that is probably a question for Yuri Nasretdinov.

Anton Sergeev: I see, let's hope it gets published. It would be interesting to try it, of course.

Vladislav Chernov: Yes.

Anton Kopylov: I also have a question. You say there is a testing stage where the code is checked on staging with the production database. How do you transfer data from production to staging? If a problem is found, what do you do about it?

Oleg Oyamäe: What do you mean by a problem?

Anton Kopylov: Can you check on staging against, say, one million users instead of one hundred million? Or do you transfer the entire database and check the code against all of it?

Oleg Oyamäe: No, it does not use a copy; it uses the real, full production database.

Anton Kopylov: Aah.

Oleg Oyamäe: Only the code carries the new changes; the database is the production one.

Anton Kopylov: I see. Isn't it scary to connect to the production database from staging?

Oleg Oyamäe: No, because testing is done on test users, so even if something happens, it does not affect or disturb real users (plus the task has already been tested twice before that).

Anton Sergeev: By the way, our company uses a similar approach. We are trying to move away from it, but at the same time we wonder whether we really should. The main thing here is common sense: if you do nothing really dangerous, such as dropping collections, everything will probably be fine.
Stas also asks whether you roll out to part of a cluster. Are users pinned to it, or can it be any part of a cluster? As I understand it, the question is whether you can roll out to part of a cluster, test there, and then roll out to everyone.

Vladislav Chernov: I can answer it this way: we can roll out to any machine and to any cluster. We use about 10 clusters, and with the utility we can roll code out to any part of a cluster. As for rolling out to a percentage and testing there: mostly we roll out a new feature for testing and want to look at the results. We roll out to a separate country rather than to a set of machines, because that is much more convenient.

Anton Sergeev: I see. And how do you roll back? Suppose a feature got into the release branch and something went wrong during testing on staging. For example, you find a serious problem and it is better to roll the feature back for now. Tell us how you work with the version control system (Git) here: do you use git rebase or git revert? And if so, how?

Vladislav Chernov: We use git rebase, because git revert does not suit us, since we develop each task in a separate branch. If we rolled a task back with revert after the release branch had been merged into master, the developer would later have to revert the revert, so we use git rebase. We roll back right away, then make a new build and deploy it to staging.

Anton Sergeev: Have you had any problems with git rebase? As far as I know, it is rather capricious and you need to know how to use it correctly. What is your recipe?

Vladislav Chernov: Actually, it is not that capricious, and the recipe is very simple: we only merge other branches into the release branch, so the release branch has a very simple tree, and rolling back a task is very simple, since the task is a single merge commit there. We do understand that git rebase is normally a completely manual operation. But we have an algorithm and a small script that performs it automatically, well, almost automatically.
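
(Note: a hedged sketch of the two options discussed above, assuming the release branch contains one merge commit per task. The function names are hypothetical and this is not the script mentioned in the answer. The functions only build the git commands; a real rebase may stop on conflicts and need manual resolution.)

```python
from typing import List


def drop_task_commands(release_branch: str, merge_commit: str) -> List[List[str]]:
    """Commands that rebuild the release branch without one task's merge commit.

    `git rebase --onto <commit>^ <commit> <branch>` replays everything that
    came after <commit> onto its first parent, so the task disappears from
    the branch history. --rebase-merges keeps later task merges as merges.
    """
    return [[
        "git", "rebase", "--rebase-merges",
        "--onto", f"{merge_commit}^", merge_commit, release_branch,
    ]]


def revert_task_commands(merge_commit: str) -> List[List[str]]:
    """The alternative mentioned in the interview: revert the merge commit.

    `-m 1` keeps the release branch side of the merge. The revert stays in
    history, so merging the same task branch again later requires reverting
    this revert first.
    """
    return [["git", "revert", "-m", "1", merge_commit]]


if __name__ == "__main__":
    # "abc123" is a placeholder commit id used only for the demo.
    for command in drop_task_commands("release_demo", "abc123"):
        print(" ".join(command))
    for command in revert_task_commands("abc123"):
        print(" ".join(command))
```

The rebase variant leaves no trace of the task in the release branch, which is why the task branch can simply be merged again into a later release.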

Anton Sergeev: We already know that you use Git, but besides that you also use TeamCity, the great continuous integration server from JetBrains, and as I understand it, you have it all integrated with the JIRA bug tracker: task assignment, automatic status changes and so on. How did all of this appear and get integrated, and how do you work with it?

Oleg Oyamäe: Yes, as already mentioned, we use Git, as well as JIRA and TeamCity. We use TeamCity to run tests and to deploy code to staging after a successful run. The flow in JIRA is extensively developed and structured, and JIRA, Git and TeamCity are all integrated with each other. We have a very large workflow, which we regularly optimize to keep it convenient for developers, testers and product managers.

Anton Sergeev: By the way, at the last DevConf conference there was a talk about problems in architecture, and the speaker said that JIRA should never be used, because you cannot assign rights to different users at the task-view level so that, say, developers do not see extra meta-information that only the project manager needs, and the project manager does not see details that are useful only to developers. How critical is this, in your opinion, and what problems have you run into with JIRA? Were there any really serious things that you had to fix yourselves?

Vladislav Chernov: That problem between the project manager and the developer is a rather special case. We are all friends here and have no hidden information, so we have no such problem with hiding or showing fields. Whether to use JIRA as a bug tracker is everyone's own choice, but it is at least the most widespread system in the world as a bug tracker. In some ways it suits us, in others it does not. Where it does not, we automate our work with it. We do have a complicated workflow, but JIRA has only two levels of nesting, tasks and subtasks, and we get more advantages than disadvantages from this, because we cannot overcomplicate the process.

Anton Sergeev: I see. Do you use Confluence? For listeners who don't know it, Confluence is a wiki where you can store documentation and more.

Vladislav Chernov: Yes, of course we use it. I may be mistaken, but as far as I know, we even use GreenHopper. But we don't use FishEye to view code, because it is a very cumbersome tool. To review code we use GitPHP, which, again, we have customized, and it is fast enough compared with FishEye, which takes days to index our huge number of branches.

Anton Sergeev: What is your impression of working with TeamCity? - , Jenkins, ?

: , : Jenkins, TeamCity, Bamboo. TeamCity , , .

: , ? , - , . , , . - , TeamCity?

: TeamCity , , . , , support . , Jenkins , . , , - , Jenkins, , , , .
TeamCity , -, , . , . , , , , . , . , , support , .

: , TeamCity , killer-, ― . - . .

: . , , , - , -. .

: , ― , TeamCity.

: Git. git-flow . Git. , . flow , feature branch, release branch, hot-fix branch, master, developer branch, , - ― , ― ?

: , . git-flow. : , production, , , , Deploy Dashboard. - : , -, , , production - . , , . , , . JIRA: - . , flow, , review, , , , . PHP-, . , , 40 , , , . , .

: : - ? 2 , , , , ?

: . , TeamCity , TeamCity, «» , , , web ― iOS, . , , , « 365», , .

: , . , , , Redmine, , -, -. , . : « -, ?» , .
, , , , JIRA, TeamCity Git, , , , ― . , AIDA, ? , , ?

: , AIDA. , , - , Jabber, , dashboard, , TeamCity, JIRA. commit message, Git log, . JIRA GitPHP : diff, . , , .

: AIDA, , - rocket science?

: : , .

: , , AIDA TeamCity, JIRA, Git?

: , AIDA API JIRA, TeamCity, Git . , , , , API. , , . , . AIDA, , -: , , automerge . , , - , AIDA , . ― .

: , , , , , - , ?

: . , AIDA, - . .

: . , , - , , , , ? - ?

: , , AIDA .

: . . , , , ?

: , . , , unit- . , unit-. , Selenium PHP-, code coverage, . - - : , , , , . , push , , unit- selenium- ― , , , - , . , 18 000 (: 20000), .

, «», 11 , , , - 3,5 ― 18 . ― , . . , JIRA, : , .

: . Selenium, , Selenium?

: , , . , Explorer 6, : Chrome, Firefox, Explorer. . «» , , . GUI, , .

: . , : Mail.ru, , Mail.ru ( , ) Selenium. , ― , - Xen', «» , Selenium , , , , , . , - , - Explorer', , , , - , . , , , . , ? «»?

: , , - selenium- . , , , . : selenium- ― , , ― , . , , , , , . , , Google, , selenium- -.

: .

: , ― production-, . , , selenium-, . - .

: . ?

: , unit- , , . , devel. Selenium, devel, . -, devel . , devel , . , production, . selenium- , , - production. PHP , , .

: : unit- unit? ? , SQL- , ?

: unit- unit-, ― , . Unit Unit' .

: ( , ― , ). A/B- ?

: , A/B-. , Badoo A/B-, , , , , , . .

: ? , , , 18 3,5 , ?

: . That's right.

: , , , response production, - .

: , . , ―24 7 ― , , , , . - , .

: code coverage: , , . , , . , ?

: , - , , , unit-. , QA, QA selenium-, coverage , , - , , , - unit-, Selenium'. .

: «», . , - , , , - , «»?

: , , , - suites n- , . , . , , . , , , - 20 , , . 3,5 , . , 11 , suites , , TeamCity-. , , , , 7 , - , , , . 11 . , , , .

: . - , , ? , open-source , ?

: , , . , , Jenkins, , open-source. , , .

: . , , , , , , , . , , . , , , , - , , . , , , . , , , , GitHub -. , . , production , . , - sandbox, production, ?

: , - , Vagrant Puppet?

: . production - , . , devel'e , - . , . , , - , , ― , .

: - , Puppet Chef ?

: , Puppet production , , . , , ― .

: , . , , Puppet, . Why? . , MongoDB , -, , API , . - , , , . , , PHP, - , Mongo-, . ? , , «», «», Linux, ?

: DEV- . nginx -, , , - . , . devel'e. , , . , , , -, «», , , , «» Linux, , , , 50 50.

: , . , , - devel, , , , , GUI . , . , . , , , , , Vagrant'. Vagrant'; , , - Linux- , DEV-. , , devel, .

: , , , ― . , . , devel-o, - , , , , , - ― , , , . devel, , production. , - .

: , - , DEV, , , , , , . ― , , production - , , , . , .
- , , , , , , , , .

: , , , , - . . , - , - , :

Jez Humble & David Farley «Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation» ;
Paul M. Duvall, Steve Matyas & Andrew Glover «Continuous Integration: Improving Software Quality and Reducing Risk» ;
Alan Berg, «Jenkins Continuous Integration Cookbook»

. , , , -, , -, .

: , , , - , -, , , , . «» , , « », , - . , HighLoad, , . , , , .

: , , . , . , , , , . , , - . , , .

: Badoo ― - . , .

: , , .

: .

: Listen to our episodes on the website www.itkompot.ru and on the podcast site podfm.ru, and subscribe on iTunes. Good luck with your development, and until next time. Bye.


Source: https://habr.com/ru/post/190572/
