Our team is responsible for the operation and development of a large corporate product.
In early 2017, having recovered from a major rollout and re-read our "lessons learned", we firmly decided to revise how we develop and deliver our application. We were concerned about the slow pace and low quality of delivery, which did not allow us to provide the level of service our customers expect from us.
It was time to move from words to deeds and change our processes.
This article briefly describes where we started, what we did, where things stand now, what difficulties we encountered, what we had to leave out of scope, and what we plan to do next.
The application is a classic example of a monolithic enterprise application, architecturally a product of the 2000s:
Delivery is carried out in monthly releases (I described earlier here how this is organized for us).
Lack of control
Complexity and mistakes
Restrictions
At the beginning of the project, we set clear goals to solve the problems outlined above.
In addition, building on the solutions obtained while achieving the first two goals, we expected to:
The first step was to analyze the contractor's existing development process. This helped us plan the changes so as to avoid interrupting ongoing work where possible.
Unfortunately, getting acquainted with that process showed that, by the standards of today's IT industry, there was no process to speak of.
Dependence on Microsoft platforms and corporate standards predetermined the choice of development platform: Team Foundation Server.
However, by the time we actually started the project (April 2017), a new version of Visual Studio Team Services had just been released. The product looked very interesting: it was positioned as a strategic direction for Microsoft and offered git repositories, build, and deployment for both on-prem and cloud targets.
The corporate on-prem TFS lagged behind VSTS in version and functionality, and migration to the newer version was still only being discussed. We did not want to wait, so we decided to go straight to VSTS: it reduced our overhead for supporting the platform and gave us full control over how and what we were doing.
At the time of the change, the development team had experience with TFVC, and the application code was stored in such a repository. On the other hand, Git had long since become the de facto standard in the IT community, and both the customer and third-party consultants recommended switching to it.
We wanted the development team to be involved in the decision on the new version control system and to make an informed choice.
We created two projects in VSTS with different repository types, TFVC and Git, and identified a set of scenarios to test and evaluate the usability of each system.
Among the evaluated scenarios were:
As expected, Git was chosen, and so far no one has regretted it.
We adopted GitFlow as our branching process. It gave us enough control over changes and allowed us to deliver releases the way we were accustomed to.
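For illustration, here is a minimal sketch of the monthly release cycle under GitFlow; the branch and tag names are hypothetical, not the project's actual ones:

```powershell
# Hypothetical GitFlow release cycle (branch and tag names are illustrative)
git checkout develop
git checkout -b release/2017.06        # stabilization branch for the monthly release
# ...fixes found during stabilization are committed to release/2017.06...
git checkout master
git merge --no-ff release/2017.06      # what ships to production lives in master
git tag -a v2017.06 -m "June 2017 release"
git checkout develop
git merge --no-ff release/2017.06      # stabilization fixes flow back into develop
git branch -d release/2017.06
```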
The application consisted of a large number of assemblies and hundreds of solutions. As the audit showed, all of this was built separately and "by hand".
At the first stage, we decided not to redo everything "from scratch" (so as not to stop the existing delivery), but to "wrap" the build into a set of msbuild scripts: one script per component.
This quickly gave us scripts that produced all the necessary intermediate artifacts and, in the end, the finished product.
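To give a rough idea of what such a "wrapper" looks like, below is a minimal sketch of a driver that builds every per-component msbuild script in turn; the folder layout and the msbuild path are assumptions for illustration:

```powershell
# Build each per-component msbuild script and stop on the first failure
$msbuild = "${env:ProgramFiles(x86)}\MSBuild\14.0\Bin\MSBuild.exe"   # assumed toolset location
$components = Get-ChildItem -Path ".\build" -Filter "*.proj"         # assumed: one script per component

foreach ($proj in $components) {
    & $msbuild $proj.FullName "/t:Build" "/p:Configuration=Release" "/m"
    if ($LASTEXITCODE -ne 0) {
        throw "Build of component $($proj.Name) failed"
    }
}
```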
The database project is a separate story. Unfortunately, the system contains several CLR components that were not well structured, and their dependencies do not allow a straightforward database deployment. At the moment this is handled by a pre-deployment script.
In addition, because of the uneven system landscape (different sites ran SQL Server 2008 and 2014), the database project had to be built for both .NET 2.0 and .NET 4.0.
Once all the scripts were ready and tested, they were used in the VSTS build definitions.
Right before a build starts, the versions of all products are updated to a common standard number that includes the end-to-end build number. The same number is written into the post-deployment script. As a result, all components, the database and all client applications, come out consistent and identically numbered.
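A minimal sketch of such a versioning step, assuming the VSTS build number variable and a placeholder in the post-deployment script (both names are illustrative):

```powershell
# Stamp the build number into every AssemblyInfo.cs and into the post-deployment script
$buildNumber = $env:BUILD_BUILDNUMBER            # provided by VSTS, e.g. "3.2.0.1234"
if (-not $buildNumber) { $buildNumber = "0.0.0.0" }

Get-ChildItem -Recurse -Filter "AssemblyInfo.cs" | ForEach-Object {
    (Get-Content $_.FullName) `
        -replace 'AssemblyVersion\("[\d\.]+"\)',     "AssemblyVersion(""$buildNumber"")" `
        -replace 'AssemblyFileVersion\("[\d\.]+"\)', "AssemblyFileVersion(""$buildNumber"")" |
        Set-Content $_.FullName
}

# Assumed: the post-deployment script contains a __BUILD_NUMBER__ placeholder
(Get-Content ".\Database\Scripts\PostDeployment.sql") `
    -replace '__BUILD_NUMBER__', $buildNumber |
    Set-Content ".\Database\Scripts\PostDeployment.sql"
```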
Once the first version of the build process was complete, we moved on to preparing the deployment scripts.
As expected, the database caused the most trouble.
Deploying on top of a copy of a real database revealed many conflicts between the build and the state of the real systems:
It is strange to talk about this, let alone write it here, but the most serious change for the developers was the introduction of the principle "if it is not in git, it does not exist". Previously, code was committed "for reporting to the customer"; now nothing can be delivered without committing it.
The hardest part was the database code. After we moved to deploying the database from the repository, building and deploying it with sqlpackage, the "delta" approach gave way to the "desired state" approach. Update packages became a thing of the past; everything had to be deployed automatically.
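A minimal sketch of such a "desired state" deployment with sqlpackage; the dacpac name, target server, and the tool's location are assumptions:

```powershell
# Publish the dacpac built from the repository: sqlpackage computes the diff itself
$sqlpackage = "C:\Program Files\Microsoft SQL Server\140\DAC\bin\SqlPackage.exe"   # assumed location

& $sqlpackage /Action:Publish `
    /SourceFile:".\artifacts\ProductDb.dacpac" `
    /TargetServerName:"TEST-SQL01" `
    /TargetDatabaseName:"ProductDb" `
    /p:BlockOnPossibleDataLoss=False `
    /p:DropObjectsNotInSource=True

if ($LASTEXITCODE -ne 0) { throw "Database publish failed" }
```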
But until the transition to the new deployment process was complete, changes still had to be delivered, and they had to be delivered the old way: as "delta updates".
So we faced the task of keeping the state of the systems, updated by delta packages, fully and constantly consistent with the contents of the repository.
To do this, we organized the following process:
Thanks to this automatic control, we were able to bring the productive database code in git up to date relatively quickly and keep it that way without extra effort from the project team. At the same time, the developers got used to committing their code to the repository correctly and promptly.
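One way to implement such an automatic check is a drift report with sqlpackage; this is only a sketch under assumed names, not necessarily the exact mechanism we used:

```powershell
# Compare the dacpac built from git with a live database and report any divergence
$sqlpackage = "C:\Program Files\Microsoft SQL Server\140\DAC\bin\SqlPackage.exe"   # assumed location

& $sqlpackage /Action:DeployReport `
    /SourceFile:".\artifacts\ProductDb.dacpac" `
    /TargetServerName:"PROD-SQL01" `
    /TargetDatabaseName:"ProductDb" `
    /OutputPath:".\drift-report.xml"

# Any <Operation> entries mean the database was changed outside the repository
if (Select-String -Path ".\drift-report.xml" -Pattern "<Operation " -Quiet) {
    Write-Warning "Database state differs from the repository, see drift-report.xml"
}
```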
Once the previous stage was complete, we moved directly to deploying the application to a test environment. We completely stopped applying delta packages to the test systems and switched to automatic deployment via VSTS.
From that moment, the whole team began to see the first fruits of the earlier effort: deployment happened without any extra work. The custom code was automatically built, deployed, and tested.
Unfortunately, as we realized later, the "repository alignment" we had carried out meant that we now had a stable, supported "develop" version, but no "production" version at all. So we could not go beyond the test environment: there was nothing to bring to QAS and PRD.
The database-side application code could at least be compared with the productive one to understand the differences. For the client applications there was nothing to compare against: only the actual productive version existed, as a set of executable files, and it was impossible to say reliably what they had been built from.
After changing the build approach, the product had to go through a large regression test. We had to make sure the application still worked and nothing had been lost.
Testing was easier for the functionality located on the database side: fortunately, we had a substantial set of autotests covering the critical areas.
But there were no tests for the C# code, so everything was checked by hand. It was a significant amount of work, and the checks took a while.
Despite the testing, deploying to production for the first time was scary.
We were lucky: the next rollout of the system at a new site had just been scheduled, and we decided to use this chance for a pilot deployment.
Users did not see the system yet, so any build errors were easy to fix: real productive work had not started.
We deployed the system, and for several weeks it ran in pre-production mode (low load and a limited usage pattern, not yet full production use). During this time testing revealed several defects; they were fixed as they were found, and the new version was immediately rolled out for verification.
After the official launch and a week of post-launch support, we announced that this was the first instance built and delivered "the new way".
This build became the first stable version in the master branch and was decorated with the celebratory tag "first_deployment" (we did not order badges with the commit hash, though).
As James Bond used to say, "the second time is much easier." After the success of the pilot deployment, we quickly brought on the remaining system instances of the same type.
But the system has several usage types: some functionality is used in one type and not in others. Accordingly, functionality verified on an installation of the first type did not necessarily guarantee success in the other cases.
To test the functionality of the remaining usage types, we turned to active projects that were still in development. The idea was similar to the first deployment: we started using automatic builds, "slipping" them to users together with the project functionality. Thus, users working with the "project" version of the product were at the same time testing the old functionality as well.
Scaling itself revealed unexpected technical problems:
Non-uniform system landscape
Beyond deploying the application itself, we first had to make sure everything was the same everywhere: .NET versions, PowerShell versions, and modules. That alone took a fair amount of time.
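A minimal sketch of the kind of pre-flight check we mean, with the required versions and module names picked purely for illustration:

```powershell
# Fail early if a target server does not meet the assumed prerequisites
$requiredPs = [Version]"5.0"
if ($PSVersionTable.PSVersion -lt $requiredPs) {
    throw "PowerShell $requiredPs or newer is required, found $($PSVersionTable.PSVersion)"
}

# .NET Framework 4.x release key: 379893 corresponds to 4.5.2 (value chosen for illustration)
$release = (Get-ItemProperty 'HKLM:\SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full').Release
if ($release -lt 379893) {
    throw ".NET Framework 4.5.2 or newer is required"
}

# Make sure the deployment modules are present
foreach ($module in @("SqlServer")) {      # module list is an assumption
    if (-not (Get-Module -ListAvailable -Name $module)) {
        throw "Required PowerShell module '$module' is not installed"
    }
}
```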
Network connection
At some sites the network connection simply could not transfer all the build components: there were timeouts and data got corrupted in transit. We checked and tried many things, without much success.
We settled on the following solution: the build script was extended so that all results are packed into one large archive, which is then cut into small fragments (2 MB each). The deployment script was reworked to avoid concurrent artifact downloads, fetch all the 2 MB fragments, and reassemble from them something that can actually be deployed.
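A minimal sketch of this split-and-reassemble trick, with the 2 MB fragment size from above and assumed file names:

```powershell
# Build side: cut the release archive into 2 MB fragments
$chunkSize = 2MB
$source    = ".\artifacts\release.zip"
$buffer    = New-Object byte[] $chunkSize
$reader    = [System.IO.File]::OpenRead($source)
$index     = 0
while (($read = $reader.Read($buffer, 0, $buffer.Length)) -gt 0) {
    $writer = [System.IO.File]::Create("{0}.part{1:D4}" -f $source, $index)
    $writer.Write($buffer, 0, $read)
    $writer.Close()
    $index++
}
$reader.Close()

# Target side: after downloading the fragments one by one, glue them back in name order
$target = [System.IO.File]::Create(".\release.zip")
Get-ChildItem ".\release.zip.part*" | Sort-Object Name | ForEach-Object {
    $bytes = [System.IO.File]::ReadAllBytes($_.FullName)
    $target.Write($bytes, 0, $bytes.Length)
}
$target.Close()
```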
Conflict with antivirus
Another odd problem we ran into was a conflict between the antivirus software and one of the deployment steps: when "suspicious" files such as .js or .dll are extracted from the artifact archives, the antivirus starts inspecting them closely. Strangest of all, it grabs a file before unpacking has even finished, and the unpacking fails with the message "the file is being used by another process". For now we work around this by excluding the local artifacts folder from scanning, which is not ideal, but we have not come up with anything better.
After the build and deployment processes stabilized, we turned to "making shoes for the shoemaker": improving our internal processes.
| No. | Stage | Duration |
|---|---|---|
| 1 | From the start of the project to full control over the code, the build process, and delivery to the test environment | 6 months |
| 2 | From the first deployment to the test environment to the first pilot release in production | 3 months |
| 3 | From the pilot production deployment to the first release on all instances | 5 months |
Total duration: 14 months.
The duration, especially at the final stage, was largely determined by coordination and by the agreed system maintenance calendar.
The total effort of the customer's and contractor's employees for all work related to the change is approximately 250 person-days.
Source: https://habr.com/ru/post/427111/