
From versioned database migration to database change management

My thanks to the people who do not hesitate to share their thoughts and experiences, even negative ones, on the many important issues of organizing work with database systems. I came across the article "Versioned migration of the database structure: why it is better not to do this," thought about leaving a comment, but, given the publication date, decided to write my own post instead. The author evidently had his own notion of what the words in his title mean, and that imprecise notion led him to solve an entirely different problem. Articles on organizing versioned database migration appeared on Habr quite a while ago; a keyword search turns them up easily. The article "Versioned migration of the database structure: basic approaches" gives an excellent introduction to the terminology, the tasks, and the basic techniques for solving them.

Using an example, I would like to describe the unexpected problems that, uninvited, suddenly confronted our team at one of my old jobs, and what we lacked back then for resolving the situation quickly and effectively. In short, this too is a negative experience; perhaps it will prove useful to someone now or in the future. Although at our company we take a broader view of such problems, grouping them under the term "database change management", I will try to stay within the terminology of the article mentioned above.



THE EXPERIENCE OF PAST YEARS: UNSOLVED PROBLEMS


In 1997, the development team I had just joined was given three months to create a software package implementing an automated technology that was to become the basis of the entire company's business activity. It is an old story and, with your permission, I will not go into the details of the technology and business processes. What matters is that we had to process and interconnect daily data arriving in significant volumes from two independent external sources, accumulate it in a repository, answer arbitrary requests from our customers (managers within the company) with minimal delays, and forecast many indicators based on retrospective analysis of the accumulated information. The task was accomplished: a typical internal corporate system was created, and it has been working from that day to this, successfully changed and refined throughout the period.



The first signs of a problem appeared when the management of the IT department ordered a copy of the system transferred to the data center of one of the customers. The deadline, as usual, was "yesterday". Since the source of information was the same as ours, with data streams from the same supplier, the technology did not have to change significantly: the ETL remained almost the same, the list of queries was narrowed, the set of reports was slightly modified and limited. Nevertheless, the technical platform on which all this was to run was different: instead of Oracle there was MS SQL Server. The database structure itself, however, did not change, and even the data types did not require complex conversion.



Our support burden now consisted of two systems, close in design and functionality but different in implementation. The tale is soon told, but the deed is slower done: after a while we received about thirty more identical systems to support, operating in branches across the country. New versions were deployed by copying them out from the center. From the point of view of versioned migration this was one standard version; even the configuration parameters of all servers were supposed to be identical. Improvisation by local administrators was not merely discouraged but prohibited, and violations of this ban had to be monitored daily. We also needed control over the state and contents of the directories (reference tables) that underpinned the flow of information from the branch databases to the central one: their contents had to be guaranteed identical across all local databases.
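As an aside, a guarantee like the one just described, that reference tables are identical across all branch databases, can be checked by comparing short content checksums instead of shipping whole tables to the center. A minimal sketch in Python, with SQLite standing in for the real DBMS; the table and column names are illustrative:

```python
import hashlib
import sqlite3

def table_checksum(conn, table):
    """Hash every row of a reference table in a deterministic order.

    Identical contents on two servers yield identical digests. A sketch:
    production code would also pin column order and collation.
    """
    digest = hashlib.sha256()
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY 1"):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()

# Demo: two "branch" databases with the same directory contents agree.
a = sqlite3.connect(":memory:")
b = sqlite3.connect(":memory:")
for conn in (a, b):
    conn.execute("CREATE TABLE region_codes (code INTEGER, name TEXT)")
    conn.executemany("INSERT INTO region_codes VALUES (?, ?)",
                     [(1, "Center"), (2, "North")])
print(table_checksum(a, "region_codes") == table_checksum(b, "region_codes"))
```

A daily job comparing each branch's digest against the center's would have flagged a divergent directory the same day it appeared.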



Well, the last stage, as many have already guessed, was the appearance of another 4-5 separate variants of the system for the company's branches abroad. Moreover, each country had its own supplier of source data: some data elements were missing, some of the necessary information was scattered across the supplied sets, and some indicators were calculated by different rules. This meant each variant was fundamentally different in the ETL part, and that applied not only to the applications and procedures but also to the data structures of the staging database (a working database for performing ETL transformations without interfering with the main one).


Thus, having developed a system for purely internal use, we found ourselves a short time later, without any of it having been planned, in the role of owners of a large zoo of exotic versions. And I suspect we were not alone.



However general and brief this story is, some characteristic features of database migration can be extracted from it.



  1. Simultaneous support of a large number of different versions of the application is an objective reality, independent of the wishes of developers or users.
  2. Migration is required not only when transferring a database to another server or DBMS platform, but also when synchronizing a database with the current version (release, generation) of the application software.
  3. Differences between versions stem not only from differences in processing logic and database structures, but also from different DBMS platforms and from settings specific to the available hardware.
  4. What is required is not only management of the migration process, but also control over the immutability of schemas, data and settings, plus an audit (report) of which elements were changed, by whom, and to what values.
  5. Changes to the database structure during migration often entail additional processing or conversion of the stored data.
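Point 4, an audit of what was changed, by whom, and when, is often addressed with a changelog table kept inside the database itself. A minimal sketch, with SQLite standing in for a real server; the table and all names are illustrative, not the scheme we actually used:

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
# Every applied change records its version, description, author, and
# timestamp, so "what changed, by whom, and when" has an answer
# inside the database itself.
conn.execute("""
    CREATE TABLE schema_changelog (
        version     TEXT PRIMARY KEY,
        description TEXT NOT NULL,
        applied_by  TEXT NOT NULL,
        applied_at  TEXT NOT NULL
    )
""")

def log_change(version, description, user):
    conn.execute("INSERT INTO schema_changelog VALUES (?, ?, ?, ?)",
                 (version, description, user,
                  datetime.now(timezone.utc).isoformat()))

log_change("001", "add index on orders(customer_id)", "dba_ivanov")
rows = conn.execute(
    "SELECT version, applied_by FROM schema_changelog").fetchall()
print(rows)  # [('001', 'dba_ivanov')]
```

Such a table only answers the audit question for changes that go through it, which is why the ban on ad-hoc local changes still has to be enforced separately.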


It was probably easier for us, since the tasks "piled on" gradually, and the team was stable, with each employee's responsibilities clearly distributed across both lines of activity and individual tasks. Still, every time the application software changed, whether planned or spontaneous, we had to move everything to the new application version. And since the working database had to correspond to that updated version, versioned migration of schemas, data, and settings had to be performed.



Solutions suitable to varying degrees exist and are quite common. What matters is arriving at the most convenient and least labor-intensive method. I call this a negative experience because today I would choose a different technology, and I very much regret its absence back then.



It would seem, what could be simpler: keep common scripts that modify all the servers, and execute them automatically on each one!



The first problem was that in different places the application versions we had to migrate to were different! A single modification script could not exist; we had to spend a great deal of time studying the current state, then writing and testing scripts for each server separately. Testing is a separate issue, because first a test bench had to be created.



Keeping a log of the changes made on an individual server during the life of the old application version brought no relief either: if there were several such changes, the synchronization scripts had to be executed in strictly the "correct" sequence, and in some cases repeated execution of a single script was absolutely unacceptable, because the result would be irreversible.
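The discipline described here, strict ordering plus a guarantee that no script ever runs twice, is exactly what a migration-tracking table provides. A minimal sketch in Python over SQLite; the migration names and SQL are illustrative:

```python
import sqlite3

# Migrations run in list order, and each one runs exactly once,
# tracked in an applied_migrations table.
MIGRATIONS = [
    ("001_create_orders", "CREATE TABLE orders (id INTEGER PRIMARY KEY)"),
    ("002_add_amount",    "ALTER TABLE orders ADD COLUMN amount REAL"),
]

def migrate(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (name TEXT PRIMARY KEY)")
    applied = {r[0] for r in conn.execute("SELECT name FROM applied_migrations")}
    for name, sql in MIGRATIONS:   # list order defines execution order
        if name in applied:
            continue               # never re-run: some scripts are not idempotent
        conn.execute(sql)
        conn.execute("INSERT INTO applied_migrations VALUES (?)", (name,))
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)   # second run is a no-op instead of an irreversible double change
print([r[0] for r in
       conn.execute("SELECT name FROM applied_migrations ORDER BY name")])
```

With the applied set stored in the database itself, the same runner can be pointed at servers in different states and each will execute only the scripts it is missing, in order.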



Now multiply the time spent by the number of different servers!



The reverse problem: a local admin decided to "improve" the database on his own or, for example, for lack of disk space "demolished" a large index and "forgot" to restore it after moving the database files to another medium. We could detect such changes to settings and structure fairly easily, but only if we had that particular version of the database in its initial state to compare against. And while changes in the schema could be detected and corrected quickly enough, changes in the server configuration parameters or in the actual data took far more effort and much longer to discover. Then the whole process of creating correction scripts and executing them had to be carried out, and at the same time we had to investigate who made these changes and when.
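Detecting such unauthorized changes amounts to comparing a snapshot of the live schema against a stored baseline. A sketch using SQLite's catalog (`sqlite_master`); a real Oracle or MS SQL Server version would query the system catalog views instead, and all object names here are illustrative:

```python
import sqlite3

def schema_snapshot(conn):
    """Collect object definitions from the catalog, keyed by object name."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE sql IS NOT NULL ORDER BY name")
    return dict(rows.fetchall())

def diff_schemas(baseline, current):
    """Report objects dropped, added, or altered relative to the baseline."""
    dropped = sorted(set(baseline) - set(current))
    added   = sorted(set(current) - set(baseline))
    altered = sorted(n for n in set(baseline) & set(current)
                     if baseline[n] != current[n])
    return dropped, added, altered

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER)")
conn.execute("CREATE INDEX idx_orders_id ON orders(id)")
baseline = schema_snapshot(conn)          # reference state, kept by the center

conn.execute("DROP INDEX idx_orders_id")  # the local admin "forgot" the index
dropped, added, altered = diff_schemas(baseline, schema_snapshot(conn))
print(dropped)  # ['idx_orders_id']
```

Note this catches schema drift only; divergence in configuration parameters or in the data itself, which the story says took far longer to find, needs separate checks.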



If there had been tools available that automatically supported treating the database structure as source code, the way the familiar source control systems treat code in the various programming languages, the following tasks would have been much easier:







[Image: list of the tasks simplified by this approach]

This is what was missing in the story described here. The approach is not very effective without some instrumental support, and at the time we had neither the time nor the resources to develop our own full-fledged utility. The policy of the IT department's management was also an obstacle. Nevertheless, today there are ready-made products implementing this approach, each with its own set of features and functions. I plan to talk about one of the possible solutions in the next post.



There are many different ways and tools to store and manage database changes. It is important to find the most appropriate approach and to apply a tool that will increase the degree of automation of versioned database migration, improve the quality and reliability of your work, and save your employees' resources and time. In this article I have tried, with a real-life example, to show where the problems of managing database changes come from, what difficulties they entail, and what conclusions I drew from this, by and large, negative experience.

Source: https://habr.com/ru/post/204504/
