Background
There are many articles online devoted to migrating database structure, offering a variety of options for painlessly solving this problem (it is worth noting that needing this kind of migration is itself a serious problem). Curiously, eliminating the need for such a solution is suggested far less often. The author is convinced that instead of solving a complex problem, it is better to try to prevent it, and as early as possible. Below is the story of a project that demonstrates the downside of database structure migration.
Why the author can be trusted
Nobody can be trusted, but the author sincerely hopes that his words will be treated with a proper degree of skepticism, guided by common sense and logic. Nevertheless, the author will be happy if, instead of solving often contrived problems, you spend your time on something more pleasant. The ideas, assessments, and assumptions described in this article are the result of many years of software development experience. Among other things, for more than five years the author led the maintenance of existing products and the development of several new ones for which the described problem is more than relevant. After several years, the opportunity to share this experience has finally come. Below you will find real examples of serious problems, along with ways to solve and prevent them. The project in question is rather small but, nevertheless, quite illustrative:
Summary
I will try to summarize the observations, assumptions, and conclusions to facilitate the subsequent discussion:
- The modern development process is focused on short iterations, frequent builds, and continuous integration.
- Incremental changes to the database structure are quite expensive due to the inherent limitations of the DBMS.
- In the early stages of development, the limitations of the DBMS are not always obvious or interesting (shown below).
- A successful product meets most of its problems after reaching a significant size, and it is then that the cost of fixing them is enormous.
- The choice of how data is stored and processed determines the architecture of the application.
- Making significant changes to the architecture at the later stages is a very expensive pleasure.
- It is precisely at the later stages that attempts are made to accelerate development, because of the clear advantages in terms of business and profitability.
- Changes to the database structure often entail additional processing or conversion of the stored data.
- At the later stages, any change to the database structure is problematic because of the many limitations of the DBMS.
- For many projects, versioned migration of the database structure looks like an obvious solution at the early stages, but later becomes a serious limitation. A more or less universal recipe: avoid frequent changes to the database structure.
Features of modern development
The modern style of development imposes many restrictions and requirements that have to be taken into account. For example, you may be asked to produce builds ready for testing on a cadence ranging from once a week to every few hours. Keep in mind that the team responsible for testing, or another development team independently implementing some functionality, may already have the previous build of the product installed (the database has already been created and contains test data that nobody wants to lose); accordingly, respect people's time and provide a transparent upgrade between builds. With intensive development, the database structure can change with every new build.
The final release can include many changes to the database structure (often mutually exclusive), resulting in an incredibly slow upgrade on the client side, which in turn can lead to client dissatisfaction, because regardless of the real purpose of the system, forced downtime is a very critical factor. In other words, the development process may be imperfect and require extra time to solve various migration problems, but for the client everything should go as quickly as possible, without problems or external intervention.
In short:
- intensive development means frequent builds and frequent changes to the database structure
- developers do not work in isolation: remember the testers and the dependent developers
- problems can arise at any stage of the migration
- migration problems are often hard to identify right away
- a "broken" database always means time-consuming repairs
- a client who waits several hours while the database is "updated" is upset or even angry
- a failure on the customer's side means several orders of magnitude more problems and more time to solve them
Example one: a system that has lived for 14 years
Short description
The main problem being solved is managing a moderate number of network devices of several types. Real installations ranged from 1 to 60 devices. The devices receive configuration and control commands from the system, sending back acknowledgments, various statistics, and telemetry. The system also implements a user interface for editing the various device configurations. It is worth noting the complexity of the configuration (and of the user interface), the need to keep history, to support a huge number of options, device-specific data types, etc. A configuration could include from ten to several hundred different parameters. A database (mainly MySQL) was used to store all the data.
Simplified model
In essence, the task of the system is to store several hundred parameters for devices of each type, provide the user with a convenient interface for editing them, and produce the resulting configuration that is sent to the device.
Chronology: the beginning
As with any other product, the development of this system began with a single device, a few parameters, and a very simple database structure. In the first design, a separate column in the configuration table was used for each parameter. This solution, straightforward and far from optimal, did not last long: the growing client base demanded more and more new functionality, so the number of parameters per device grew, new device types were added, a history feature appeared with the ability to roll back to one of the previous versions, etc.
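A minimal sketch of that original column-per-parameter design, using SQLite for illustration (the table and column names are invented; the project actually used MySQL). The point is that every new feature forces an ALTER TABLE on a live table:

```python
import sqlite3

# Hypothetical column-per-parameter schema: every configuration parameter
# is a dedicated column, so every new feature changes the schema itself.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE device_config (
        device_id INTEGER PRIMARY KEY,
        vlan_id   INTEGER,
        mtu       INTEGER
    )
""")
conn.execute("INSERT INTO device_config VALUES (1, 100, 1500)")

# A new release adds one parameter -> ALTER TABLE on every installation.
conn.execute("ALTER TABLE device_config ADD COLUMN snmp_enabled INTEGER DEFAULT 0")

row = conn.execute(
    "SELECT vlan_id, mtu, snmp_enabled FROM device_config WHERE device_id = 1"
).fetchone()
print(row)  # (100, 1500, 0)
```

On a table with millions of rows and several indexes, each such statement is anything but free, which is exactly the problem described below.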

The growing client base demanded an ever more active style of development, the implementation of new customer requests, and a transition to a stricter release schedule. The need to accelerate development led to the creation of several almost independent teams: some focused on developing the devices, some continued developing the management system, and several teams were responsible for testing the entire "zoo" of solutions, which, suddenly, had begun to bring in significant profit.
Problems at this stage
Since users interacted with the devices through the management system, developing and improving the latter became very important. Implementing new functionality boiled down to adding, deleting, or replacing one or more parameters (columns) in the database, with corresponding revisions of the user interface and of the generated configuration. But, as it turned out, developers spent most of their time solving various problems caused by changes to the database structure. Nobody wanted to part with their test data, and nobody wanted to perform additional steps to bring the system back into working condition after each change.
The company's customers had no fewer problems: few wanted to lose the ability to control their devices for a long period, and certainly nobody wanted to be left without the management system entirely. Since database sizes ranged from a few tens of megabytes to a gigabyte, there were dissatisfied customers who spent several hours waiting for the migration to complete (a release often included dozens or even hundreds of changes), and even openly angry customers who hit some error while installing the new version and then had to wait until the support service got around to them.
Technical nuances
Modifying existing tables with ALTER TABLE is most likely not a cheap operation. The author deliberately sets aside the ideal cases that boil down to changing the table's metadata, and asks the reader to think instead about the indexes that may exist, about the need to transform data, about the physical placement of data on disk, and about the difference between the idealized view of a table and the actual data layout, with the corresponding performance problems.
By the way, you can often come across advice to avoid ALTER TABLE in favor of a copy-and-rename procedure (for example: Don't Alter Table. Do Copy and Rename).
Data conversion is not trivial either; in any case, you will have to use additional logic implemented in some programming language.
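A sketch of that copy-and-rename pattern, again on SQLite with invented names (on MySQL the final swap would typically be an atomic RENAME TABLE):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE config (device_id INTEGER PRIMARY KEY, mtu INTEGER);
    INSERT INTO config VALUES (1, 1500), (2, 9000);
""")

# 1. Create a new table with the desired structure.
conn.execute("""
    CREATE TABLE config_new (
        device_id INTEGER PRIMARY KEY,
        mtu       INTEGER,
        vlan_id   INTEGER DEFAULT 1
    )
""")
# 2. Copy (and, if needed, transform) the data.
conn.execute(
    "INSERT INTO config_new (device_id, mtu) SELECT device_id, mtu FROM config"
)
# 3. Swap the tables.
conn.executescript("""
    DROP TABLE config;
    ALTER TABLE config_new RENAME TO config;
""")

rows = conn.execute(
    "SELECT device_id, mtu, vlan_id FROM config ORDER BY device_id"
).fetchall()
print(rows)  # [(1, 1500, 1), (2, 9000, 1)]
```

The copy step is where any per-row data transformation would live, which is why some additional logic in a real programming language is usually unavoidable.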
Solution
At this stage, the bright heads of the company decided to get rid of the need to change the database structure entirely and to store the device configuration in the very fashionable (then just gaining popularity) XML format. You have to admit, it is very convenient: deserialize the XML into a tree of objects, work with them, then serialize back to XML and save it to the database. Naturally, a BLOB appears in the database, but after brief hesitation it was decided to leave that problem to the database. For data conversion, small independent programs were used (initially Perl scripts, later Java logic), which performed sequential conversion of the database, say, from version X to version Y.
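The sequential conversion idea can be sketched as a chain of converters, each taking the stored representation at version N and producing version N+1. This is a minimal illustration, not the project's actual code; the converter names and the XML shape are invented:

```python
import xml.etree.ElementTree as ET

def v1_to_v2(xml_text):
    # v2 renamed <opts> to <options> (string-level substitution)
    return xml_text.replace("<opts>", "<options>").replace("</opts>", "</options>")

def v2_to_v3(xml_text):
    # v3 added a mandatory parameter with a default (tree-level change)
    root = ET.fromstring(xml_text)
    param = ET.SubElement(root.find("options"), "param", name="mtu")
    param.text = "1500"
    return ET.tostring(root, encoding="unicode")

# Registry: from-version -> converter producing the next version.
CONVERTERS = {1: v1_to_v2, 2: v2_to_v3}

def migrate(xml_text, from_version, to_version):
    for v in range(from_version, to_version):
        xml_text = CONVERTERS[v](xml_text)
    return xml_text

doc = "<config><opts></opts></config>"
result = migrate(doc, 1, 3)
print(result)
```

Each converter is independent and only knows about two adjacent versions, which is exactly what makes the approach simple at first and expensive later, as the chapters below show.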
Effect
Since adding new functionality (new options) was reduced to changing the XML representation, changes to the database structure dropped to zero, significantly simplifying the process (at least in terms of working with the database). Development accelerated, since it became possible to work with more complex data structures without worrying about almost anything. Now history could be kept (just save a few XML documents), etc.
Chronology: formation
After this considerable rework of the configuration storage, development of the system accelerated and the number of problems, at first glance, decreased. The company released a new version which, on installation, converted the old database into the new format, once again wasting the customers' time for no apparent reason but promising to save them from all troubles in the future. The clients believed and endured. New versions followed, extending the system's functionality and fixing the problems of the previous ones. Upgrading to a newer version took from 15 to 30 minutes, which looked like a serious achievement. But not for long. The growing functionality led to explosive growth of the stored data. Far from the last role here was played by the relative cheapness of configuration changes: the developers chose the easiest implementation path without worrying about the future at all.
The result of this approach was mountains of XML and a large number of transformations required when moving to each subsequent version. At this stage, to move the database from state X.1 to state X.2 (new functionality added), one or more transformations had to be performed on the configuration representations stored as XML in the database. Only after these transformations completed successfully could the system start (the updated code could then deserialize the stored representation).
Problems at this stage
The configuration transformations were a set of consecutive transformations of the saved XML (actual XML documents were stored in the database, which at the early stages were even fully compatible with the devices). Some of these transformations interpreted the XML representation as a string and used plain substitution. Other programmers preferred to work with a DOM tree or to write "optimized" variants using SAX parsers. These transformations were independent, so a release containing many of them could require several transformations for a single configuration type, each of which independently performed deserialization into a DOM tree, modified the tree, and serialized it back, repeating resource-intensive and entirely avoidable steps several times.
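The avoidable cost can be illustrated with a parse-once pipeline: instead of each transformation deserializing and re-serializing on its own, parse the document once, run every tree-level step, and serialize once. A minimal sketch with invented element names:

```python
import xml.etree.ElementTree as ET

def rename_timeout(root):
    # hypothetical step: seconds -> milliseconds, with an element rename
    for el in root.iter("timeout"):
        el.tag = "timeout_ms"
        el.text = str(int(el.text) * 1000)

def drop_legacy(root):
    # hypothetical step: remove an obsolete element
    for parent in root.iter():
        for child in list(parent):
            if child.tag == "legacy":
                parent.remove(child)

def apply_pipeline(xml_text, transforms):
    root = ET.fromstring(xml_text)                 # deserialize once
    for t in transforms:
        t(root)                                    # every step shares one tree
    return ET.tostring(root, encoding="unicode")   # serialize once

doc = "<config><timeout>5</timeout><legacy>old</legacy></config>"
out = apply_pipeline(doc, [rename_timeout, drop_legacy])
print(out)  # <config><timeout_ms>5000</timeout_ms></config>
```

With configurations of 1-2 megabytes and hundreds of databases, saving N-1 parse/serialize cycles per configuration adds up quickly.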
For developers, this operation was quick, thanks to the extremely small size of the test data used during development. The actual configurations stored by real systems were several orders of magnitude larger (from 100 kilobytes to 1-2 megabytes, with a total database size of up to 1-2 GB). Moving to a new version of the system once again became a big problem for customers.

The complexity of the code converting the stored XML kept growing along with the number of parameters, possible options, clients, etc. Errors appeared that produced broken or simply wrong (from the point of view of the application logic) configurations. Most of these problems were hidden and showed up only in the most complex and sophisticated cases, i.e. with the biggest and most valuable customers. The situation was complicated by the fact that during development the database conversions were run manually. Then, before the release of the next public version, one of the developers gathered all the code responsible for converting the database into a single application and produced the final build. Given the huge number of options and the (as always) very tight timeframe, it was impossible to test every nuance of the system. Clients found bugs, lost databases (if they had forgotten to make a backup), lost databases later when they hit problematic branches, etc.
One could describe this global Armageddon at length; suffice it to say that after one release, the developers spent several weeks manually editing configurations in customer databases. The overall situation was becoming threatening.
Solution
This time the bright minds doubted their own brilliance for a while, but there was no going back. Yet another global redesign of the system was impossible due to the enormous amount of work. After a lengthy meeting, it was decided to:
- introduce the concept of an upgrade between builds, and during development (and testing) use the same mechanisms that clients will later use
- write the configuration conversion code carefully
- test the system more intensively using customer databases (testing costs increased significantly)
Effect
The effect was not significant; moreover, much of it was due to the customers, who finally upgraded and tested the system on live workloads.
Chronology: Mature System
The system continued to sell, becoming more and more complex. Clients put up with the problems; the support service did what it could, often escalating problems to the developers, because only they could try to fix a database by hand. Various kinds of patches appeared, and in a friendly atmosphere people exchanged working "spells" that warded off this or that problem. In a word, life was in full swing.
Problems at this stage
At this stage, the task was to solve the problem of long updates when upgrading to a new version, streamline the database update logic, and simplify the work of the other teams, which, by the way, still had to spend considerable time editing configurations and catching various bugs caused by converted configurations.
In addition, some especially active customers had managed to grow their configurations to 10 megabytes and the total database size to 5 gigabytes, after which they faced a whole new class of very interesting problems. To complete the picture, add the management's desire to get a system compatible with devices of previous versions. All along, the devices and the management system had evolved together: System.1 sent its configuration to Device.1, System.2 to Device.2, and so on. The task was to ensure that System.4 could send a correct configuration to Devices 2, 3, and 4. The differences, by the way, were often very significant.
Solution
A special mechanism was implemented for writing the configuration conversion logic; among other things, it could verify the correctness of the changes made at each stage, thereby preventing broken databases. An automated system for testing and verifying the entire product did its job, revealing even hidden and merely theoretically possible problems.
The configuration began to be stored in compressed form, solving both the size problem and the read/conversion speed problem (processor performance keeps growing, while the disk subsystem remains a terrible bottleneck; database (MySQL) performance genuinely improved as well). On top of the stable and streamlined update mechanism, an inverse transformation mechanism was implemented, which made it possible to support devices of different versions; automated tests were used here as well. Developers' lives were also greatly simplified by a universal device emulator, which not only showed what the system was sending but also checked the resulting configuration, pointing out potential errors.
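A sketch of the compressed-storage idea, with zlib standing in for whatever archiver the project actually used and an invented schema. Verbose, repetitive XML compresses dramatically, which helps both disk footprint and read time:

```python
import sqlite3
import zlib

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE config (device_id INTEGER PRIMARY KEY, body BLOB)")

# A deliberately repetitive document, like a real configuration.
xml_text = "<config>" + "<param>value</param>" * 1000 + "</config>"

# Write path: compress before storing the BLOB.
compressed = zlib.compress(xml_text.encode("utf-8"))
conn.execute("INSERT INTO config VALUES (1, ?)", (compressed,))

# Read path: decompress before deserializing.
blob = conn.execute("SELECT body FROM config WHERE device_id = 1").fetchone()[0]
restored = zlib.decompress(blob).decode("utf-8")

print(len(xml_text), len(compressed))  # compressed size is a small fraction
assert restored == xml_text
```

The trade is a little CPU on each read and write for far less I/O, which is exactly the right trade when the disk subsystem is the bottleneck.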

Garbage was removed from all the clients' databases, after which their size shrank by an order of magnitude and the installation time of new versions dropped significantly. Nevertheless, it proved impossible, for obvious reasons, to solve all of the problems so skillfully and carefully designed in at the earliest stages.
Conclusion
The author received this project in rather poor shape. Along with the project came several hundred bugs, real customers, and many requests for additional functionality, including a dreadful request for backward compatibility and the adaptation of the whole enterprise to the newly fashionable Agile. By the way, implementing continuous integration for a large, long-lived project and checking everything that can be checked after each commit is a separate and very interesting topic. We managed to cope solely thanks to an excellent team and unlimited optimism. Many bugs were fixed:
The development process was quite intensive:
Remember: given two systems, a real client will choose the more stable and convenient option, the one that does not require expensive maintenance and guarantees minimal forced downtime. Developing, in the 21st century, a system that requires a stop for maintenance is somewhat strange. Developing a system that lies down for hours or days is at least silly.
In the next article (should the opportunity arise), the author will describe work on another system, created from scratch and free of practically all the problems mentioned here. The new design significantly reduces development and testing time and makes it possible to deliver an updated build within a few hours. Naturally, a database is still used, but no additional steps are required when upgrading to a new version: everything happens transparently, stably, and quickly.
Bonus
While solving the problem of updating XML documents stored in the database, the use of XSLT transformations was considered, among others. The solution seemed attractive and promising, allowing several optimization techniques. For example, several transformations can be represented as a pipeline, and with a sufficient number of them a noticeable performance gain can be obtained. Below are the results of a small evaluation test. Tests 1-5 differ in the size of the processed data. Do not take the results as a guide to action or as ultimate truth.
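For readers unfamiliar with the technique, a minimal sketch of one such XSLT step (the element names are invented, not taken from the project): an identity transform plus a single rule that renames an element. Several stylesheets of this shape can be chained into the pipeline described above.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- identity: copy everything unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- the actual migration step: rename <opts> to <options> -->
  <xsl:template match="opts">
    <options>
      <xsl:apply-templates select="@*|node()"/>
    </options>
  </xsl:template>
</xsl:stylesheet>
```

The appeal is that each step is declarative and composable; the cost, as with the hand-written converters, is that every step still walks the whole document unless the pipeline is fused.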