
How to write a quick scripting and database deployment system

Part One, Historical


Now historians are trying to claim
that back in some year
there was something there.
No, there was nothing!
V.S. Chernomyrdin


So, it all started back in 2006, when I joined an IT company that developed solutions in the field of telecommunications. Our team built a C# application that retrieved data from an MS SQL Server database, processed it, and wrote it back. At first our database held about 10 tables and a couple of stored procedures. Over time the number of tables and other objects grew, and we began to wonder how to manage them. We kept the scripts in the version control system.

Initially we used the DB Project from Visual Studio, but over time it became obvious that its deployment took far too long, because it compares the target database against the source scripts. In addition, our database needed some objects from a large subsystem we integrated with, and we did not want to store those in our repository. As a result, we wrote a simple application that took an XML file listing the objects we wanted to deploy. Each developer edited this file by hand, and since it was plain text, merge conflicts on commits were easy to resolve. Later we abandoned the XML file in favor of a strict script storage order that determined the deployment order (tables are deployed before views, and so on).

Everything was fine until a new problem appeared. Our technical support could edit code in the release branch, and sometimes made changes directly on the client's database, which caused problems when installing the next release. We decided to implement a subsystem that would detect such "illegal" changes on the client side. Our application scripted the client's database and sent the scripts to our server, where they were laid out in the expected structure; our deployment system then produced two databases, one from the repository and one from the client. Finally, it launched that same Visual Studio to compare them.
Scripting was done using the SMO (Server Management Objects) library.

It was easy to work with, but it had one big problem: speed. It could script about 6 objects per second. The client's database contained about 10,000 objects, so scripting took half an hour.

A few months ago, a colleague and I decided to build the fastest possible database scripting and deployment system, taking into account all the advantages and disadvantages of what we had already done. Here is what we came up with.

Part Two, Technical


Modern technology has so
eased the work of women
that men now have
a lot of free time.
Vladislav Gzheschik


Scripting

For the scripting system, we defined the following requirements:
• Support for all versions of MS SQL Server since 2008;
• Scripts must be generated in parallel (scripting any database should take seconds, not minutes);
• The storage structure of the scripts must be configurable in the settings;
• The ability to select which object types to script (for example, only stored procedures);
• Storage of scripts in the file system, in an archive, or in a cloud service.

First of all, we tried running SMO scripting in parallel, but got only about 60 objects per second, which was still no good. We decided to use metadata from the MS SQL Server system objects, from which a script could be built directly. We then had to work out how to form a script from such abstract data, and here EL expressions from Java came to our aid. After a short coding session, we had our own template engine.
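To make the idea concrete, here is a minimal Java sketch of such a ${...} substitution engine. This is only an illustration, not the authors' code: the real engine is written in C# and also supports nesting and templ:: functions; all names here are assumptions.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Minimal sketch of a ${...} substitution engine (flat placeholders only).
public class MiniTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    // Replaces every ${Name} in the template with the matching field value.
    public static String render(String template, Map<String, String> fields) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = fields.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> pk = Map.of(
            "Schema", "dbo",
            "TableName", "Orders",
            "PKName", "PK_Orders");
        String tmpl = "ALTER TABLE [${Schema}].[${TableName}] ADD CONSTRAINT [${PKName}] PRIMARY KEY";
        System.out.println(render(tmpl, pk));
        // ALTER TABLE [dbo].[Orders] ADD CONSTRAINT [PK_Orders] PRIMARY KEY
    }
}
```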
An example of a template that generates a primary key script:

${templ::if_cont_coll(${PKFields},${ALTER TABLE [${Schema}].[${TableName}] ADD ${templ::if_cont_field_val(${PKName},${ CONSTRAINT [${PKName}] },${})}${ConstraintType} ${PKType} ( ${templ::for(${PKFields}, ${[${FieldName}] ${Order}},${, })} )WITH (${PKProperties})${templ::if_cont_field_val(${PKFileGroup},${ ON [${PKFileGroup}]},${})}${templ::if_cont_coll(${PartitionPKFields},${(${templ::for(${PartitionPKFields},${[${PartitionFieldName}]},${,})})})}})} 

In its logical structure, this template mirrors the corresponding syntax description on MSDN.

When loading metadata about a database object, we create a set of fields and collections. For a particular primary key, we get fields such as Schema, TableName, and so on. The fields the key is built on are loaded into the PKFields collection.

The template engine allows you to implement arbitrary functions. For example, you can create a conditional IF...ELSE statement, a FOR loop, and so on. (In the example above, all functions begin with the templ:: prefix.)
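As an illustration, a templ::for-style function could be implemented roughly like this in Java. The signature, the collection representation, and the helper names are assumptions for the sketch, not the actual engine API.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of a templ::for function: renders an item template
// for every element of a collection and joins the results with a separator.
public class TemplFunctions {
    // Each collection element is a map of field name -> value,
    // e.g. a PKFields entry carrying FieldName and Order.
    public static String forEach(List<Map<String, String>> items,
                                 String itemTemplate, String separator) {
        return items.stream()
                .map(item -> substitute(itemTemplate, item))
                .collect(Collectors.joining(separator));
    }

    // Naive field substitution, enough for the example.
    private static String substitute(String template, Map<String, String> fields) {
        String result = template;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> pkFields = List.of(
            Map.of("FieldName", "OrderId", "Order", "ASC"),
            Map.of("FieldName", "LineNo", "Order", "ASC"));
        // Mirrors ${templ::for(${PKFields}, ${[${FieldName}] ${Order}}, ${, })}
        System.out.println(forEach(pkFields, "[${FieldName}] ${Order}", ", "));
        // [OrderId] ASC, [LineNo] ASC
    }
}
```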

Now that we had templates for each object type, it was time to solve the problem of slow scripting.
We created a special TypeMapper file that describes how each specific type is scripted. An extract from this file:

 <TypeGroup name="MSSQL2008" version="Microsoft SQL Server 2008">
   <Types>
     <Type name="Users" pattern="${RootPath}\Users\${Name}.sql" priority="0" sqltype="SQL_USER" TopLevel="true">
       <Factory assembly="ScriptEngine" typename="ScriptEngine.Templates.UserDescriptorFactory"></Factory>
       <ScriptOption assembly="ScriptEngine" typename="ScriptEngine.Templates.UserScriptOption"></ScriptOption>
       <TemplateGenerator assembly="ScriptEngine" typename="ScriptEngine.Templates.UserTemplateGenerator"></TemplateGenerator>
     </Type>

This structure lets us define different rules per MS SQL Server version, set the path mask under which a script is saved (handled by the same template engine), and name the types responsible for loading the metadata and building the script template.

From this we get the list of types we are going to script, and build the following processing pipeline:

[image: processing pipeline]

We issue N queries to MS SQL Server and use the results to create descriptor objects, each describing a specific database object, which are passed to the TemplateGenerator through a concurrent queue. The template generator runs in parallel, one worker per logical core, and sends each finished script together with its name to the FileSaver through another concurrent queue. The FileSaver writes files in parallel, limited only by the throughput of the disk subsystem.
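The shape of this pipeline can be sketched with standard concurrent queues. The following simplified Java version is illustrative only (the real system is a C# application; Descriptor, Script, and the queue wiring are stand-ins for the real metadata and SQL text):

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the scripting pipeline: a loader produces descriptors, a pool of
// template generators (one worker per logical core) turns them into scripts,
// and a saver drains the results.
public class ScriptPipeline {
    record Descriptor(String name) {}
    record Script(String name, String sql) {}

    private static final Descriptor POISON = new Descriptor(null);

    // Returns the number of scripts "saved"; stands in for the real FileSaver.
    static int runPipeline(List<String> objects) {
        BlockingQueue<Descriptor> descriptors = new LinkedBlockingQueue<>();
        BlockingQueue<Script> scripts = new LinkedBlockingQueue<>();
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService generators = Executors.newFixedThreadPool(cores);
        try {
            // Template generators: consume descriptors, produce scripts in parallel.
            for (int i = 0; i < cores; i++) {
                generators.submit(() -> {
                    try {
                        Descriptor d;
                        while ((d = descriptors.take()) != POISON) {
                            scripts.put(new Script(d.name(), "-- script for " + d.name()));
                        }
                        descriptors.put(POISON); // propagate shutdown to the next worker
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            // Loader: in the real system this is N queries against system views.
            for (String obj : objects) {
                descriptors.put(new Descriptor(obj));
            }
            descriptors.put(POISON);
            // Saver: drain exactly as many scripts as were loaded.
            int saved = 0;
            while (saved < objects.size()) {
                scripts.take();
                saved++;
            }
            return saved;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            generators.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int saved = runPipeline(List.of("dbo.Users", "dbo.Orders", "dbo.GetOrder"));
        System.out.println("saved " + saved + " scripts"); // saved 3 scripts
    }
}
```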

In the end, a database with 15,000 objects is scripted in 50 seconds on an ordinary desktop computer with a local MS SQL Server instance, whereas SMO does it in 50 minutes!

During scripting we also build a special dependency graph, which is discussed in the next section.

Deployment

In the first part of the article I mentioned that we originally deployed scripts from a flat list maintained by the developers. We decided this was inconvenient and set out to fix it. During scripting we build a dependency graph that records the order in which each script must be deployed. Scripts on the same level of the graph can therefore be executed in parallel.
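A common way to obtain such levels is a layered topological sort: scripts with no unsatisfied dependencies form level 0, scripts depending only on level 0 form level 1, and so on. A small Java sketch under that assumption (illustrative names, not the authors' code):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Splits a dependency graph into levels; scripts within one level
// have no dependencies on each other and can be deployed in parallel.
public class DependencyLevels {
    // deps maps a script to the set of scripts it depends on.
    public static List<List<String>> levels(Map<String, Set<String>> deps) {
        Map<String, Set<String>> remaining = new HashMap<>();
        deps.forEach((k, v) -> remaining.put(k, new HashSet<>(v)));
        List<List<String>> result = new ArrayList<>();
        while (!remaining.isEmpty()) {
            // Everything whose dependencies are all satisfied goes into this level.
            List<String> level = remaining.entrySet().stream()
                    .filter(e -> e.getValue().isEmpty())
                    .map(Map.Entry::getKey)
                    .sorted()
                    .toList();
            if (level.isEmpty())
                throw new IllegalStateException("dependency cycle");
            level.forEach(remaining::remove);
            remaining.values().forEach(d -> level.forEach(d::remove));
            result.add(level);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> deps = Map.of(
            "Orders.sql", Set.of(),
            "Customers.sql", Set.of(),
            "OrdersView.sql", Set.of("Orders.sql", "Customers.sql"));
        System.out.println(levels(deps));
        // [[Customers.sql, Orders.sql], [OrdersView.sql]]
    }
}
```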
You might ask: what happens if new scripts are added to the script directory after the dependency graph was built?
If a script is not found in the dependency graph, it is considered independent and is executed in the first batch. If it fails, it is placed in a "second chance" queue. After all the other scripts have been deployed, the scripts in the "second chance" queue are retried. This repeats until the new "second chance" queue comes out empty.
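The "second chance" loop can be sketched as follows. This is a simplified Java illustration: the execute predicate stands in for actually running a script against the database, and the no-progress check (give up when a whole round fails) is an assumption added to keep the sketch terminating.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.function.Predicate;

// Retries failed scripts in rounds until a round ends with an empty
// retry queue, or until a round makes no progress at all.
public class SecondChanceDeployer {
    // Returns the scripts that could not be deployed.
    public static List<String> deploy(List<String> scripts, Predicate<String> execute) {
        Queue<String> pending = new ArrayDeque<>(scripts);
        List<String> failed = new ArrayList<>();
        while (!pending.isEmpty()) {
            Queue<String> secondChance = new ArrayDeque<>();
            for (String script : pending) {
                if (!execute.test(script)) {
                    secondChance.add(script); // failed: try again next round
                }
            }
            if (secondChance.size() == pending.size()) {
                failed.addAll(secondChance); // no progress: give up on the rest
                break;
            }
            pending = secondChance;
        }
        return failed;
    }

    public static void main(String[] args) {
        // Simulated dependency: the view only deploys once the table exists.
        Set<String> done = new HashSet<>();
        Predicate<String> execute = s -> {
            if (s.equals("view.sql") && !done.contains("table.sql")) return false;
            done.add(s);
            return true;
        };
        List<String> failed = deploy(List.of("view.sql", "table.sql"), execute);
        System.out.println("failed: " + failed);
        // failed: []
    }
}
```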

Part Three, Practical


Theoretically, anything at all
can happen. But in practice,
it is usually the unpleasant things that do.
Yuri Tatarkin


So, what did we end up with? We learned how to quickly obtain scripts for any database objects, and how to execute scripts with their dependencies taken into account. Here is what we built on top of these capabilities.

Spindle Scripting Tool is a desktop application that produces database scripts in a few clicks: you select the object types to be scripted, and you can change the templates used to name the script files. Scripts can be saved to disk, to an archive, or to a cloud service. A dependency graph and a project file are created at the same time, so deployment is as simple as selecting the desired project file.

Spindle SSMS Addin is an add-in for SQL Server Management Studio. It lets you create database scripts by selecting a database in the Object Explorer window. It also offers a unique feature: right in the script editor window, select a database object, right-click, and choose the scripting command from the context menu. The script is generated in a few milliseconds and opens in a new editor window. Unlike the built-in feature, this works fine with remote servers.

And as a final treat: we realized we could compare two databases with each other in seconds.

Spindle Comparer is a super-fast database comparison tool: simply select any two databases and click the Compare button.

Source: https://habr.com/ru/post/248615/

