I want to talk about how we wanted to make our own backup of databases for Microsoft SQL Server, and what came of it.
Problem
We in the company use Microsoft SQL Server, and it is expected that we need to make database backups for it, which we successfully do. This is done as follows:
- The first script makes backups of necessary databases.
- The second script goes through folders with backups and archives them in 7zip
- The third script takes these backups to external storage.
It seems everything is logical in my opinion, but there is some oddity in all this. Since some time on the server there are files that actively take up space, and then these files are deleted as unnecessary. I wanted to optimize all this. In addition, we have a product in the company, one of whose tasks is to synchronize the database, and a similar code that optimizes the task would also be very useful.
')
Solution options
SQL Compressed Backup
A good idea, but
not every version of SQL supports it, besides, the size of this backup is still not the smallest. And given the fact that in most cases, backups never come in handy, I would like a stronger compression. But a great option in terms of convenience.
Backup to Azure
SQL Server since 2012 supports
backup in Azure . There was an idea to specify your own service as a URL and do everything in it. But it is very difficult and dirty technology came out. And the SQL version should be quite fresh.
Backup to pipe
A great idea, but since 2005 SQL is no longer supported.
Backup via VDI
The secret technology from Microsoft, which replaced the pipe. Why secret? And try to find documentation on it. This is another quest. There is
documentation on the Microsoft site , which seems understandable, but nothing is clear. Because it seems about VDI, and it seems like about VSS. There are its reprints and plush examples of use on various sites.
In fact, the most understandable documentation is in the chm-file at
this link.After all this documentation, there was a feeling of some kind of service technology, but since nothing else was found, settled on it. In addition, it has been working on all SQL since the 2005 version (even since 2000, but with some limitations), on any edition.
We use VDI
There is nothing to do, they took the documentation, took Anya, gave out the documentation, Google and sent to study this technology.
Anya is studying the documentationIf I came up with this technology, I would do this:
- Registered virtual device in SQL
- Would make a backup command to this device with parameters
- Would write an instruction to the user how to use
- ?????
- PROFIT!
That would be logical, but Microsoft is not looking for easy ways. They came up with the following scheme
- We register a one-time device in SQL with a random name (usually everyone uses guides)
- We are waiting for data from SQL
- At the same time, we make backup to this device in another thread (we came up with the name ourselves, so only we can call)
- Get data from SQL in chunks
- All received, close the device and finish the backup command of the base
Of course, everything is written on the "wonderful" COM, so that life with honey does not seem to be seasoned with the following restrictions:
- SQL and application must be on the same server (thank God, different SQL instances are supported)
- SQL user must be sysadmin (modestly)
- The user running the application must be an administrator or a user with magic rights Create Global Objects
On the last point in more detail: in the documentation there is such a masterpiece phrase:
You can modify the client process. For this process, it must be serialized with invocation of CreateEx. CreateEx will not be serialized with external processing. Access to SQL Server service.A typical phrase of the form:
feel like a loser . Those. All words are clear, but there is no point. In fact, as I wrote above, you need the
Create Global Objects right, which by default is available to administrators, Network Service and Local Service. Those. In principle, everything can be started from the service, and not by the administrator (the user who makes backups, usually not the administrator), is impossible. But you can issue the right handles, through group policy (we just made a command in the utility that issues the specified user rights, that is, once issued by the admin, then you can work). Unfortunately, they could not deal with UAC. If you forget to run the program explicitly under the administrator, then the right not to get any side. It was possible to explicitly indicate in the manifesto that we need admin rights, but then, in principle, it would be impossible to use the program not by the administrator. And I want to.
With this sorted out, we go further. I will not write pieces of code and describe in detail what to do, because it is not interesting for anyone, but for whom it is interesting, they will figure it out. :) In general, there is a usual GetCommand view loop, CompleteCommand, in the middle of which is the processing of results. I'll just tell you some of the nuances that came out when using this COM in .NET
Lyrical digressionI completely forgot to tell what kind of utility we wanted to do (and did in the end). We wanted to backup the database immediately to the archive. Normal, zip. Well, recovery from it, of course. Those. no special file formats. You can unpack and restore the backup without this utility, or pack an existing backup and then restore it with the utility. Of course, the first iteration encountered a problem: not all libraries for compression normally digest files larger than 2GB, as a result they stopped at 7zip and
wrapped SevenZipLib for it. As a result, we were able to archive in 7z, zip, bz2, xz, gz. True, 7zip in this mode also works through COM ... which gave us additional joy.
Com great and terrible
Yes, we decided to write everything on .NET. Actually, the marshalling in COM there is quite tolerable (half of the system libraries are essentially wrappers over COM), and it seemed that there would be no problems. But they appeared literally at every turn.
This part will probably be obvious, for those who have already encountered this, but those who have not, may be interesting.
When declaring objects, method names do not matter
Only the order is important (this is in contrast to the WinAPI methods with the DllImport attribute). Those. if you accidentally confuse the two methods in places (especially similar ones like Create and CreateEx), you can catch some wonderful errors when everything seems to be working, but somehow not quite as it should.
Watch this.
.NET for you will throw exceptions
In COM methods usually return a HRESULT, i.e. error code. Well, or the 0th code, if all is well. .NET takes advantage of this and, in the case of a non-zero code, throws a COMException, in which you can already look at the code and process it. It seems comfortable, but not quite. In addition, on one of the servers with the old .NET a great bug was found. When this ekpepshen (for example, when we turn to a closed device, which in general according to the documentation is a normal situation), an ExecutionEngineException came up with the application's fatal crash. As a solution, Google offered to install updates, but in general, this is not a solution for the utility, so we avoided the problem as follows: Added the PreserveSig attribute to each COM method, which in case of an error will not throw an error, but simply return an error code. In this case, no terrible problems arose. Well, the code has become more logical.
STAThread and MTAThread
For developers on .NET, this is a kind of magic attribute that the studio adds to the Main method. If someone tries to understand what it is, he will receive the following text, elegant in its simplicity:
STAThreadAttribute indicates that there is a single-threaded apartment. This is a Windows Forms; if it is omitted, the windows components might not work correctly. If the attribute is not present, it is not supported for Windows Forms.Which in translation means: dude, it should be so.
Well, of course, so, of course necessary, and when our application was archived in zip, everything was fine, and when I tried to use 7z, everything hung epically. What is the problem, and what do these attributes really mean?
In short, the meaning is as follows. COM out of the box is multi-threaded. Those. different objects receive and return data in different streams. But Krivoruk code (or Windows Forms) cannot master such a complex concept, so they wrap all the COM calls in an internal queue and execute in one thread. No problems with multithreading! But we epically stuck in Dead Lock. How much did it cost to understand that this is a deadlock, and deadlock somewhere in the guts of .NET ... uuuh ...
But we were lucky, we have a console application, we just hung up the MTAThread attribute and calmed down. If this does not suit you ... what is the suffering? There are workarounds.
Creating objects can be weird.
.NET creates convenient wrappers, so instantiating objects looks simple and obvious:
var obj = new ClientVirtualDeviceSet();
And everything worked for us until we tried to start everything on a 32-bit Windows. And when you create an object, you get an exception. It was logical to assume that the problem with data types and marshalling (not the same size), but everything was in order. What exactly is the problem, they did not find out, they just created an object like in the good old days, through
CoCreateInstance .
object comObject; var errorCode = CoCreateInstance( CLSID_MSSQL_ClientVirtualDeviceSet, IntPtr.Zero, CLSCTX_INPROC_SERVER, CLSID_MSSQL_ClientVirtualDeviceSet, out comObject);
Other
Any small things with a marshalling, I will not paint the correct layout of structures, in principle, they are brought to the mind once and no longer touch. Just a tip, if it’s hard to deal with pointers, give IntPtr and use the pens to convert it to the desired type. Might be easier than picking the right attributes for .NET
A little about 7z
More precisely about SevenZipSharp, but since it is generally a wrapper (very good), then about him too.
The first thing that can confuse is the fact that the library is sharpened with the work with files (and not with streams), so the logic is inside out. Those. In .NET, the compressor usually is a wrapper over the writeable stream, you write to it, it compresses and transmits the compressed data to the internal stream. In SevenZipSharp, on the contrary, it reads data from your stream and writes it into its internal stream. With regard to the fact that we read from SQL as a whole, Anya has many pleasant minutes of proceedings with asynchronous streams, locks, events ... In general, to the amateur :)
The second, again related to files, is that the library loves to run along the stream in different directions, which is generally good, but if the stream does not allow it (for example, stdout), problems begin. In principle, we managed to get around this (also a separate quest, after which Anya began to hate me) for gzip, xz, gzip, 7z, but with the zip, difficulties arose. The flow structure does not allow this. Reading from stdin is still worse; in principle, the library cannot do this. For the time being, we had to abandon this possibility, and leave only the ability to read from stdin only uncompressed backups.
Results
After all this exciting quest, we got a utility with the original name PackDb, which can make backups of databases immediately to the archive, restore them from there, and a bonus, another fascinating feature: copying the database without creating temporary files. Why is this necessary? For test databases, when you need to copy the reference database to a new one and test it on this database, without touching the main one. Well, all sorts of little things like issuing rights, validating backup correctness or the ability not to use archives, if you don’t just need to take bak files.

Should you go through this quest, if you want to make unusual backups, you decide. But in my opinion, this is a very ungrateful business. Too many nuances to make everything beautiful.