
SQLite: creating a "hot" backup

SQLite has a built-in mechanism for creating a backup of a database "on the fly". For some reason many developers do not know about it. The mechanism is primitive, but it is suitable for many situations. In this article we would like to discuss this built-in backup capability and also propose an architecture for a backup mechanism of your own, or at least give a direction to move in if you need to arrange more complex data replication.



Generally speaking, you should start with the simplest option. An SQLite database is a single file (by default, in DELETE journal mode). The application can periodically complete all transactions, close all connections to the database and simply copy the database file somewhere as a backup. If the database file is smaller than 100 MB, this takes a couple of seconds on a modern computer. It can also be sped up: read the file into memory (take a "snapshot"), let the application resume working with the database, and slowly write the contents to a file on disk in a separate thread. Surprisingly, this is enough for many applications.
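For illustration, a minimal C sketch of this "close everything and copy the file" approach could look as follows (the file names are made up and error handling is kept to the bare minimum):

/* Naive backup: finish work, close the connection, copy the database file. */
#include <stdio.h>
#include <sqlite3.h>

static int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    FILE *out = fopen(dst, "wb");
    char buf[64 * 1024];
    size_t n;

    if (in == NULL || out == NULL) {
        if (in)  fclose(in);
        if (out) fclose(out);
        return -1;
    }
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
        fwrite(buf, 1, n, out);
    fclose(in);
    return fclose(out);                      /* 0 on success */
}

int main(void)
{
    sqlite3 *db;

    if (sqlite3_open("data.db", &db) != SQLITE_OK)
        return 1;
    /* ... normal work with the database ... */

    /* Complete all transactions and close every connection, then copy. */
    sqlite3_close(db);
    return copy_file("data.db", "data.db.bak") == 0 ? 0 : 1;
}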
Online Backup API

However, an in-memory database cannot be copied this way. Enter the Online Backup API. This is the SQLite API for creating a backup on the fly. Everything is arranged quite simply. The sqlite3_backup_init function starts the backup process:

sqlite3_backup *sqlite3_backup_init(
  sqlite3 *pDest,           /* destination database connection */
  const char *zDestName,    /* destination database name */
  sqlite3 *pSource,         /* source database connection */
  const char *zSourceName   /* source database name */
);


The parameters are the connection objects of the source database and the destination database, together with the database names (the alias value: "main" for the main database, "temp" for the temporary one, or the name used when attaching via the ATTACH statement). A backup handle is returned (if NULL is returned, the error must be read from the destination database connection), and it must be passed to the remaining API functions as the first parameter. The backup itself is performed page by page. To copy a portion of nPage pages (or all of them, if nPage = -1), call sqlite3_backup_step:

  int sqlite3_backup_step(sqlite3_backup *p, int nPage); 

If this function returns SQLITE_DONE, the backup is complete and all pages have been copied. If SQLITE_OK, SQLITE_BUSY or SQLITE_LOCKED is returned, the copying is not finished but can be continued normally: just call sqlite3_backup_step() again. Any other return code indicates an error. Once SQLITE_DONE is received, sqlite3_backup_finish() should be called:

  int sqlite3_backup_finish(sqlite3_backup *p); 

and sleep peacefully, enjoying a successful backup. The following functions are used to obtain information about the current backup status:

int sqlite3_backup_remaining(sqlite3_backup *p); // pages remaining to be copied
int sqlite3_backup_pagecount(sqlite3_backup *p); // total number of pages in the source database

The whole algorithm for backing up the Src database into Dst in blocks of SOME_PAGE_COUNT pages looks like this in Pascal-style pseudocode:

Backup := sqlite3_backup_init(Dst, 'main', Src, 'main');
if Backup = nil then
  Abort;
try
  repeat
    case sqlite3_backup_step(Backup, SOME_PAGE_COUNT) of
      SQLITE_DONE:
        break;
      SQLITE_OK, SQLITE_BUSY, SQLITE_LOCKED:
        ; { not finished yet, keep copying }
    else
      Abort;
    end;
    Write('Remaining: ',
      sqlite3_backup_remaining(Backup) * 100 div sqlite3_backup_pagecount(Backup), '%');
    Sleep(SOME_TIME);
  until false;
finally
  sqlite3_backup_finish(Backup);
end;

When using this API, SQLite does not lock the source database in any special way. You can not only read the data but also update it. What happens if we copy in portions (nPage > 0, i.e. not all pages in one call to sqlite3_backup_step()) and the database changes? The copy restarts! Transparently to us, SQLite starts copying the pages from the very beginning of the database again. In other words, if the source database changes rapidly, there is a chance the backup will never complete. There is good news, though. If the source database is modified through the same connection that is used for the backup (even from another thread), SQLite transparently duplicates the changes into the destination database and the backup does not restart.
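For comparison, a rough C sketch of the same loop might look like this (SOME_PAGE_COUNT and SOME_TIME_MS are illustrative values chosen here, not anything prescribed by the API):

#include <stdio.h>
#include <sqlite3.h>

/* Copy the main database of pSource into the main database of pDest,
 * a few pages at a time. Returns an SQLite result code. */
int backup_db(sqlite3 *pDest, sqlite3 *pSource)
{
    const int SOME_PAGE_COUNT = 100;   /* pages per step, illustrative */
    const int SOME_TIME_MS    = 250;   /* pause between steps, illustrative */
    sqlite3_backup *backup;
    int rc;

    backup = sqlite3_backup_init(pDest, "main", pSource, "main");
    if (backup == NULL)
        return sqlite3_errcode(pDest);  /* the error is reported on the destination connection */

    do {
        rc = sqlite3_backup_step(backup, SOME_PAGE_COUNT);
        if (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED) {
            int total = sqlite3_backup_pagecount(backup);
            if (total > 0)
                printf("remaining: %d%%\n",
                       sqlite3_backup_remaining(backup) * 100 / total);
            sqlite3_sleep(SOME_TIME_MS);  /* let other users of the database work */
        }
    } while (rc == SQLITE_OK || rc == SQLITE_BUSY || rc == SQLITE_LOCKED);

    sqlite3_backup_finish(backup);      /* releases the backup handle */
    return rc == SQLITE_DONE ? SQLITE_OK : rc;
}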

The advantages of the SQLite Online Backup API: the source database is not blocked for reading and, if it is updated through a single connection, writing to the database does not interfere either. What if the database is very large and is frequently updated from different applications? Then it is time to think about building a data replication system. Nothing new will be invented here: with the help of triggers we track which records have changed and keep a history of the changes. We rely on the fact that every ordinary table has a ROWID column containing a unique record number. The changes are transferred to the other database on a regular basis. Let us just show in detail how this is done.

Simple data replication scheme

So, in the source database you need to create tables that record which rows were affected:

CREATE TABLE IF NOT EXISTS system_replicate_table(name TEXT UNIQUE);
CREATE TABLE IF NOT EXISTS system_replicate_record(name TEXT, id INTEGER, PRIMARY KEY(name, id));

To enable replication for the Foo table, add it to system_replicate_table:

  INSERT OR IGNORE INTO system_replicate_table(name) VALUES ('Foo'); 

and create replication triggers for it:

function CreateTrigger(const Operation, TableName: string): string;
begin
  result := Format('CREATE TRIGGER IF NOT EXISTS system_trigger_%s_replicate_%s ' +
    'AFTER %s ON %s FOR EACH ROW BEGIN ',
    [TableName, Operation, Operation, TableName]);
  if (Operation = 'INSERT') or (Operation = 'UPDATE') then
    result := result +
      Format('INSERT OR IGNORE INTO system_replicate_record(name, id) VALUES(''%s'', NEW.ROWID); ',
        [TableName]);
  if (Operation = 'DELETE') or (Operation = 'UPDATE') then
    result := result +
      Format('INSERT OR IGNORE INTO system_replicate_record(name, id) VALUES(''%s'', OLD.ROWID); ',
        [TableName]);
  result := result + ' END; ';
end;

Execute( CreateTrigger('INSERT', 'Foo') );
Execute( CreateTrigger('UPDATE', 'Foo') );
Execute( CreateTrigger('DELETE', 'Foo') );

The triggers are simple. They add the ROWID of the affected rows of the table to system_replicate_record (read up on NEW and OLD in SQLite triggers yourself), unless it is already there. So we enable replication on the tables we are interested in and start working with the source database. Data changes are tracked by the triggers. At some point (after a time interval, or once system_replicate_record has accumulated enough records) we perform the replication, that is, we transfer the changes. How do we replicate the changed records of the table Foo to the destination database? This is the most difficult part of the replication. We will use the sqlite_master system table, which contains the SQL of all database objects. This SQL is the object creation statement (i.e. for the table Foo it will be "CREATE TABLE Foo (...)").

The algorithm for copying the table Foo, if it does not yet exist in the Dst database:

1) Get the table's SQL:
  SELECT sql FROM sqlite_master WHERE tbl_name = 'Foo' and type = 'table'; 

and execute it in the destination database "as is" (we simply pass the connection to the destination database to the execute method).
2) Select all records and transfer the data (how exactly the data is transferred is covered a little later):
  SELECT rowid as rowid, * FROM Foo 

3) If indices and triggers also need to be transferred, then execute in the destination database the SQL obtained with the following query (system indices and triggers are excluded); a small C sketch of this schema copying is given after the list:
SELECT sql FROM sqlite_master
WHERE tbl_name = 'Foo' AND (type = 'index' OR type = 'trigger')
  AND name NOT LIKE 'system_%' AND name NOT LIKE 'sqlite_%';
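A possible C sketch of steps 1 and 3 combined (copy_schema is a name invented here; it assumes the table name is trusted and does only minimal error handling):

#include <sqlite3.h>

/* Read the CREATE statements for the table (and its indices and triggers)
 * from sqlite_master in Src and execute them in Dst. The table is created
 * first, then the rest. */
static int copy_schema(sqlite3 *src, sqlite3 *dst, const char *table)
{
    const char *sql =
        "SELECT sql FROM sqlite_master "
        "WHERE tbl_name = ?1 AND type IN ('table', 'index', 'trigger') "
        "  AND name NOT LIKE 'system_%' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY CASE type WHEN 'table' THEN 0 ELSE 1 END";
    sqlite3_stmt *stmt;
    int rc;

    rc = sqlite3_prepare_v2(src, sql, -1, &stmt, NULL);
    if (rc != SQLITE_OK)
        return rc;
    sqlite3_bind_text(stmt, 1, table, -1, SQLITE_TRANSIENT);

    while ((rc = sqlite3_step(stmt)) == SQLITE_ROW) {
        const char *create_sql = (const char *)sqlite3_column_text(stmt, 0);
        /* Automatically created indices have NULL sql and are skipped. */
        if (create_sql != NULL &&
            sqlite3_exec(dst, create_sql, NULL, NULL, NULL) != SQLITE_OK) {
            sqlite3_finalize(stmt);
            return SQLITE_ERROR;
        }
    }
    sqlite3_finalize(stmt);
    return rc == SQLITE_DONE ? SQLITE_OK : rc;
}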

The algorithm for replicating the table Foo from the Src database to the Dst database:

1) If there is no Foo table in Dst, copy it there from Src (see above) and go to step 5.
2) Otherwise, select the ROWID values of the affected records:
  SELECT id FROM system_replicate_record WHERE name = 'Foo';

and join them, separated by commas, into one long string R, i.e. R should end up containing something like "123,256,334,4700,...".
(Do not even think about doing this by naive string concatenation! Allocate a buffer and grow it as necessary; a sketch of this is given after the list. It is also worth remembering that a ROWID is a signed 8-byte integer.)
3) Delete these records from the Foo table in the Dst database:
  DELETE FROM [Foo] WHERE rowid IN (<contents of the string R>);

4) Select the data from Foo in the Src database and copy it into the Dst database (more on the actual copying of the data a little later):
  SELECT rowid as rowid, * FROM [Foo] WHERE rowid IN (SELECT id FROM system_replicate_record WHERE name = 'Foo'); 

5) Clean up the replication table in the Src database:
  DELETE FROM system_replicate_record WHERE name = 'Foo';
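As promised in step 2, here is one possible C sketch of assembling the string R in a growing buffer (collect_ids is an invented helper; error handling is minimal and the caller must free() the result):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sqlite3.h>

/* Collect the affected ROWIDs of one table into a comma-separated string,
 * growing the buffer geometrically instead of concatenating strings.
 * Returns NULL on error. */
static char *collect_ids(sqlite3 *src, const char *table)
{
    sqlite3_stmt *stmt;
    size_t cap = 1024, len = 0;
    char *buf = malloc(cap);

    if (buf == NULL)
        return NULL;
    buf[0] = '\0';

    if (sqlite3_prepare_v2(src,
            "SELECT id FROM system_replicate_record WHERE name = ?1",
            -1, &stmt, NULL) != SQLITE_OK) {
        free(buf);
        return NULL;
    }
    sqlite3_bind_text(stmt, 1, table, -1, SQLITE_TRANSIENT);

    while (sqlite3_step(stmt) == SQLITE_ROW) {
        char num[32];   /* enough for a signed 64-bit value plus a comma */
        int n = snprintf(num, sizeof(num), "%s%lld",
                         len > 0 ? "," : "",
                         (long long)sqlite3_column_int64(stmt, 0));
        if (len + n + 1 > cap) {            /* grow the buffer when needed */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { free(buf); sqlite3_finalize(stmt); return NULL; }
            buf = tmp;
            cap *= 2;
        }
        memcpy(buf + len, num, (size_t)n + 1);
        len += (size_t)n;
    }
    sqlite3_finalize(stmt);
    return buf;   /* e.g. "123,256,334,4700" */
}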

It remains to understand how to copy the data itself. This requires a little more programmer shamanism. The records are selected with the following query:

  SELECT rowid as rowid, * FROM [Foo] [ WHERE ... ] 

This is the only way to guarantee that the ROWID is retrieved (and that it will be named "ROWID"). For each extracted record we build an SQL insert statement (in UTF-8 encoding):
  INSERT INTO [Foo](<names>) VALUES(<values>)

You should walk over all the columns of the selected record and append the column name to the "<names>" part and the value to the "<values>" part, separated by commas. The column name should be enclosed in '[' and ']'. The value should be rendered as an SQL literal. As you know, SQLite has the following value types:
SQLITE_INTEGER = 1;
SQLITE_FLOAT   = 2;
SQLITE_TEXT    = 3;
SQLITE_BLOB    = 4;
SQLITE_NULL    = 5;

We need to learn how to render each of them as an SQL literal. The SQLITE_NULL literal is "null". The SQLITE_INTEGER literal is the string representation of the (64-bit) integer: 1234567890 becomes "1234567890". The SQLITE_FLOAT literal is the string representation of a real number with a dot separating the integer and fractional parts: 123.456789 becomes "123.456789". To turn a string (SQLITE_TEXT) into a literal, double every single quote in it and wrap the result in single quotes: "Hello, Mc'Duck" becomes "'Hello, Mc''Duck'". That leaves BLOB. SQLITE_BLOB (binary data) literals in SQLite have the form "x'A0B1C2...'", where "A0" is the hex code of the first byte, "B1" the hex code of the second byte, and so on.
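For example, a C sketch of rendering a single column value as a literal could look like this (write_literal is an invented helper that writes into a FILE*, for instance an open_memstream() buffer; the "C" locale is assumed so that %g prints a dot):

#include <stdio.h>
#include <sqlite3.h>

/* Write the i-th column of the current result row as an SQL literal. */
static void write_literal(FILE *out, sqlite3_stmt *stmt, int i)
{
    switch (sqlite3_column_type(stmt, i)) {
    case SQLITE_NULL:
        fputs("null", out);
        break;
    case SQLITE_INTEGER:
        fprintf(out, "%lld", (long long)sqlite3_column_int64(stmt, i));
        break;
    case SQLITE_FLOAT:
        /* "C" locale assumed, so the decimal separator is a dot */
        fprintf(out, "%.17g", sqlite3_column_double(stmt, i));
        break;
    case SQLITE_TEXT: {
        const unsigned char *s = sqlite3_column_text(stmt, i);   /* UTF-8 */
        fputc('\'', out);
        for (; *s; s++) {
            if (*s == '\'')
                fputc('\'', out);          /* double every single quote */
            fputc(*s, out);
        }
        fputc('\'', out);
        break;
    }
    case SQLITE_BLOB: {
        const unsigned char *p = sqlite3_column_blob(stmt, i);
        int n = sqlite3_column_bytes(stmt, i);
        fputs("x'", out);
        for (int k = 0; k < n; k++)
            fprintf(out, "%02X", p[k]);    /* hex code of each byte */
        fputc('\'', out);
        break;
    }
    }
}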

That's all. We have described the simplest working version of data replication, in which each record is copied in its entirety. There is room for optimization, of course. It is useful to wrap all changes to the Dst database in a transaction. When building the insert statement, the part with the column names can be created once and reused.

This architecture does not support schema replication. If you change the source table by adding fields to it, this will break its replication. You must either delete the table in the destination database (so that it is copied again in full) or complicate the replication by adding schema synchronization. The same considerations apply to newly created indices and triggers.

P.S. Use a modern manager to administer your SQLite databases.

Source: https://habr.com/ru/post/200360/

