
Many of us have encountered in our work with the DBMS. At the moment, databases in one form or another surround us everywhere, starting with mobile phones and ending with social networks, including our favorite Habr. Relational DBMS are the most common members of the DBMS family, and most of them are transactional.
At the institute, we were forced to memorize the definition of ACID and the properties behind it, but for some reason the party avoided the details of the implementation of this paradigm. In this article I will try to partially fill this gap, talking about the MVCC, which is used in such databases as Oracle, Postgres, MySQL, etc. and is very simple and intuitive.
So, it is worth starting with the definition of ACID:
- A tomicity - transactions are atomic, that is, either all transaction changes are committed (commit), or all are rolled back (rollback);
- C onsistency - transactions do not violate the consistency of the data, that is, they transfer the database from one correct state to another. Here you can mention the valid values ​​of the fields, foreign keys and more complex integrity constraints;
- I solation - simultaneous transactions do not affect each other, that is, multi-threaded transaction processing is performed in such a way that the result of their parallel execution corresponds to the result of their sequential execution;
- D urability - if the transaction was successfully completed, no external event should result in the loss of the changes made to it.
Each of these requirements looks more than rational, especially if it affects such important areas as banking and other currency transactions: agree, it will be very unpleasant if the money is written off from your account and they do not come to the store account (violation “atomicity "), Or as a result of the failure of the ABS, information will be lost that a salary has been credited to your account (violation of" durability ").
If you start talking about how DBMS supporting ACID transactions work, the Isolation property will most of all raise questions: modern DBMSs support hundreds and thousands of simultaneous transactions, and they all refer to the same tables. How to make sure that they do not interfere with each other? This is where MVCC (MultiVersion Concurrency Control) comes to the rescue, that is, control of concurrent access to data through the creation of multiple “versions” of variable data. In a simplified form, this mechanism can be represented as follows: all operations with data can be divided into reading (select), insert (insert), delete (delete), update (update). Here is what happens during these operations:
- Select - read valid table entries. A record is considered valid if it is created by a transaction that was committed (commit) before the current transaction;
- Insert - a new entry is simply added to the free space of the table;
- Delete - the record in the table is marked as not valid, while the record itself is not deleted;
- Update - the combination of delete and insert. First, the old version of the record is marked as not valid, then a new record is added with updated data.
In general, this approach can be implemented using only one additional bit with the flag is_valid = 1 for valid entries and 0 for non-valid entries. But there is a problem with multithreading: with this approach only sequential access to data will be possible (writers will block both readers and other writers).
Suppose we have “Table A” with the following data:

')
And two transactions that try to update the data as follows:
- Transaction 1: update A set value = value || “Transaction 1” where ID = 1;
- Transaction 2: update A set value = value || “Transaction 2” where ID = 1;
If you allow them to work in parallel on the same data, you might get the following result:
- Transaction 1 read the valid Value for the row with ID = 1: “Hello”;
- Transaction 2 read the valid Value values ​​for the row with ID = 1: “Hello”;
- Transaction 1 counted a new value for the Value field: “Hello transaction 1”;
- Transaction 2 counted a new value for the Value field: “Hello transaction 2”;
- Transaction 1 updated the table data;
- Transaction 2 has updated the table data.
Here is the result of this scenario:

But the result of their consistent implementation:

Obviously, the isolation requirement is violated: transactions running simultaneously affected each other. Such errors can be divided into Dirty Reads, Non-repeatable Reads and Phantom Reads.
Thus, we can conclude that:
- Validity flag alone is not enough, we need a more complex mechanism;
- Some types of operations require locks. If two transactions attempt to update the same table entry, it is quite obvious that they can only do this sequentially.
Consider how this works in Postgres:
- All transactions in the system have sequential numbers (conventionally, in fact, the transaction number for each of the tables is different and vacuum is reset to avoid overflowing the 4-byte integer used to store the transaction ID);
- There is a global transaction registry that contains information about which transactions are in progress, and which have been rolled back;
- For each entry in the table there are technical fields Xmin and Xmax, which store information about transactions that modified this entry. Xmin is the transaction identifier that added the entry to the table. Xmax is the identifier of the transaction that deleted the record from the table;
- For each entry in the table there are technical fields Cmin and Cmax, which are used to ensure the operation of transactions in several teams.
Suppose a transaction first adds a row to the table, then updates it, then deletes it. Thanks to Xmin and Xmax, until the transaction is committed (commit), all these operations will be invisible to external transactions. But what about the transaction itself, which should see its changes, even if they contradict each other (they first added the record, then deleted it)? For this purpose, min and max are created, which work in much the same way as Xmin and Xmax, but within the framework of one specific transaction.
With this set of metadata, we are much closer to the implementation of ACID than with one flag of validity. Now consider the operations described above in the context of MVCC:
- Select - read valid table entries. For each of the records we have Xmin, Xmax, Cmin, Cmax. If Xmin is greater than the current transaction ID or is on the list of active or canceled transactions, the entry is not valid. If Xmax is set and less than the ID of the current transaction and the transaction with this ID is not in the list of running or canceled transactions, the record is not valid. If Xmin is equal to the current transaction ID, then to check the validity of the record, we look at Cmin and Cmax (Cmax must be set so that the record is not valid). If Xmax is the current transaction ID, everything is exactly the opposite. If the above checks did not work, the entry is valid. There is a very good piece of code from src / backend / utils / time / tqual.c with the author’s comment on this topic:

- Insert - a new record is simply added to the free space of the table, Xmin is put in the current transaction ID and Cmin in the current operation ID as part of the transaction;
- Delete - an entry in the table is marked as not valid, while the entry itself is not deleted. This is done as follows: Xmax is put in the ID of the current transaction, Cmax in the ID of the current operation;
- Update - the combination of delete and insert. First, the old version of the record is marked as not valid, then a new record is added with updated data;
- Commit transactions are performed by removing the transaction ID from the list of employees;
- Rollback transactions are performed by marking the transaction ID as a rollback transaction.
This transaction model is simplified, since it does not cover the locking mechanism, as well as cases with distributed transactions. Nevertheless, this approach is common to many modern DBMS and its understanding will allow better understanding of what is happening inside the DBMS engine.
A more detailed analysis of Postgres examples can be found here.