📜 ⬆️ ⬇️

Towards the right SQL transactions (Part 1)



I often had to deal with the fact that people perfectly understand what transactions are in the database and what they are for, but they do not always know how to use them correctly. Of course, to achieve the 80th level of sacred knowledge, you need more than one year of experience and read a lot of thick books on SQL. Therefore, in this article I will not even try to describe everything that may be related to transactions in MS SQL. I want to touch on one simple but very important issue that developers often overlook - transaction isolation levels.
Despite the fact that the topic is very simple, in many sources it is consecrated poorly - information is either very little or very much. Those. Having read 5-6 brief theoretical definitions it is impossible to put them into practice. For a confident understanding of the subject matter of an article, one should refer to specialized literature, but there is so much information there that not everyone can devote the necessary time for its assimilation.
Today I want to share my simple recipe, which helped me to remember the features of transaction isolation levels once and for all, and to this day it helps us to make informed decisions about choosing the required level without problems.

The secret of the proposed method of memorization is that a brief theory will be accompanied by simple practical examples, which I would be much clearer than a detailed description.
And so, to understand the differences in isolation levels, it is necessary to deal with unwanted side effects that can occur if transactions are not isolated from each other. Having understood the specifics of these effects, we will only have to look at what effects each individual level protects from. After that, I am sure that the topic of isolation of transactions will forever cease to seem like something incredibly complex.

Concurrency side effects


All operations in the database do not occur instantaneously and the following side effects are possible when simultaneously changing data by different users:

Further, these effects are discussed in detail and are SQL scripts that show the problem in practice. I highly recommend trying them and seeing the problem "live", but for this you need to first prepare your server. The preparation steps and the features of running scripts are described below.
Requirements for running scripts
  1. The first is to run the script for transaction # 1, and then immediately the script for transaction # 2 (no later than 10 seconds after the start of the first script).
  2. A table with the name Table1 and the columns Id and Value must exist in the database. It is expected to have a single line:

    To create a table and fill it with data, you can run the following script.
    IF (EXISTS (SELECT * FROM INFORMATION_SCHEMA.TABLES WHERE TABLE_SCHEMA = 'dbo' AND TABLE_NAME = 'Table1')) DROP TABLE Table1 CREATE TABLE Table1 (Id INT IDENTITY, Value INT) INSERT INTO Table1 (Value) VALUES(1) 

    It is also desirable to execute this script before examining each example. This will ensure that the results obtained are identical with those described below.


')

Lost update


The effect is manifested when simultaneously changing one data block with different transactions. And one of the changes may be lost.
This formulation may be interpreted differently.
Lost Update - Interpretation # 1

Two transactions are simultaneously UPDATE for the same row, and changes made by one transaction are overwritten by another.
Transaction 1Transaction 2
 UPDATE Table1 SET Value = Value + 5 WHERE Id = 1; SELECT Value FROM Table1 WHERE Id = 1; 
 UPDATE Table1 SET Value = Value + 7 WHERE Id = 1; SELECT Value FROM Table1 WHERE Id = 1; 
Result:Value = 6Value = 8

Why it happens?
Before updating, both transactions read the value in the Value column - it is 1. Suppose that transaction 2 has time to write the value first, then the new value in the Value column will be 8 (1 + 7). Then, transaction 1 also calculates the new value, but uses the previously subtracted value (1) for the calculation. As a result, after the completion of transaction 1 in the Value column there will be 6 (1 + 5), and not 13 (1 + 7 + 5).
Fortunately, this scenario is impossible in MS SQL, because even the lowest isolation level prevents such a situation and the result will always be 13, not 8.


Lost Update - Interpretation # 2

The script is similar to the first one, but the Value value is read into a temporary variable.
Transaction 1Transaction 2
 BEGIN TRAN; DECLARE @Value INT; SELECT @Value = Value FROM Table1 WHERE Id = 1; WAITFOR DELAY '00:00:10'; UPDATE Table1 SET Value = @Value + 5 WHERE Id = 1; COMMIT TRAN; SELECT Value FROM Table1 WHERE Id = 1; 
 BEGIN TRAN; DECLARE @Value INT; SELECT @Value = Value FROM Table1 WHERE Id = 1; UPDATE Table1 SET Value = @Value + 7 WHERE Id = 1; COMMIT TRAN; SELECT Value FROM Table1 WHERE Id = 1; 
Result:Value = 6Value = 8


"Dirty" reading (dirty read)


This is such a reading, in which the added or modified data can be read from another transaction, which is subsequently not confirmed (rolled back).
Since this effect is possible only with a minimum isolation level, and a higher isolation level (READ COMMITTED) is used by default, the isolation level will be explicitly set to READ UNCOMMITTED in the data reading script. If you return the default isolation level (READ COMMITTED) for transaction 2, the behavior changes.
Transaction 1Transaction 2
 BEGIN TRAN; UPDATE Table1 SET Value = Value * 10 WHERE Id = 1; WAITFOR DELAY '00:00:10'; ROLLBACK; SELECT Value FROM Table1 WHERE Id = 1; 
 SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; --SET TRANSACTION ISOLATION LEVEL READ COMMITTED; BEGIN TRAN; SELECT Value FROM Table1 WHERE Id = 1; COMMIT TRAN; 
Result for READ UNCOMMITTED:Value = 1Value = 10
Result for READ COMMITTED:Value = 1Value = 1
We see that within the second transaction a value of 10 was read, which was never successfully saved to the database (it was rejected by the ROLLBACK command).

Non-repeatable read


Appears when, when re-reading within a single transaction, previously read data turns out to be changed. This effect can be observed at an isolation level lower than REPEATABLE READ.
Transaction 1Transaction 2
 SET TRANSACTION ISOLATION LEVEL READ COMMITTED --SET TRANSACTION ISOLATION LEVEL REPEATABLE READ BEGIN TRAN; SELECT Value FROM Table1 WHERE Id = 1; WAITFOR DELAY '00:00:10'; SELECT Value FROM Table1 WHERE Id = 1; COMMIT; 
 BEGIN TRAN; UPDATE Table1 SET Value = 42 WHERE Id = 1; COMMIT TRAN; 
Result for READ COMMITTEDValue = 1
Value = 42
Instant execution
Result for REPEATABLE READValue = 1
Value = 1
Waiting for completion of transaction 1


Phantom read (phantom reads)


It can be observed that one transaction several times selects multiple rows according to the same criteria during its execution. At the same time, another transaction in the intervals between these samples adds or deletes rows, or changes the columns of some rows used in the selection criteria of the first transaction, and ends successfully. The result is that the same samples in the first transaction give different sets of rows. This effect can be observed when the isolation level is lower than SERIALIZABLE.
Transaction 1Transaction 2
 SET TRANSACTION ISOLATION LEVEL REPEATABLE READ --SET TRANSACTION ISOLATION LEVEL SERIALIZABLE BEGIN TRAN; SELECT * FROM Table1 WAITFOR DELAY '00:00:10' SELECT * FROM Table1 COMMIT; 
 BEGIN TRAN; INSERT INTO Table1 (Value) VALUES(100) COMMIT TRAN; 
Result for REPEATABLE READ:- first SELECT
ID: 1; Value: 1
- second SELECT
ID: 1; Value: 1
ID: 2; Value: 100
Instant execution
Result for SERIALIZABLE:- first SELECT
ID: 1; Value: 1
- second SELECT
ID: 1; Value: 1
Waiting for completion of transaction 1


Isolation levels


Understanding the meaning of side effects, it is very easy to understand the purpose of each isolation level, because they differ in the number of side effects.
Effects
Lost updateDirty readingNon-repeatable readingPhantom read
Isolation levelsRead uncommitted
No / Yes (*)
there is
there is
there is
Read committed
or
Read committed snapshot (**)
No / Yes (*)
Not
there is
there is
Repeatable read
Not
Not
Not
there is
Serializable
or
Snapshot (**)
Not
Not
Not
Not

(*) - the effect is present only if it is treated as described in the section “Lost update - Interpretation No. 2”.
(**) - for these levels, isolation is achieved not by using locks, but by creating a copy of the variable data that is placed in tempdb for the duration of the transaction; more here .

Conclusion


Now, having understood the purpose of each level, you are ready for a more meaningful use of transactions. But I would not stop there. In the second part of the article, the material will be of slightly less practical value, but at the same time it will not be less useful. Once, Leigh Campbell once perfectly said: "You must understand at least one level of abstraction below the level at which you program." That is why, understanding the implementation will allow you to understand the topic as deeply as possible and you will be able to properly and effectively use the proposed tool.

Source: https://habr.com/ru/post/261097/


All Articles