Transaction isolation level is a conditional value that determines the extent to which inconsistent data is allowed to be obtained as a result of logically parallel transactions in the DBMS. The transaction isolation level scale contains a series of values ranked from lowest to highest; a higher level of isolation corresponds to better data consistency, but its use can reduce the number of physically parallel transactions. Conversely, a lower isolation level allows for more concurrent transactions, but reduces data accuracy. Thus, choosing the level of transaction isolation used, the developer of the information system to a certain extent provides a choice between the speed of work and ensuring guaranteed consistency of data received from the system.
Transactional Concurrency Issues
The following problems are possible with parallel execution of transactions :
- lost update ( Eng. lost update ) - when a single data block is changed by different transactions at the same time, all changes except the last are lost;
- “Dirty read” ( English dirty read ) - reading data added or modified by a transaction, which subsequently will not be confirmed (rolled back);
- non-repeatable read ( eng. non-repeatable read ) - when re-reading within the same transaction, previously read data is changed;
- phantom reads - one transaction during its execution selects multiple rows several times according to the same criteria. Another transaction in the interval between these samples adds or deletes rows or changes the columns of some rows used in the selection criteria of the first transaction, and successfully ends. As a result, it turns out that the same samples in the first transaction give different sets of rows.
Consider situations in which these problems may occur.
Lost Update
A situation when one of the changes is lost when one data block is changed by different transactions at the same time.
Suppose there are two transactions running at the same time:
| Transaction 1 | Transaction 2 |
|---|---|
UPDATE tbl1 SET f2=f2+20 WHERE f1=1; | UPDATE tbl1 SET f2=f2+25 WHERE f1=1; |
In both transactions, the value of the field f2 changes; upon completion, the value of the field should be increased by 45. In fact, the following sequence of actions may occur:
- Both transactions simultaneously read the current state of the field. Exact physical simultaneity is not necessary here, it is enough that the second-order read operation is performed before another transaction writes its result.
- Both transactions calculate the new field value, adding 20 and 25, respectively, to the previously read value.
- Transactions are trying to write the result of the calculations back to the field f2. Since it is impossible to physically execute two recordings at the same time, in reality, one of the recording operations will be performed earlier, the other later. In this case, the second write operation will overwrite the result of the first.
As a result, the value of the field f2 at the end of both transactions can increase not by 45, but by 20 or 25, that is, one of the transactions that changes data will “disappear”.
Dirty Reading
Reading data added or changed by a transaction, which subsequently will not be confirmed (rolled back).
Suppose there are two transactions opened by various applications in which the following SQL statements are executed:
| Transaction 1 | Transaction 2 |
|---|---|
UPDATE tbl1 SET f2=f2+1 WHERE f1=1; | |
SELECT f2 FROM tbl1 WHERE f1=1; | |
ROLLBACK WORK; |
In transaction 1, the value of the field f2 is changed, and then in transaction 2, the value of this field is selected. After that, transaction 1 is rolled back. As a result, the value obtained by the second transaction will be different from the value stored in the database.
Non-repeating reading
A situation where when re-reading within the same transaction, previously read data turns out to be changed.
Suppose there are two transactions opened by various applications in which the following SQL statements are executed:
| Transaction 1 | Transaction 2 |
|---|---|
SELECT f2 FROM tbl1 WHERE f1=1; | |
UPDATE tbl1 SET f2=f2+1 WHERE f1=1; | |
COMMIT; | |
SELECT f2 FROM tbl1 WHERE f1=1; |
In transaction 2, the value of the field f2 is selected, then in transaction 1, the value of the field f2 is changed. If you try again to select a value from the field f2 in transaction 2, a different result will be obtained. This situation is especially unacceptable when data is read in order to partially modify it and write it back to the database.
Reading Phantoms
A situation where, when re-reading within the same transaction, the same selection gives different sets of rows.
Suppose there are two transactions opened by various applications that execute the following SQL statements:
| Transaction 1 | Transaction 2 |
|---|---|
SELECT SUM(f2) FROM tbl1; | |
INSERT INTO tbl1 (f1,f2) VALUES (15,20); | |
COMMIT; | |
SELECT SUM(f2) FROM tbl1; |
In transaction 2, an SQL statement is executed using all the values of the f2 field. Then, in transaction 1, a new line is inserted, resulting in the repeated execution of the SQL statement in transaction 2 will give a different result. This situation is called phantom reading (phantom reading). It differs from non-repeating reading in that the result of repeated access to the data has changed not because of the change / deletion of the data itself, but because of the appearance of new (phantom) data.
Isolation Levels
By “ transaction isolation level ” is meant the degree of protection provided by internal DBMS mechanisms (that is, not requiring special programming) from all or some of the types of data inconsistencies listed above that occur during parallel transactions. The SQL-92 standard defines a scale of four isolation levels: Read uncommitted, Read committed, Repeatable read, Serializable. The first of them is the weakest, the last is the strongest, each subsequent includes all the previous ones.
Read uncommitted
The lowest (first) level of isolation. It guarantees only the absence of lost updates [1] . If several parallel transactions try to change the same row in the table, then in the final version the row will have the value determined by the entire set of successfully completed transactions. Moreover, it is possible to read not only logically inconsistent data, but also data whose changes have not yet been committed.
A typical way to implement this level of isolation is to lock the data for the duration of the change command, which ensures that the change commands of the same lines launched in parallel will actually be executed sequentially and no changes will be lost. Read-only transactions at this isolation level are never blocked.
Read committed
Most industrial DBMSs, in particular Microsoft SQL Server , PostgreSQL and Oracle , use this level by default. At this level, protection is provided against rough, “dirty” reading, however, during the operation of one transaction, the other can be successfully completed and the changes made by it are recorded. As a result, the first transaction will work with a different data set.
Implementing a complete read can be based on one of two approaches: blocking or versioning.
- Blocking readable and modifiable data.
- It consists in the fact that a writing transaction blocks mutable data for reading transactions operating at the read committed level or higher until it is completed, thus preventing dirty reading, and data blocked by the reading transaction is released immediately after the SELECT operation is completed (thus, a situation of “non-repeating reading” may occur at a given level of isolation).
- Saving multiple versions of simultaneously modifiable strings.
- Each time a row is changed, the DBMS creates a new version of this row with which the transaction that changed the data continues to work, while the last committed version is returned to any other “reading” transaction. The advantage of this approach is that it provides more speed, as it prevents blocking. However, it requires, in comparison with the first, a significantly higher consumption of RAM, which is spent on storing versions of strings. In addition, a parallel change of data by several transactions may create a situation where several parallel transactions will make inconsistent changes in the same data (since there are no locks, nothing will prevent it). Then the transaction that is committed first will save its changes in the main database, and the remaining parallel transactions will be impossible to commit (since this will lead to the loss of updating the first transaction). The only thing that a DBMS can in this situation is to roll back the remaining transactions and give an error message “The record has already been changed”.
A specific implementation method is chosen by the DBMS developers, and in some cases it can be customized. So, by default, MS SQL uses locks, but (in version 2005 and higher) when setting the READ_COMMITTED_SNAPSHOT parameter of the database, it switches to the versioning strategy, Oracle initially works only according to the versioned scheme. In Informix, you can prevent conflicts between reading and writing transactions by setting the USELASTCOMMITTED configuration parameter (starting with version 11.1), while the reading transaction will receive the latest confirmed data [2]
Repeatable read
The level at which the reading transaction “does not see” changes in the data that it previously read. However, no other transaction can change the data read by the current transaction until it is completed.
Locks in splitting mode are applied to all data read by any transaction instruction and are stored until it is completed. This prevents other transactions from modifying rows that were read by the transaction in progress. However, other transactions may insert new lines that match the search terms of the instructions contained in the current transaction. When the instruction is restarted, the current transaction will retrieve new lines, which will lead to phantom reading. Given that dividing locks are stored until the transaction is completed, and not removed at the end of each instruction, the degree of parallelism is lower than with the isolation level READ COMMITTED. Therefore, using this data and higher levels of transactions without need is usually not recommended.
Serializable
Highest level of isolation; transactions are completely isolated from each other, each is performed as if parallel transactions do not exist. At this level only, concurrent transactions are not affected by the “phantom read” effect.
Support for transaction isolation in real DBMS
DBMSs that provide transactionality do not always support all four levels, and can also introduce additional ones. Various nuances in providing insulation are also possible.
So, Oracle, in principle, does not support the zero level, since its implementation of transactions excludes “dirty reads”, and formally does not allow setting the Repeatable read level, that is, it supports only Read committed (by default) and Serializable. At the same time, at the level of individual commands, it, in fact, guarantees read repeatability (if the SELECT command in the first transaction selects a set of rows from the database, and at the same time the parallel second transaction changes some of these lines, then the resulting set obtained by the first transaction will be contain unchanged rows, as if there was no second transaction). Oracle also supports the so-called READ-ONLY transactions, which correspond to Serializable, but cannot change data themselves.
Microsoft SQL Server supports all four standard levels of transaction isolation, and additionally, the SNAPSHOT level, at which the transaction sees the data state that was recorded before it was launched, as well as the changes made by it itself, that is, it behaves as if it received during running a snapshot of database data and working with it. The difference from Serialized is that locks are not used, but as a result, committing changes may not be possible if a parallel transaction changed the same data earlier; in this case, the second transaction when trying to execute COMMIT will cause an error message and will be canceled.
Behavior at various levels of isolation
“+” - prevents, “-” - does not prevent.
| Isolation level | Phantom reading | Non-repeat reading | Dirty reading | Lost Update [3] |
|---|---|---|---|---|
| SERIALIZABLE | + | + | + | + |
| REPEATABLE READ | - | + | + | + |
| READ COMMITTED | - | - | + | + |
| READ UNCOMMITTED | - | - | - | + |
| Null | - | - | - | - |
Notes
- ↑ Understanding Isolation Levels
- ↑ USELASTCOMMITTED configuration parameter http://publib.boulder.ibm.com/infocenter/idshelp/v115/topic/com.ibm.adref.doc/ids_adr_0186.htm
- ↑ Understanding the Available Transaction Isolation Levels