Concurrency Control
- sumitnagle
- Jun 25
- 2 min read
So far, we have discussed what a transaction means and how a database implements the ACID principles for reliable and persistent storage. There we touched on the nuances of isolation, which ensures that concurrent transactions do not interfere with each other. This blog is about exactly that: concurrency control.
Concurrency refers to the ability of the database to allow multiple transactions to be executed simultaneously. It ensures that the system's performance is optimised (by allowing concurrent operations) while maintaining data consistency.
Concurrency Control is the set of techniques and mechanisms used to ensure that multiple transactions can operate concurrently without causing unexpected conflicts or corrupting data. It ensures that these overlapping transactions appear as though they occur sequentially, or in some logically consistent order; this property is called serialisability.
🌟 When a transaction reads data, it’s actually reading from in-memory data pages (if not present in memory, the data is fetched from disk and cached there), and when a transaction writes data, the database modifies the in-memory page first. So concurrency control primarily governs how transactions interact with in-memory data (buffer pool pages) and log records (WAL). The actual disk flush of the modified pages (the “real” data files) typically occurs after the transaction has already committed; logically, concurrency control ensures that no two transactions corrupt each other’s in-memory changes or observe each other’s partial updates.
A transaction is marked committed immediately after it is recorded in the WAL file, i.e. the log record is written safely to disk (the WAL records themselves live on disk, so that if the system crashes, those changes can be recovered or rolled back), and not necessarily after its data pages are written to disk. This means concurrency control techniques like locking act on the in-memory (buffer pool) representation of data, whereas the WAL ensures that once a transaction commits, its changes are not lost (durability), even if the buffer pool contents haven’t yet been written to the main data files on disk.
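This commit ordering can be sketched as a toy in Python. The names `wal`, `buffer_pool`, `data_file`, `commit`, and `checkpoint` are all illustrative assumptions, not any real database API; the point is only that the WAL append happens at commit, while the data-file flush happens later.

```python
# Toy sketch of WAL-before-data-file ordering (illustrative names only).
wal = []          # append-only log; in a real system this is flushed to disk at commit
buffer_pool = {}  # in-memory pages that transactions read and write
data_file = {}    # the "real" data files on disk, updated lazily

def commit(txn_id, key, value):
    buffer_pool[key] = value          # modify the in-memory page first
    wal.append((txn_id, key, value))  # the WAL record reaching disk here...
    return "committed"                # ...is what makes the transaction committed

def checkpoint():
    data_file.update(buffer_pool)     # dirty pages are flushed some time later

status = commit(1, "x", 42)
print(status, data_file)  # committed {} -- durable via the WAL, pages not yet flushed
checkpoint()
print(data_file)          # {'x': 42}
```

Notice that `data_file` is still empty at the moment the transaction reports "committed"; durability at that point rests entirely on the WAL record.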
Ok, so we have discussed everything except the actual question: how? How is concurrency control achieved? Technically, there are three ways of doing concurrency control, with a few variants under each,
Pessimistic. This mechanism assumes conflicts will happen, so it prevents them ahead of time by blocking other transactions from accessing data that might cause conflicts. This is SQL Server’s default concurrency control, implemented using locking.
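A minimal sketch of the pessimistic idea, using a Python `threading.Lock` to stand in for a row lock (the `row_lock` and `transfer` names are illustrative, not a database API): each transaction must acquire the lock before touching the data, so a conflicting transaction simply blocks until the first one finishes.

```python
import threading

balance = {"alice": 100, "bob": 0}
row_lock = threading.Lock()  # stands in for a row/record lock

def transfer(amount):
    with row_lock:                  # acquire before reading or writing
        balance["alice"] -= amount  # no other transaction can interleave here
        balance["bob"] += amount    # lock is released only when the block exits

t1 = threading.Thread(target=transfer, args=(30,))
t2 = threading.Thread(target=transfer, args=(20,))
t1.start(); t2.start()
t1.join(); t2.join()

print(balance)  # {'alice': 50, 'bob': 50}
```

Without the lock, the two threads could interleave their read-modify-write steps and lose an update; with it, the result is the same as running the transfers one after the other.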
Optimistic. This mechanism assumes conflicts are rare, allows transactions to proceed concurrently, and checks for conflicts only at commit time. Some distributed systems (like Google Spanner and CockroachDB) and in-memory systems often use optimistic concurrency control.
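A minimal sketch of validation at commit time (the `store`, `read`, and `commit` names are illustrative, not a real database API): each value carries a version, a transaction reads without taking any lock, and at commit it re-checks that the version it read is still current. If another transaction committed in between, validation fails and the transaction must retry.

```python
store = {"counter": (0, 0)}  # key -> (value, version)

def read(key):
    return store[key]  # snapshot of value and version, no lock taken

def commit(key, new_value, read_version):
    value, version = store[key]
    if version != read_version:  # someone else committed in the meantime
        return False             # conflict detected only at commit time
    store[key] = (new_value, version + 1)
    return True

# Transactions A and B both read version 0 concurrently.
val_a, ver_a = read("counter")
val_b, ver_b = read("counter")

print(commit("counter", val_a + 1, ver_a))  # True  -- A commits first
print(commit("counter", val_b + 1, ver_b))  # False -- B fails validation

# B retries against the fresh state and succeeds.
val_b, ver_b = read("counter")
print(commit("counter", val_b + 1, ver_b))  # True
print(store["counter"])                     # (2, 2)
```

The trade-off is visible here: no transaction ever waits, but a conflicting one pays by redoing its work.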
Multi-version concurrency control. Here the idea is that instead of blocking, readers and writers are allowed to access different versions of the data concurrently. PostgreSQL, Oracle, MySQL InnoDB (since 5.5), and modern NoSQL systems like MongoDB (WiredTiger engine) use multi-version concurrency control.
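A minimal sketch of the multi-version idea (an illustrative toy, not how PostgreSQL or InnoDB store tuples internally): every write appends a new version tagged with a commit timestamp, and a reader sees the newest version that committed at or before its own snapshot timestamp, so readers never block writers and vice versa.

```python
versions = {"row1": [(0, "initial")]}  # key -> list of (commit_ts, value)

def write(key, value, commit_ts):
    versions[key].append((commit_ts, value))  # old versions stay readable

def read(key, snapshot_ts):
    # newest version whose commit timestamp <= the reader's snapshot
    visible = [v for ts, v in versions[key] if ts <= snapshot_ts]
    return visible[-1]

read_ts = 5                              # a reader takes its snapshot at ts=5
write("row1", "updated", commit_ts=10)   # a writer commits later, at ts=10

print(read("row1", snapshot_ts=read_ts))  # initial -- the old snapshot is undisturbed
print(read("row1", snapshot_ts=12))       # updated -- new readers see the new version
```

The writer never waited for the reader, and the reader's view stayed consistent; the cost is keeping old versions around until no snapshot needs them (what PostgreSQL cleans up with vacuuming).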