A Guide to Concurrency Control Techniques in DBMS
Concurrency Control in a Database Management System is a fundamental concept that ensures multiple transactions can occur concurrently without compromising the integrity or consistency of the database. Let's understand more!
Concurrency control in a Database Management System (DBMS) refers to the methods used to manage simultaneous operations so that they do not conflict with each other. In this guide, we look at the main concurrency control techniques in DBMS in detail.
Table of Contents
- What is Concurrency Control in DBMS?
- Why is Concurrency Control Needed?
- Concurrency Control Techniques in DBMS
- Real-Life Example
What is Concurrency Control in DBMS?
Concurrency Control is a crucial Database Management System (DBMS) component. It manages simultaneous operations without them conflicting with each other. The primary aim is maintaining consistency, integrity, and isolation when multiple users or applications access the database simultaneously.
In a multi-user database environment, it's common for numerous users to want to access and modify the database at the same time. This is what we call concurrent execution. Imagine a busy library where multiple librarians are updating book records simultaneously. Just as multiple librarians shouldn't try to update the same record at once, database users shouldn't interfere with each other's operations.
Executing transactions concurrently offers many benefits, like improved system resource utilization and increased throughput. However, these simultaneous transactions mustn't interfere with each other. The ultimate goal is to ensure the database remains consistent and correct. For instance, if two people try to book the last seat on a flight at the exact same moment, the system must ensure that only one person gets the seat.
But concurrent execution can lead to various challenges:
- Lost Updates: Consider two users trying to update the same data item. If both read the item and then write back their own changes, whichever update is written first can be silently overwritten by the second, so one user's update is lost (a minimal sketch of this appears after this list).
- Uncommitted Data: If one user accesses data that another user has updated but not yet committed (finalized), and then the second user decides to abort (cancel) their transaction, the first user has invalid data.
- Inconsistent Retrievals: A transaction reads several values from the database, but another transaction modifies some of those values in the middle of its operation.
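To make the lost update problem concrete, here is a minimal Python sketch of two interleaved read-modify-write sequences on a shared balance (the variable names are purely illustrative):

```python
# Minimal sketch of a lost update: T1 and T2 both read balance = 100,
# each adds 50, and T2's write overwrites T1's (one update is lost).
balance = 100

t1_read = balance        # T1 reads 100
t2_read = balance        # T2 reads 100 before T1 writes back

balance = t1_read + 50   # T1 writes 150
balance = t2_read + 50   # T2 writes 150, discarding T1's update

print(balance)           # 150, although applying both updates should give 200
```

With proper concurrency control, the second transaction would either wait for the first to finish or be forced to re-read the value, so both updates are reflected in the final result.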
To address these challenges, the DBMS employs concurrency control techniques. Think of it like traffic rules. Just as traffic rules ensure vehicles don't collide, concurrency control ensures transactions don't conflict.
Why is Concurrency Control Needed?
Having seen what concurrency control is, we can now identify the main reasons it is needed:
- Ensure Database Consistency: Without concurrency control, simultaneous transactions could interfere with each other, leading to inconsistent database states. Proper concurrency control ensures the database remains consistent even after numerous concurrent transactions.
- Avoid Conflicting Updates: When two transactions attempt to update the same data simultaneously, one update might overwrite the other. Concurrency control ensures that updates don't conflict and cause unintended data loss.
- Prevent Dirty Reads: Without concurrency control, one transaction might read data that another transaction is in the middle of updating (but hasn't finalized). This can lead to inaccurate or "dirty" reads, where the data doesn't reflect the final, committed state.
- Enhance System Efficiency: By managing concurrent access to the database, concurrency control allows multiple transactions to be processed in parallel. This improves system throughput and makes optimal use of resources.
- Protect Transaction Atomicity: For a series of operations within a transaction, it's crucial that all operations succeed (commit) or none do (abort). Concurrency control ensures that transactions are atomic and treated as a single indivisible unit, even when executed concurrently with others.
Concurrency Control Techniques in DBMS
The various concurrency control techniques are:
- Two-Phase Locking Protocol
- Timestamp Ordering Protocol
- Multi-Version Concurrency Control
- Validation Concurrency Control
Let's understand each technique in detail, one by one.
1. Two-Phase Locking Protocol
Two-phase locking (2PL) is a protocol used in database management systems to control concurrency and ensure transactions are executed in a way that preserves the consistency of a database. It's called "two-phase" because, during each transaction, there are two distinct phases: the Growing phase and the Shrinking phase.
Breakdown of the Two-Phase Locking Protocol
- Phases:
- Growing Phase: During this phase, a transaction can obtain (acquire) any number of locks as required but cannot release any. This phase continues until the transaction has acquired all the locks it needs and requests no more.
- Shrinking Phase: Once the transaction releases its first lock, the Shrinking phase starts. During this phase, the transaction can release but not acquire any more locks.
- Lock Point: The moment when the transaction switches from the Growing phase to the Shrinking phase, i.e., the point at which it has acquired its final lock, is termed the lock point.
The primary purpose of the Two-Phase Locking protocol is to ensure conflict-serializability, as the protocol ensures a transaction does not interfere with others in ways that produce inconsistent results.
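As a rough illustration of the two-phase discipline, here is a minimal Python sketch; the TwoPhaseTransaction class and its methods are illustrative assumptions, not a real DBMS API. It simply rejects any lock acquisition attempted after the first release:

```python
# Minimal, single-transaction sketch of the two-phase locking rule:
# acquire freely while growing, but never acquire again after the first release.
class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.locks = set()
        self.shrinking = False  # becomes True once any lock is released

    def acquire(self, item):
        # Growing phase: acquiring is only allowed before any lock is released.
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot acquire '{item}' after releasing a lock (2PL violated)")
        self.locks.add(item)

    def release(self, item):
        # First release starts the Shrinking phase.
        self.shrinking = True
        self.locks.discard(item)

t = TwoPhaseTransaction("T1")
t.acquire("A")
t.acquire("B")   # still growing
t.release("A")   # shrinking begins
t.release("B")
# t.acquire("C") # would raise: acquisitions are forbidden after the lock point
```

A real lock manager would also distinguish shared and exclusive lock modes and block conflicting requests from other transactions; this sketch only enforces the growing/shrinking rule within a single transaction.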
2. Timestamp Ordering Protocol
The Timestamp Ordering Protocol is a concurrency control method used in database management systems to maintain the serializability of transactions. It assigns each transaction a timestamp that fixes its order relative to other transactions; instead of using locks, the protocol enforces this order on every read and write.
Breakdown of the Timestamp Ordering Protocol
- Read Timestamp (RTS):
- This is the timestamp of the most recent transaction that has read the data item.
- Every time a data item X is read by a transaction T with timestamp TS, the RTS of X is updated to TS if TS is more recent than the current RTS of X.
- Write Timestamp (WTS):
- This is the timestamp of the most recent transaction that has written or updated the data item.
- Whenever a data item X is written by a transaction T with timestamp TS, the WTS of X is updated to TS if TS is more recent than the current WTS of X.
The timestamp ordering protocol uses these timestamps to determine whether a transaction's request to read or write a data item should be granted; a request that would violate timestamp order is rejected and the transaction is rolled back. Because transactions never wait for locks, no wait-for cycles can form, and deadlocks are therefore impossible.
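The basic checks can be sketched in Python as follows; the DataItem class and function names are illustrative assumptions rather than a standard API:

```python
# Minimal sketch of the basic timestamp-ordering checks.
# Each data item tracks the read and write timestamps of the transactions that touched it.
class DataItem:
    def __init__(self):
        self.rts = 0   # read timestamp: most recent transaction that read the item
        self.wts = 0   # write timestamp: most recent transaction that wrote the item

def read(item, ts):
    # A transaction may not read a value already written by a younger transaction.
    if ts < item.wts:
        raise RuntimeError(f"abort: T{ts} tried to read an item written at {item.wts}")
    item.rts = max(item.rts, ts)

def write(item, ts):
    # A transaction may not overwrite data already read or written by a younger transaction.
    if ts < item.rts or ts < item.wts:
        raise RuntimeError(f"abort: T{ts} write conflicts (RTS={item.rts}, WTS={item.wts})")
    item.wts = ts

x = DataItem()
read(x, ts=5)    # allowed, RTS(X) becomes 5
write(x, ts=7)   # allowed, WTS(X) becomes 7
# write(x, ts=6) # would abort: a younger transaction (ts=7) has already written X
```

In a real system, the aborted transaction is typically restarted with a new, larger timestamp.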
3. Multi-Version Concurrency Control
Multi-Version Concurrency Control (MVCC) is a technique used in database management systems to handle concurrent operations without conflicts by keeping multiple versions of a data item. Instead of locking items for write operations (which can reduce concurrency and lead to bottlenecks or deadlocks), MVCC creates a separate version of the data item being modified.
Breakdown of Multi-Version Concurrency Control (MVCC)
- Multiple Versions: When a transaction modifies a data item, instead of changing the item in place, it creates a new version of that item. This means that multiple versions of a database object can exist simultaneously.
- Reads aren't Blocked: One of the significant advantages of MVCC is that read operations don't get blocked by write operations. When a transaction reads a data item, it sees a version of that item consistent with the last time it began a transaction or issued a read, even if other transactions are currently modifying that item.
- Timestamps or Transaction IDs: Each version of a data item is tagged with a unique identifier, typically a timestamp or a transaction ID. This identifier determines which version of the data item a transaction sees when it accesses that item. A transaction will always see its own writes, even if they are uncommitted.
- Garbage Collection: As transactions create newer versions of data items, older versions can become obsolete. There's typically a background process that cleans up these old versions, a procedure often referred to as "garbage collection."
- Conflict Resolution: If two transactions try to modify the same data item concurrently, the system will need a way to resolve this. Different systems have different methods for conflict resolution. A common one is that the first transaction to commit will succeed, and the other transaction will be rolled back or will need to resolve the conflict before proceeding.
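To illustrate the version-selection idea, here is a minimal Python sketch; the VersionedItem class is an illustrative assumption, not how any particular engine stores versions. Each write appends a new version tagged with a commit timestamp, and a reader sees the newest version committed at or before its snapshot timestamp:

```python
# Minimal sketch of MVCC version selection: writers append versions,
# readers pick the latest version visible to their snapshot.
class VersionedItem:
    def __init__(self, initial_value):
        self.versions = [(0, initial_value)]   # list of (commit_ts, value), in commit order

    def write(self, commit_ts, value):
        # Writers never overwrite in place; they append a new version.
        self.versions.append((commit_ts, value))

    def read(self, snapshot_ts):
        # Return the newest version committed at or before this snapshot.
        visible = [value for ts, value in self.versions if ts <= snapshot_ts]
        return visible[-1]

seat = VersionedItem("available")
seat.write(commit_ts=10, value="booked")
print(seat.read(snapshot_ts=5))    # "available" -- old snapshot; the read is never blocked
print(seat.read(snapshot_ts=15))   # "booked"
```

Old versions that no active snapshot can still see are exactly what the garbage-collection step eventually removes.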
4. Validation Concurrency Control
Validation (or Optimistic) Concurrency Control (VCC) is an advanced database concurrency control technique. Instead of acquiring locks on data items, as is done in most traditional (pessimistic) concurrency control techniques, validation concurrency control allows transactions to work on private copies of database items and validates the transactions only at the time of commit.
The central idea behind optimistic concurrency control is that conflicts between transactions are rare, and it's better to let transactions run to completion and only check for conflicts at commit time.
Breakdown of Validation Concurrency Control (VCC):
- Phases: Each transaction in VCC goes through three distinct phases:
- Read Phase: The transaction reads values from the database and makes changes to its private copy without affecting the actual database.
- Validation Phase: Before committing, the transaction checks if the changes made to its private copy can be safely written to the database without causing any conflicts.
- Write Phase: If validation succeeds, the transaction updates the actual database with the changes made to its private copy.
- Validation Criteria: During the validation phase, the system checks for potential conflicts with other transactions. If a conflict is found, the system can either roll back the transaction or delay it for a retry, depending on the specific strategy implemented.
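Here is a minimal Python sketch of the three phases, assuming a simple per-item version counter for validation; the OptimisticTxn class and the version table are illustrative assumptions, not a standard implementation:

```python
# Minimal sketch of optimistic (validation-based) concurrency control:
# work on a private copy, validate at commit time, then write.
db = {"seat_42": "available"}
version = {"seat_42": 1}          # bumped on every committed write

class OptimisticTxn:
    def __init__(self):
        self.read_versions = {}   # versions observed during the read phase
        self.private = {}         # local, uncommitted writes

    def read(self, key):
        # Read phase: remember the version we saw; prefer our own uncommitted write.
        self.read_versions[key] = version[key]
        return self.private.get(key, db[key])

    def write(self, key, value):
        self.private[key] = value  # only the private copy changes

    def commit(self):
        # Validation phase: fail if anything we read has since been overwritten.
        for key, seen in self.read_versions.items():
            if version[key] != seen:
                return False       # conflict: caller rolls back or retries
        # Write phase: apply the private copy to the database.
        for key, value in self.private.items():
            db[key] = value
            version[key] += 1
        return True

t = OptimisticTxn()
if t.read("seat_42") == "available":
    t.write("seat_42", "booked")
print(t.commit())                  # True if no concurrent commit touched seat_42
```

Because nothing is locked during the read phase, throughput stays high when conflicts are rare; when validation fails, the whole transaction must be redone, which is why this approach suits low-contention workloads.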
Real-Life Example
Scenario: A world-famous band, "The Algorithmics," is about to release tickets for their farewell concert through the online ticketing platform EventBriteMax. Given their massive fan base, the ticketing system is expected to face a surge in access requests.
EventBriteMax must ensure that ticket sales are processed smoothly, without double bookings or system failures.
1. Two-Phase Locking Protocol (2PL):
- Usage: Mainly for premium ticket pre-sales to fan club members. These sales occur a day before the general ticket release.
- Real-Life Example: When a fan club member logs in to buy a ticket, the system uses 2PL. It locks the specific seat they choose during the transaction. Once the transaction completes, the lock is released. This ensures that no two fan club members can book the same seat at the same time.
2. Timestamp Ordering Protocol:
- Usage: For general ticket sales.
- Real-Life Example: As thousands rush to book their tickets, each transaction gets a timestamp. If two fans try to book the same seat simultaneously, the one with the earlier timestamp gets priority. The other fan receives a message suggesting alternative seats.
3. Multi-Version Concurrency Control (MVCC):
- Usage: Implemented in the mobile app version of the ticketing platform.
- Real-Life Example: Fans using the mobile app see multiple versions of the seating chart. When a fan selects a seat, they're essentially choosing from a specific version of the seating database. If their choice conflicts with a completed transaction, the system offers them the next best seat based on the latest version of the database. This ensures a smooth mobile user experience without frequent transactional conflicts.
4. Validation Concurrency Control:
- Usage: For group bookings where multiple seats are booked in a single transaction.
- Real-Life Example: A group of friends tries to book 10 seats together. They choose their seats and proceed to payment. Before finalizing, the system validates that all 10 seats are still available (i.e., no seat was booked by another user in the meantime). If there's a conflict, the group is prompted to choose a different set of seats. If not, their booking is confirmed.
The concert ticket sales go off without a hitch. Fans rave about the smooth experience, even with such high demand. Behind the scenes, EventBriteMax's effective implementation of the four concurrency control protocols played a crucial role in ensuring that every fan had a fair chance to purchase their ticket and no seats were double-booked. The Algorithmics go on to have a fantastic farewell concert, with not a single problem in the ticketing process.
Conclusion
Concurrency control techniques in Database Management Systems (DBMS) are pivotal for maintaining data integrity, consistency, and reliability in multi-user database environments. These methods keep multiple transactions from interfering with one another, avoiding data inconsistencies and conflicting updates.
FAQs
What is Concurrency Control in DBMS?
Concurrency control in DBMS is a method used to manage simultaneous operations on the database without letting them interfere with each other. It ensures the consistency of the database in a multi-user environment. This control is crucial for maintaining the integrity of the database when multiple transactions are executed concurrently.
What are the main techniques used in Concurrency Control?
The primary techniques used in concurrency control include:
- Two-Phase Locking Protocol
- Timestamp Ordering Protocol
- Multi-Version Concurrency Control
- Validation Concurrency Control
What is the Two-Phase Locking Protocol?
The two-phase locking protocol in DBMS is a locking scheme that divides the transaction into two phases: the growing phase, where locks are acquired but not released, and the shrinking phase, where locks are released but not acquired. This protocol ensures serializability of transactions but can lead to deadlocks.
How does Deadlock occur in Concurrency Control, and how is it handled?
Deadlock occurs when two or more transactions are waiting indefinitely for one another to release locks. It is a common issue in lock-based concurrency control. Deadlock handling strategies include:
- Deadlock Prevention: Modify the system's operation to prevent the possibility of deadlock.
- Deadlock Detection and Recovery: Allow deadlocks to occur, detect them through specific algorithms (for example, by finding a cycle in the wait-for graph, as sketched after this list), and then recover by aborting one or more transactions.
- Deadlock Avoidance: Use additional information about transactions to decide whether or not a lock request should be granted to avoid deadlocks.
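As an illustration of the detection approach, here is a minimal Python sketch that looks for a cycle in a wait-for graph; the graph representation and function name are assumptions made for this example:

```python
# Minimal sketch of deadlock detection on a wait-for graph:
# an edge T1 -> T2 means "T1 is waiting for a lock held by T2"; a cycle means deadlock.
def has_deadlock(wait_for):
    visited, on_path = set(), set()

    def dfs(txn):
        if txn in on_path:          # revisiting a node on the current path => cycle
            return True
        if txn in visited:
            return False
        visited.add(txn)
        on_path.add(txn)
        if any(dfs(nxt) for nxt in wait_for.get(txn, [])):
            return True
        on_path.remove(txn)
        return False

    return any(dfs(txn) for txn in wait_for)

print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))   # True: T1 and T2 wait on each other
print(has_deadlock({"T1": ["T2"], "T2": []}))       # False: no cycle
```

If a cycle is found, the DBMS typically aborts one transaction in the cycle (the victim) to break the deadlock and lets the others proceed.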
What are the challenges in Concurrency Control?
The main challenges in concurrency control are:
- Maintaining Isolation: Ensuring that concurrently running transactions do not interfere with each other.
- Performance Overhead: Implementing concurrency control can lead to increased system overhead and complexity.
- Deadlocks: Managing and resolving deadlocks without significantly impacting system performance.
- Scalability: Ensuring the concurrency control mechanism scales effectively with an increase in the number of concurrent transactions.