💾Intro to Database Systems Unit 11 – Transaction Management & Concurrency
Transaction management and concurrency are crucial aspects of database systems. They ensure data integrity and consistency when multiple users access and modify shared data simultaneously. Understanding these concepts is essential for designing reliable and efficient databases.
ACID properties, locking mechanisms, and isolation levels form the foundation of transaction management. These principles help prevent issues like lost updates, dirty reads, and deadlocks while maintaining data consistency. Recovery techniques ensure databases can be restored after failures.
Databases are used to store and manage critical data for businesses and organizations
Multiple users and applications may need to access and modify the same data simultaneously
Transactions ensure data integrity is maintained during concurrent access and system failures
Without proper transaction management, data inconsistencies and errors can occur leading to incorrect results and decisions
Concurrency control mechanisms prevent conflicts between simultaneous transactions accessing shared data
Isolation levels define the degree to which transactions are isolated from each other's effects
Recovery techniques ensure that the database can be restored to a consistent state after a failure
Understanding transaction management and concurrency is crucial for designing and implementing reliable and efficient database systems
Key Concepts
Transactions are a sequence of database operations that are treated as a single unit of work
ACID properties (Atomicity, Consistency, Isolation, Durability) ensure the reliability and integrity of transactions
Concurrency control manages simultaneous access to shared data by multiple transactions
Locking mechanisms (shared locks, exclusive locks) are used to control access to data items
Deadlocks occur when two or more transactions are waiting for each other to release locks
Isolation levels (Read Uncommitted, Read Committed, Repeatable Read, Serializable) define the degree of isolation between transactions
Transaction states (Active, Partially Committed, Committed, Failed, Aborted) represent the different stages of a transaction's lifecycle
Recovery techniques (logging, checkpointing) ensure that the database can be restored to a consistent state after a failure
ACID Properties Explained
Atomicity ensures that a transaction is treated as a single, indivisible unit of work
Either all operations within a transaction are completed successfully, or none of them are
If a transaction fails, all changes made by the transaction are rolled back (undone)
Consistency ensures that a transaction brings the database from one valid state to another
Transactions must follow all defined rules and constraints of the database
Any data written to the database must be valid according to these rules
Isolation ensures that concurrent transactions do not interfere with each other
Each transaction should execute as if it were the only transaction running on the system
Changes made by one transaction should not be visible to other transactions until the transaction is committed
Durability ensures that once a transaction is committed, its changes persist even in the event of a system failure
Committed transactions are permanently stored in the database and can survive system crashes or power outages
Durability is typically achieved through the use of transaction logs and regular backups
Concurrency Issues
Lost Updates occur when two transactions read the same data, modify it, and write it back, causing one transaction's changes to be overwritten
Dirty Reads happen when a transaction reads data that has been modified by another uncommitted transaction
If the uncommitted transaction is rolled back, the reading transaction will have read invalid data
Non-Repeatable Reads occur when a transaction reads the same data twice but gets different results due to modifications made by another transaction
Phantom Reads happen when a transaction re-executes a query and discovers new rows that were not visible in the previous execution due to another transaction's insertions
Deadlocks occur when two or more transactions are waiting for each other to release locks, resulting in a circular dependency
Deadlocks can be resolved by aborting one of the transactions and rolling back its changes
Starvation happens when a transaction is repeatedly denied access to a resource due to other transactions constantly acquiring locks on that resource
Locking Mechanisms
Locking is a concurrency control mechanism used to manage simultaneous access to shared data
Shared Locks (S-Locks) allow multiple transactions to read the same data item simultaneously
Multiple transactions can hold shared locks on the same data item at the same time
Shared locks are compatible with other shared locks but not with exclusive locks
Exclusive Locks (X-Locks) give a single transaction exclusive access to a data item for reading and writing
Only one transaction can hold an exclusive lock on a data item at a time
Exclusive locks are not compatible with any other type of lock (shared or exclusive)
Lock Manager is responsible for granting, denying, and releasing locks on data items
Two-Phase Locking (2PL) is a protocol that ensures serializability by requiring transactions to acquire all necessary locks before releasing any locks
Growing Phase: transactions acquire locks on data items they need to access
Shrinking Phase: transactions release all acquired locks and cannot obtain new locks
Isolation Levels
Read Uncommitted allows transactions to read uncommitted changes made by other transactions
Prone to dirty reads, non-repeatable reads, and phantom reads
Read Committed ensures that transactions only read committed data
Prevents dirty reads but allows non-repeatable reads and phantom reads
Repeatable Read guarantees that repeated reads within a transaction will always return the same result
Prevents dirty reads and non-repeatable reads but allows phantom reads
Serializable is the highest isolation level and ensures that transactions execute as if they were serialized (executed one after another)
Prevents dirty reads, non-repeatable reads, and phantom reads
Achieved through the use of shared and exclusive locks or by using optimistic concurrency control methods
Transaction States
Active: the transaction is currently executing its operations
Partially Committed: the transaction has completed its operations but has not yet been committed
At this stage, the transaction's changes are not yet visible to other transactions
Committed: the transaction has successfully completed, and its changes are permanently stored in the database
Committed transactions are durable and visible to other transactions
Failed: the transaction has encountered an error during execution and cannot proceed
Failed transactions must be rolled back to undo any changes made
Aborted: the transaction has been rolled back due to a failure or a user-initiated abort
All changes made by the aborted transaction are undone, and the database is restored to its previous consistent state
Recovery Techniques
Recovery techniques ensure that the database can be restored to a consistent state after a failure
Logging is a technique used to record all changes made by transactions
Undo Logs record the old values of modified data items, allowing transactions to be rolled back if necessary
Redo Logs record the new values of modified data items, allowing committed transactions to be reapplied after a failure
Checkpointing is a process of creating a consistent snapshot of the database at a specific point in time
Checkpoints are used to minimize the amount of work required to recover from a failure
During recovery, the database can be restored to the most recent checkpoint and then apply any necessary redo log entries
Shadow Paging is a recovery technique that maintains a separate copy (shadow page) of the database pages being modified
Changes are made to the shadow pages, and when the transaction commits, the shadow pages replace the original pages
In case of a failure, the original pages remain unchanged, and the database remains in a consistent state