Intro to Database Systems

💾Intro to Database Systems Unit 11 – Transaction Management & Concurrency

Transaction management and concurrency are crucial aspects of database systems. They ensure data integrity and consistency when multiple users access and modify shared data simultaneously. Understanding these concepts is essential for designing reliable and efficient databases. ACID properties, locking mechanisms, and isolation levels form the foundation of transaction management. These principles help prevent issues like lost updates, dirty reads, and deadlocks while maintaining data consistency. Recovery techniques ensure databases can be restored after failures.

What's the Big Deal?

  • Databases are used to store and manage critical data for businesses and organizations
  • Multiple users and applications may need to access and modify the same data simultaneously
  • Transactions ensure data integrity is maintained during concurrent access and system failures
  • Without proper transaction management, data inconsistencies and errors can occur leading to incorrect results and decisions
  • Concurrency control mechanisms prevent conflicts between simultaneous transactions accessing shared data
  • Isolation levels define the degree to which transactions are isolated from each other's effects
  • Recovery techniques ensure that the database can be restored to a consistent state after a failure
  • Understanding transaction management and concurrency is crucial for designing and implementing reliable and efficient database systems

Key Concepts

  • Transactions are a sequence of database operations that are treated as a single unit of work
  • ACID properties (Atomicity, Consistency, Isolation, Durability) ensure the reliability and integrity of transactions
  • Concurrency control manages simultaneous access to shared data by multiple transactions
  • Locking mechanisms (shared locks, exclusive locks) are used to control access to data items
  • Deadlocks occur when two or more transactions are waiting for each other to release locks
  • Isolation levels (Read Uncommitted, Read Committed, Repeatable Read, Serializable) define the degree of isolation between transactions
  • Transaction states (Active, Partially Committed, Committed, Failed, Aborted) represent the different stages of a transaction's lifecycle
  • Recovery techniques (logging, checkpointing) ensure that the database can be restored to a consistent state after a failure

ACID Properties Explained

  • Atomicity ensures that a transaction is treated as a single, indivisible unit of work
    • Either all operations within a transaction are completed successfully, or none of them are
    • If a transaction fails, all changes made by the transaction are rolled back (undone)
  • Consistency ensures that a transaction brings the database from one valid state to another
    • Transactions must follow all defined rules and constraints of the database
    • Any data written to the database must be valid according to these rules
  • Isolation ensures that concurrent transactions do not interfere with each other
    • Each transaction should execute as if it were the only transaction running on the system
    • Changes made by one transaction should not be visible to other transactions until the transaction is committed
  • Durability ensures that once a transaction is committed, its changes persist even in the event of a system failure
    • Committed transactions are permanently stored in the database and can survive system crashes or power outages
    • Durability is typically achieved through the use of transaction logs and regular backups

Concurrency Issues

  • Lost Updates occur when two transactions read the same data, modify it, and write it back, causing one transaction's changes to be overwritten
  • Dirty Reads happen when a transaction reads data that has been modified by another uncommitted transaction
    • If the uncommitted transaction is rolled back, the reading transaction will have read invalid data
  • Non-Repeatable Reads occur when a transaction reads the same data twice but gets different results due to modifications made by another transaction
  • Phantom Reads happen when a transaction re-executes a query and discovers new rows that were not visible in the previous execution due to another transaction's insertions
  • Deadlocks occur when two or more transactions are waiting for each other to release locks, resulting in a circular dependency
    • Deadlocks can be resolved by aborting one of the transactions and rolling back its changes
  • Starvation happens when a transaction is repeatedly denied access to a resource due to other transactions constantly acquiring locks on that resource

Locking Mechanisms

  • Locking is a concurrency control mechanism used to manage simultaneous access to shared data
  • Shared Locks (S-Locks) allow multiple transactions to read the same data item simultaneously
    • Multiple transactions can hold shared locks on the same data item at the same time
    • Shared locks are compatible with other shared locks but not with exclusive locks
  • Exclusive Locks (X-Locks) give a single transaction exclusive access to a data item for reading and writing
    • Only one transaction can hold an exclusive lock on a data item at a time
    • Exclusive locks are not compatible with any other type of lock (shared or exclusive)
  • Lock Manager is responsible for granting, denying, and releasing locks on data items
  • Two-Phase Locking (2PL) is a protocol that ensures serializability by requiring transactions to acquire all necessary locks before releasing any locks
    • Growing Phase: transactions acquire locks on data items they need to access
    • Shrinking Phase: transactions release all acquired locks and cannot obtain new locks

Isolation Levels

  • Read Uncommitted allows transactions to read uncommitted changes made by other transactions
    • Prone to dirty reads, non-repeatable reads, and phantom reads
  • Read Committed ensures that transactions only read committed data
    • Prevents dirty reads but allows non-repeatable reads and phantom reads
  • Repeatable Read guarantees that repeated reads within a transaction will always return the same result
    • Prevents dirty reads and non-repeatable reads but allows phantom reads
  • Serializable is the highest isolation level and ensures that transactions execute as if they were serialized (executed one after another)
    • Prevents dirty reads, non-repeatable reads, and phantom reads
    • Achieved through the use of shared and exclusive locks or by using optimistic concurrency control methods

Transaction States

  • Active: the transaction is currently executing its operations
  • Partially Committed: the transaction has completed its operations but has not yet been committed
    • At this stage, the transaction's changes are not yet visible to other transactions
  • Committed: the transaction has successfully completed, and its changes are permanently stored in the database
    • Committed transactions are durable and visible to other transactions
  • Failed: the transaction has encountered an error during execution and cannot proceed
    • Failed transactions must be rolled back to undo any changes made
  • Aborted: the transaction has been rolled back due to a failure or a user-initiated abort
    • All changes made by the aborted transaction are undone, and the database is restored to its previous consistent state

Recovery Techniques

  • Recovery techniques ensure that the database can be restored to a consistent state after a failure
  • Logging is a technique used to record all changes made by transactions
    • Undo Logs record the old values of modified data items, allowing transactions to be rolled back if necessary
    • Redo Logs record the new values of modified data items, allowing committed transactions to be reapplied after a failure
  • Checkpointing is a process of creating a consistent snapshot of the database at a specific point in time
    • Checkpoints are used to minimize the amount of work required to recover from a failure
    • During recovery, the database can be restored to the most recent checkpoint and then apply any necessary redo log entries
  • Shadow Paging is a recovery technique that maintains a separate copy (shadow page) of the database pages being modified
    • Changes are made to the shadow pages, and when the transaction commits, the shadow pages replace the original pages
    • In case of a failure, the original pages remain unchanged, and the database remains in a consistent state


© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.

© 2024 Fiveable Inc. All rights reserved.
AP® and SAT® are trademarks registered by the College Board, which is not affiliated with, and does not endorse this website.