Upload
ella-hopkins
View
212
Download
0
Embed Size (px)
Citation preview
Transaction Management
Reading: CB, Ch. 22
Dept of Computing Science, University of Aberdeen 2
In this lecture you will learn
• the problems of concurrency in multi-user DBMSs
• why the concept of a transaction is useful
• how locks can be used to serialise transactions
• the concept of deadlocks and how they can be resolved
• how DBMSs roll back transactions after software or hardware failures
Dept of Computing Science, University of Aberdeen 3
We are here!
Dept of Computing Science, University of Aberdeen 4
Transaction Management Overview
• Objectives:– schedule queries from multiple simultaneous
users efficiently– keep the database in a consistent state
• Transaction management involves:– performing “logical units of work” (defn of
transactions)– controlling concurrency - stop user tasks
interfering– resolving conflicts - e.g. simultaneous update
attempts– recovering from errors - restore DB to
consistent state
Dept of Computing Science, University of Aberdeen 5
The “ACID” RequirementsFor a Transaction
• A transaction should have certain well-defined properties:
• Atomicity - each unit of work is indivisible; “all-or-nothing” (transactions that don't complete must be undone or “rolled-back”)
• Consistency - a transaction transforms the database from one consistent state into another (intermediates may be inconsistent)
• Isolation - each transaction effectively executes independently - one transaction should not see the inconsistent/incomplete state of another transaction
• Durability - once a transaction is complete, its effects cannot be undone or lost (it can only be “undone” with a compensating transaction)
Dept of Computing Science, University of Aberdeen 6
How Are Transactions Defined?
In SQL, by default, each SQL statement is treated as atransaction (even if it affects multiple rows or tables):
UPDATE Staff SET Salary = 1.1 * Salary;
This is one atomic unit of work on the database.
Multiple statements may be grouped together into a singletransaction. Example: A new member of staff takes over aproperty for rent:
BEGIN TRANSACTIONUPDATE PropertyForRentSET StaffNo = 'SN99'WHERE StaffNo = 'SG37';DELETE FROM Staff WHERE StaffNo = 'SG37';COMMIT TRANSACTION
Dept of Computing Science, University of Aberdeen 7
Concurrent TransactionsThe Lost Update Problem
• Suppose an account holds $100. If T1 deposits $100 and T2 withdraws $10, the new balance should be $190. With concurrent transactions, we could get:
• T1: UPDATE Account SET Balance = Balance + 100;• T2: UPDATE Account SET Balance = Balance - 10;
Time T1 T2 Balance
t1 read(bal) $100
t2 bal += 100 read(bal) $100
t3 write(bal) bal -=10 $200
t4 write(bal) $90
•How did $100 vanish? This is the lost update problem
Dept of Computing Science, University of Aberdeen 8
Serialising Transactions
• Clearly, one solution would be to serialise all transactions:– make first transaction finish before next one
starts• However, this would not be efficient on
multi-user systems:– Only need to serialise transactions that refer
to common tables– Not all transactions need serialisation (e.g.
print a mailing list)– Some transactions might benefit from running
simultaneously (e.g. if they both read the same tables)
Dept of Computing Science, University of Aberdeen 9
Controlling Concurrency With Locks
• Locks may be used to serialise only those parts of a transaction that need it...
• Locks may have different levels of granularity:– Table locks - easy to implement, not so efficient– Row locks - more complicated, better performance– Page locks - used by the Memory Manager (not here)
• Transactions can cooperate by using locks to indicate their intention:– Read (shared) - want to read an object– Write (exclusive) - want to read and write an object
Dept of Computing Science, University of Aberdeen 10
The Rules for Using Locks
• Suppose two transactions want to access a given row...
T1 holds No lock Read Write
T2 wants
Read Read Read
Then T2 Acquires
Acquires
Waits
T1 holds No lock Read Write
T2 wants
Write Write Write
Then T2 Acquires
Waits Waits
Dept of Computing Science, University of Aberdeen 11
Preventing Lost Updates With Locks
Time T1 T2 Balance
t1 write_lock(bal) $100
t2 read(bal) write_lock(bal) $100
t3 bal += 100 wait $100
t4 write(bal) wait $200
t5 unlock(bal) wait $200
t6 read(bal) $200
t7 bal -= 10 $200
t8 write(bal) $190
t9 unlock(bal) $190
•Note - if T2 started first, T1 would WAIT
Dept of Computing Science, University of Aberdeen 12
Concurrent TransactionsInconsistent Analysis Problem
• Often, a single transaction may modify multiple rows...
• What if two transactions execute at the same time ?– User1 (T1): UPDATE Salary ...– User2 (T2): SELECT SUM(Salary) ...
• Depending on the timing, User2 might see:– Sum of all old salaries– Sum of all new salaries– Sum of some old + some new salaries (problem!!)
• This is called the inconsistent analysis problem• Clearly, multiple locks are required...
Dept of Computing Science, University of Aberdeen 13
Two-Phase Locking
• What if a transaction requires more than one lock?• The transaction should:
– acquire the locks as it needs them ...– but only release the locks at the end of the transaction
• This is two-phase locking - So called because there is:– a growing phase - more & more locks are acquired– a shrinking phase - the locks are finally released
• Two-phase locking solves inconsistent analysis problem.
• Can you work out why??• Hint: consider read & write lock waiting rules...
Dept of Computing Science, University of Aberdeen 14
Deadlocks
• Two-phase locking ensures serialisability, but it cannot prevent deadlocks
• Example: Suppose a pair of transactions are in the growing phase, and both need write-locks on objects A and B:
Time T1 T2
t1 Acquire A Acquire B
t2 Wait for B Wait for A
T3 Deadlock Deadlock
As deadlocks are relatively rare, most DBMSs allow them to occur rather than attempt to prevent them...
Dept of Computing Science, University of Aberdeen 15
Detecting Deadlocks
• Deadlocks can be detected if the Lock Manager maintains a transaction dependency graph, which is sometimes called a wait-for graph (WFG)...
• The dependency graph contains:– a node for each transaction T– an arc Ta Tb for each dependency
T1 T2 T3
T4
•T1 T2 means T1 is waiting for a resource held by T2• A cycle in the graph indicates deadlock...
Dept of Computing Science, University of Aberdeen 16
Deadlock Resolution
• If the DBMS detects deadlock, it picks a victim transaction:– the victim is killed - rolled-back and re-
scheduled– the other transaction proceeds...
• Strategies for picking the victim include:– pick youngest, oldest, or random transaction– consider amount done (or to do) by each
transaction– pick deadlocked node in WFG with the most
dependencies
Dept of Computing Science, University of Aberdeen 17
The Journal File andRolling Back Transactions
• Usually, a log of each step of transaction is written to a special journal file. Each “record” of the journal file contains (in order of time):– a transaction ID & timestamp– a before-image (if the operation is an update or
delete)– an after-image (if the operation is an update or insert)
• Periodically, the DBMS flushes all memory to disc & writes a checkpoint record. The Journal file can then be used for:– rolling back transactions– system error recovery (roll back to last checkpoint)
Dept of Computing Science, University of Aberdeen 18
Specifying Locking Modes in SQL
• ANSI SQL (and some storage engines in MySQL Server) supports several locking modes. These are specified according to the degree of concurrency/isolation required. Syntax:– SET TRANSACTION ISOLATION LEVEL level
• where level is one of:– SERIALIZABLE - essentially full table locking– REPEATABLE READ - like table locking (but another T
may INSERT)– READ COMMITTED - like row locking (the default)– READ UNCOMMITTED - no locking (might see “phantom”
rows)
Dept of Computing Science, University of Aberdeen 19
Summary
• Transaction Management is largely concerned with implementing the “ACID” requirements:– Defining logical units of work - atomic transactions– Using rollback and transaction scheduling to maintain
consistency– Using locks to stop transactions interfering - isolation– Using journaling to recover from system errors -
durability
• In SQL, its up to the programmer to consider & specify:– how much concurrency/isolation is required...