CS346: Advanced Databases
Concurrency Control

Graham Cormode [email protected]


Chapter: “Concurrency Control Techniques” in Elmasri and Navathe

Why?

• Concurrency is a big issue in distributed systems, finance, telecoms…
• Recommended by the DCS advisory board as a vital topic
• Programming contest: http://db.in.tum.de/sigmod15contest/

Concurrency Control Protocols

• Concurrency control protocols: rules to ensure serializability
  – Enforce the isolation property of transaction processing
  – Provide database consistency when processing transactions
  – Resolve conflicts between transactions
  – Decide which transaction prevails in a conflict

• Several different types of protocol
  – Two-phase locking: one transaction can access a data item at a time
  – Timestamps: used to determine which version of an item to use
  – Multi-version: allow multiple versions of an item to exist
  – Optimistic: plough on ahead and roll back if needed

Locks and Locking

• Associate a lock with each data item to limit access
• A lock is a variable that describes the state of the item
  – Simplest form: locked, or available
• Use locks to control access
  – Only the transaction “holding” the lock can edit the item
• Rely on operating system/processor support to manage locks
  – Ensure that only one transaction can grab a lock at a time

Binary Locks

• Binary locks have two states: locked and unlocked (available)
  – Denote locked = 1, unlocked = 0
  – lock(X) gives the current state of the lock for item X, 0 or 1
• If lock(X) = 0, the item can be accessed on request (lock_item(X))
  – lock(X) is set to 1 to indicate it is locked
• If lock(X) = 1, then X cannot be accessed by any other transaction
  – Must wait for the lock to be released, unlock_item(X)
• A binary lock enforces mutual exclusion on the item X
  – Rely on a “lock manager” to moderate access to lock(X)
• lock/unlock must be indivisible units: cannot be preempted
  – Implemented with a simple bit per item, plus record of lock holder
  – Keep a list of locked items in a lock table

Enforcing locks

• For locks to be effective, must enforce the following rules:
  – A transaction T must hold lock(X) before any read or write to X
  – T must unlock_item(X) after it is done reading/writing X
  – T should not request lock(X) if it already holds it!
  – T cannot unlock_item(X) if it doesn’t hold the lock on X!
• These can be enforced by the lock manager of the DBMS
  – Manages the locks, tracks who has which locks
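The rules above can be sketched as a toy lock manager. This is only an illustration: the class name `BinaryLockManager`, the `check_access` helper and the convention of returning False for "must wait" are assumptions, and the sketch is single-threaded, so it does not model the indivisibility of lock/unlock.

```python
class BinaryLockManager:
    """Toy lock table: maps each locked item to the transaction holding it."""

    def __init__(self):
        self.holder = {}  # item -> transaction id; absent means lock(X) = 0

    def lock_item(self, txn, item):
        """Return True if txn acquires the lock, False if it must wait."""
        if self.holder.get(item) == txn:
            raise RuntimeError(f"{txn} already holds lock({item})")
        if item in self.holder:
            return False  # lock(X) = 1: held by another transaction
        self.holder[item] = txn  # set lock(X) to 1 and record the holder
        return True

    def unlock_item(self, txn, item):
        """Release the lock; only the current holder may do this."""
        if self.holder.get(item) != txn:
            raise RuntimeError(f"{txn} does not hold lock({item})")
        del self.holder[item]

    def check_access(self, txn, item):
        """A read or write of item is legal only while txn holds its lock."""
        return self.holder.get(item) == txn


manager = BinaryLockManager()
print(manager.lock_item("T1", "X"))  # T1 acquires the lock
print(manager.lock_item("T2", "X"))  # T2 would have to wait
```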

Shared/Exclusive Locks

• Binary locks can be too restrictive
  – Only one transaction can access the data item at a time
• Can allow multiple transactions access to X, if they only read it
  – Shared access to X [read access]
• Still only one transaction can access X if it will write it
  – Exclusive access to X [write access]
• Three states: read-locked, write-locked, unlocked
  – Operations: read_lock(X), write_lock(X), unlock(X)

Enforcing shared locks

• Lock manager’s work is more complex now
  – Track how many transactions hold a read lock on an item
  – Lock table may include entries like:
    <Item_name, Lock_state, No_of_reads, Transaction(s)>
• Must enforce more rules:
  1. A transaction T must hold a lock on X before any read of X
  2. T must write_lock(X) before any write operation to X
  3. T must unlock_item(X) after it is done reading/writing X
  4. T should not request read_lock(X) if it already holds a lock!
  5. T should not request write_lock(X) if it already holds a lock!
     (May later relax: upgrade a read lock to a write lock)
  6. T cannot unlock_item(X) if it doesn’t hold the lock on X!
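A minimal sketch of a shared/exclusive lock table, loosely modelled on the entry format <Item_name, Lock_state, No_of_reads, Transaction(s)>. The class and method names are hypothetical, a False return value stands for "must wait", and rules 4 to 6 are enforced by raising an error.

```python
class SharedExclusiveLockTable:
    """Toy lock table: item -> (lock_state, set of holding transactions)."""

    def __init__(self):
        self.table = {}

    def _entry(self, item):
        return self.table.get(item, ("unlocked", set()))

    def read_lock(self, txn, item):
        state, holders = self._entry(item)
        if txn in holders:
            raise RuntimeError("rule 4: txn already holds a lock on item")
        if state == "write-locked":
            return False  # must wait for the writer
        self.table[item] = ("read-locked", holders | {txn})
        return True

    def write_lock(self, txn, item):
        state, holders = self._entry(item)
        if txn in holders:
            raise RuntimeError("rule 5: txn already holds a lock on item")
        if state != "unlocked":
            return False  # must wait for all readers or the writer
        self.table[item] = ("write-locked", {txn})
        return True

    def unlock(self, txn, item):
        state, holders = self._entry(item)
        if txn not in holders:
            raise RuntimeError("rule 6: txn does not hold a lock on item")
        remaining = holders - {txn}
        if remaining:
            self.table[item] = (state, remaining)  # other readers remain
        else:
            del self.table[item]  # item becomes unlocked

    def no_of_reads(self, item):
        state, holders = self._entry(item)
        return len(holders) if state == "read-locked" else 0


locks = SharedExclusiveLockTable()
locks.read_lock("T1", "X")
locks.read_lock("T2", "X")
print(locks.no_of_reads("X"), locks.write_lock("T3", "X"))
```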

Lock conversion

• Sometimes we want to relax conditions 4 and 5
  – Allow request for a lock on X when some lock is already held
• Lock conversion: change the type of the lock held
  – Upgrade: write_lock(X) when a read_lock is held
    Succeeds if only 1 read_lock is held, else must wait
  – Downgrade: read_lock(X) when a write_lock is held
    Should always be permitted
  – Need to update the lock table to reflect the change
• Lock conversion is also possible using primitives:
  – downgrade = unlock + read_lock
  – But, someone might ‘steal’ the lock between unlock and read_lock
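Lock conversion can be sketched as two operations that change the lock table entry in place, so that no other transaction can take the lock in between. The representation (a dict from item to a state and a set of holders) and the function names are assumptions for illustration.

```python
def upgrade(lock_table, txn, item):
    """Convert txn's read lock on item to a write lock.

    Returns True on success, False if other readers exist (txn must wait).
    """
    state, holders = lock_table[item]
    if state != "read-locked" or txn not in holders:
        raise RuntimeError("upgrade requires txn to hold a read lock")
    if holders != {txn}:
        return False  # more than one read lock is held
    lock_table[item] = ("write-locked", {txn})
    return True


def downgrade(lock_table, txn, item):
    """Convert txn's write lock on item to a read lock (always permitted)."""
    state, holders = lock_table[item]
    if state != "write-locked" or holders != {txn}:
        raise RuntimeError("downgrade requires txn to hold the write lock")
    lock_table[item] = ("read-locked", {txn})


table = {"X": ("read-locked", {"T1", "T2"})}
print(upgrade(table, "T1", "X"))  # T2 also reads X, so T1 must wait
```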

Guaranteeing serializability

• Locks alone do not guarantee serializability
  – Transactions that look reasonable can still have (subtle) bugs
• Example transactions:
• The values of X and Y are used after their locks are released
  – Need a protocol to govern the use of locks

Deadlock and Starvation

• Deadlock is when each transaction T is waiting for an item that is locked by some other transaction T’
  – Because no one can move, nothing happens
  – Example: T’1 wants lock(X), T’2 wants lock(Y)
• Starvation: a transaction T is unable to proceed for an indefinite time while others go ahead normally
  – Can occur if T is waiting for a lock on a popular item and everyone else gets it first; or if T keeps getting chosen as the victim to abort
• Starvation can be fixed by ensuring everyone gets a chance
  – Fix 1: use a “fair” queuing scheme, e.g. first-come, first-served
  – Fix 2: assign priorities, increase priority of old transactions

Two-Phase Locking

• A transaction follows two-phase locking (2PL) if all lock operations precede the first unlock operation in the transaction
  – Hence two phases: a 1st growing phase, then a 2nd shrinking phase
  – No locks released in growing phase, none taken in shrinking phase
• If lock conversion is allowed, then:
  – Only upgrades in growing phase
  – Only downgrades in shrinking phase
• Previous example violates 2PL
  – T1 unlocks Y before X is locked
  – T2 unlocks X before Y is locked
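The 2PL condition is easy to check mechanically for a single transaction. The sketch below assumes a transaction is given as a list of (operation, item) pairs and ignores lock conversion. The operation sequences are illustrative assumptions, chosen only to be consistent with the remark that T1 unlocks Y before X is locked.

```python
def follows_2pl(operations):
    """Return True if no lock operation comes after the first unlock."""
    shrinking = False
    for op, _item in operations:
        if op == "unlock":
            shrinking = True  # the shrinking phase has begun
        elif op in ("read_lock", "write_lock") and shrinking:
            return False  # a lock is taken in the shrinking phase
    return True


# Hypothetical T1: unlocks Y before locking X, so it violates 2PL.
t1 = [("read_lock", "Y"), ("read_item", "Y"), ("unlock", "Y"),
      ("write_lock", "X"), ("read_item", "X"), ("write_item", "X"),
      ("unlock", "X")]

# Same work with write_lock(X) moved before unlock(Y): follows 2PL.
t1_2pl = [("read_lock", "Y"), ("read_item", "Y"), ("write_lock", "X"),
          ("unlock", "Y"), ("read_item", "X"), ("write_item", "X"),
          ("unlock", "X")]

print(follows_2pl(t1), follows_2pl(t1_2pl))
```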

Two-phase locking example

• Modified version of the transactions
  – Meet 2PL requirements
  – Previous schedule is not allowed
  – T2 cannot get write_lock(Y): it must wait for T1
• The transactions can deadlock!
  – E.g. T2 holds read_lock on Y, T1 holds read_lock on X, and both want a write_lock on the other’s item
  – Can’t go on until one drops a read_lock

Properties of 2PL

• If every transaction in a schedule follows 2PL, the schedule is guaranteed to be serializable
  – Proof omitted from this module
  – Means no need to test for serializability
• 2PL may limit the level of concurrency achievable
  – Means transaction T can’t release a lock if it needs a lock later
  – Or T must lock an item long before it is needed
• 2PL does not permit all possible serializable schedules
  – Some serializable schedules are prohibited by 2PL

2PL variants

• Version described so far: basic 2PL
  – Other variants: conservative 2PL, strict 2PL, rigorous 2PL
• Conservative (static) 2PL is based on read set and write set
  – Recall: read set is all items read, write set is all items written by T
  – Lock all items in read set and write set before the transaction starts
  – If any are unavailable, wait until they all are
  – A deadlock-free protocol
• But it’s not always possible to know in advance which items need to be locked
  – E.g. can only know some needed items by inspecting others

2PL for strict schedules

• Strict 2PL guarantees strict schedules [see ‘Transaction Processing’]
  – No write locks released until the transaction commits/aborts
  – No transaction can read an item written by T unless T has committed
  – S2PL is not deadlock free
• Rigorous 2PL (Strong Strict 2PL) also guarantees strict schedules
  – T does not release any locks until it commits/aborts under SS2PL
• Contrast different emphasis:
  – Conservative locks everything at the start (always in shrinking phase)
  – Rigorous releases all locks at the end (always in growing phase)
• The concurrency control system can automate lock requests
  – E.g. strict 2PL: lock each item as needed, automatically release at end
  – Place transactions in a queue if they need a currently locked item

Granularity of Locking

• The notion of a database item can apply to different objects
  – A single field of a record; database record; disk block; whole file
• The size of items is referred to as the data item granularity
  – Fine granularity: small sizes
  – Coarse granularity: large sizes
• Coarse granularity means lower amount of concurrency possible
  – Suppose a transaction locks a disk block to modify a record
  – Then other records in the same block are also locked
• Fine granularity means more items in the database
  – More overhead for the lock manager, more operations performed
• Picking the right granularity is a significant design issue
  – Try to pick a level that matches the needs of transactions

Multiple Granularity Level locking

• Some systems offer multiple levels of granularity
  – E.g. lock a single seat on a flight, or a whole plane
• A granularity hierarchy with a multiple granularity 2PL protocol
  – Locking becomes more complicated: more cases to consider
  – Locks for each node in the hierarchy e.g. file lock, record lock
  – Tricky: obtaining file locks means all record locks must be dropped
  – Intention locks can be used to check conflicts efficiently

Dealing with deadlock

• Recall deadlock: no transaction can proceed due to locks
  – Use a deadlock prevention protocol (not always practical)
• E.g. lock all needed items in advance (conservative 2PL)
• E.g. place a total order on the items in the database
  – A transaction can only lock items in item order
  – Nice idea in theory, but not practical in reality
• More aggressive protocols: abort a transaction to break deadlock
  – How to pick which transaction to kill?

Deadlock Detection

• Detect deadlock and abort transactions as needed
  – Can be effective if transactions rarely overlap
  – I.e. when transactions are short, only lock a few items
• Since deadlock arises from cycles of dependency, create the graph
  – Wait-for graph: transaction nodes, edges for waiting relationships
  – Whenever Ti wants to lock X held by Tj, create edge (Ti → Tj)
  – When Tj releases locks on items Ti was waiting for, delete edge
  – There is deadlock if and only if there is a cycle in the wait-for graph
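Checking the wait-for graph for a cycle can be done with a depth-first search. The sketch below assumes the graph is stored as a dict mapping each transaction to the set of transactions it is waiting for. The function name and the representation are illustrative choices.

```python
def has_deadlock(wait_for):
    """Return True if the wait-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}

    def visit(t):
        colour[t] = GREY
        for u in wait_for.get(t, ()):
            c = colour.get(u, WHITE)
            if c == GREY:
                return True  # edge back to the current path: a cycle
            if c == WHITE and visit(u):
                return True
        colour[t] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t)
               for t in list(wait_for))


graph = {"T1": {"T2"}, "T2": {"T1"}}  # T1 waits for T2 and vice versa
print(has_deadlock(graph))
graph["T2"].discard("T1")             # T1 releases what T2 was waiting for
print(has_deadlock(graph))
```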


Deadlock Detection

• When to check for a cycle in the graph?
  – Every time an edge is added? Could be high overhead
  – When the number of current transactions is high enough?
  – When several transactions have been waiting for a while?
• Victim selection is how to choose which transaction to abort
  – Typically prefer younger transactions (less to redo)

Concurrency control by Timestamp ordering

• Methods seen so far all involve locking (2PL etc.)
• Timestamp ordering concurrency control doesn’t use any locks
  – Timestamps are used to determine precedence order of operations
  – No locks, hence no possibility of deadlock
• A timestamp is a (unique) identifier, assigned in increasing order
  – Define the timestamp of transaction T as TS(T)
  – Either based on a counter, or system time (ensuring no duplicates)

Timestamp ordering algorithm

• Timestamp ordering: order transactions by their timestamps (TS)
  – Ensure schedule is equivalent to serial schedule in timestamp order
  – Resolution of conflicting operations can’t violate timestamp order
• Each item X is associated with two timestamp values
  – read_TS(X): TS of youngest transaction to do a successful read of X
  – write_TS(X): TS of youngest transaction to do a successful write to X
• Outline of basic timestamp ordering algorithm (TOA)
  – For each operation, check that timestamp order is not violated
  – If T violates order, T is aborted and restarted with new timestamp
  – If T is rolled back, any transaction using writes of T is rolled back
  – Can cause cascading rollback: needs extra work to avoid

Basic Timestamp Ordering Algorithm

• Whenever transaction T issues a write_item(X) operation:
  – If read_TS(X) > TS(T) or if write_TS(X) > TS(T), then abort and roll back T, reject the operation
    { A younger transaction has read/written X, violating ordering }
  – Else execute the write, set write_TS(X) ← TS(T)
• Whenever transaction T issues a read_item(X) operation:
  – If write_TS(X) > TS(T), then abort and roll back T, reject the operation
    { A younger transaction has written X, violating ordering }
  – Else, execute the read, set read_TS(X) ← max(TS(T), read_TS(X))
• When TOA detects conflicting operations in the wrong order, it rejects the later one
  – Hence, schedules are conflict serializable
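The two checks of basic timestamp ordering can be written down directly. This is a sketch under assumptions: timestamps are positive integers, 0 stands for "never read/written", an `Abort` exception stands in for rollback and restart, and cascading rollback is not modelled. The class and exception names are illustrative.

```python
class Abort(Exception):
    """Raised when an operation would violate timestamp order."""


class BasicTO:
    def __init__(self):
        self.read_ts = {}   # item -> read_TS(X)
        self.write_ts = {}  # item -> write_TS(X)
        self.value = {}     # item -> current value

    def write_item(self, ts, x, v):
        # A younger transaction has already read or written X.
        if self.read_ts.get(x, 0) > ts or self.write_ts.get(x, 0) > ts:
            raise Abort(f"write of {x} by TS={ts} rejected")
        self.value[x] = v
        self.write_ts[x] = ts

    def read_item(self, ts, x):
        # A younger transaction has already written X.
        if self.write_ts.get(x, 0) > ts:
            raise Abort(f"read of {x} by TS={ts} rejected")
        self.read_ts[x] = max(ts, self.read_ts.get(x, 0))
        return self.value.get(x)


db = BasicTO()
db.write_item(2, "X", "a")    # transaction with TS=2 writes X
print(db.read_item(3, "X"))   # TS=3 reads it; read_TS(X) becomes 3
```

A subsequent `db.write_item(2, "X", "b")` would raise `Abort`, since read_TS(X) = 3 > 2.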

Strict Timestamp Ordering

• Strict TO ensures schedules are strict and conflict serializable
  – If T issues a read or write operation on X where TS(T) > write_TS(X), T is delayed until the transaction T’ that wrote X commits or aborts
  – Effectively the same as locking X until T’ commits or aborts
  – No deadlock, as T only waits for T’ if TS(T) > TS(T’)
• Thomas’s write rule modifies checks on write from basic TO
  – If read_TS(X) > TS(T), abort and roll back T, reject the write
  – If write_TS(X) > TS(T), don’t execute the write but continue
    A later transaction has already written X, so the write should be lost
    If this caused a conflict, it would be detected by the above rule
  – If neither condition holds, do the write and set write_TS(X) = TS(T)
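Thomas’s write rule amounts to a three-way decision on each write. A minimal sketch, where the function name and the string results are illustrative assumptions:

```python
def thomas_write(read_ts_x, write_ts_x, ts_t):
    """Decide what to do when T (timestamp ts_t) issues write_item(X)."""
    if read_ts_x > ts_t:
        return "abort"  # a younger transaction has already read X
    if write_ts_x > ts_t:
        return "skip"   # obsolete write: ignore it, T continues
    return "write"      # perform the write, set write_TS(X) = TS(T)


print(thomas_write(read_ts_x=1, write_ts_x=5, ts_t=3))  # obsolete write
```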

Deadlock avoidance via timestamp protocols

• Can combine locks with timestamp-based protocols
  – Record the (unique) start time of the transaction, TS(T)
  – Suppose Ti tries to access an item X but X is locked by Tk
• Wait-die protocol: if TS(Ti) > TS(Tk), abort (younger) Ti
  – Restart Ti later with the original timestamp TS(Ti)
  – Else, Ti is older, and is allowed to wait
  – The usurper is aborted if it is younger, else it can wait
• Wound-wait protocol: if TS(Ti) < TS(Tk), abort (younger) Tk
  – Restart Tk later with the original timestamp TS(Tk)
  – Else, Ti is younger, and is allowed to wait
  – The usurper pre-empts the lock holder if it is older, else it waits
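Both protocols reduce to a comparison of two timestamps when Ti requests an item locked by Tk. A minimal sketch, assuming unique integer timestamps where smaller means older. The function names and result strings are illustrative.

```python
def wait_die(ts_i, ts_k):
    """Ti (requester) meets Tk (lock holder) under wait-die."""
    if ts_i > ts_k:
        return "abort Ti"  # requester is younger: it dies
    return "Ti waits"      # requester is older: it may wait


def wound_wait(ts_i, ts_k):
    """Ti (requester) meets Tk (lock holder) under wound-wait."""
    if ts_i < ts_k:
        return "abort Tk"  # requester is older: it wounds the holder
    return "Ti waits"      # requester is younger: it may wait


# An older requester (TS=1) meets a younger holder (TS=2):
print(wait_die(1, 2), "|", wound_wait(1, 2))
```

In both cases it is the younger transaction that gets aborted; the protocols differ only in whether the older transaction waits or pre-empts.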

Timestamp-based protocol properties

• Both protocols prefer older transactions over younger ones
  – Older have made more progress, younger have less to lose
• Both wound-wait and wait-die are deadlock-free protocols
  – Suppose there is a deadlock, then there is a cycle of transactions
  – All transactions in the cycle are in the ‘wait’ state
  – Wait-die: can only wait if older than the holder
  – Wound-wait: can only wait if younger than the holder
• Can’t be a cycle where everyone is older (younger) than the next
  – Hence, contradiction: no deadlock possible
• Wound-wait and wait-die are possibly aggressive:
  – They may abort transactions unnecessarily (ones that wouldn’t lead to deadlock)

Timestamp-free protocols

• No waiting algorithm: can’t obtain a lock? Abort immediately!
  – Restart after a time delay
  – No transaction is ever waiting, so no deadlock
  – A lot of needless aborting and restarting
• Cautious waiting algorithm: tries to reduce the waste
  – Ti tries to lock X which is held by Tk
  – If Tk is not blocked, Ti is blocked and allowed to wait
  – Else, Tk is blocked, abort Ti
• Cautious waiting is deadlock free
  – Suppose there is a cycle of waiting (blocked) transactions
  – Consider time at which each transaction became blocked
  – Can only complete a cycle if some T is blocked by T’ already blocked
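A small sketch of the cautious waiting decision. It tracks which transactions are currently blocked and shows why a waiting cycle cannot close: the request that would complete the cycle finds its lock holder already blocked and is aborted instead. The class and method names are assumptions for illustration.

```python
class CautiousWaiting:
    def __init__(self):
        self.waiting_for = {}  # blocked transaction -> transaction it awaits

    def on_conflict(self, ti, tk):
        """Ti requests an item that is locked by Tk."""
        if tk in self.waiting_for:
            return "abort"  # Tk is itself blocked: do not wait behind it
        self.waiting_for[ti] = tk
        return "wait"       # Tk is running: Ti may safely wait

    def on_release(self, tk):
        """Tk released its locks: unblock everyone waiting for it."""
        for t in [t for t, h in self.waiting_for.items() if h == tk]:
            del self.waiting_for[t]


cw = CautiousWaiting()
print(cw.on_conflict("T1", "T2"))  # T2 is not blocked, so T1 waits
print(cw.on_conflict("T2", "T1"))  # T1 is blocked, so T2 is aborted
```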

Multiversion Concurrency Control

• We can keep old versions of a data item when it is updated
  – The appropriate version can be given to a transaction
  – Choose a version that will maintain serializability
• Increased space cost: need to keep more versions of the data
  – Use garbage collection ideas to remove unneeded versions
• Can be more time efficient: less waiting as a version is available
• Several realizations possible:
  – Multiversion based on timestamp ordering
  – Multiversion two-phase locking using certify locks

Multiversion Based on Timestamp Ordering

• Several versions of X are kept: X1, X2, …, Xk
  – For each version, keep the value and two timestamps
  – read_TS(Xi): the largest timestamp of a transaction that read Xi
  – write_TS(Xi): the timestamp of the transaction that wrote Xi
• When a transaction T writes to X, Xk+1 is created
  – read_TS(Xk+1) = write_TS(Xk+1) = TS(T)
• When T reads from Xi, read_TS(Xi) ← max(read_TS(Xi), TS(T))

Multiversion Based on Timestamp Ordering

• Rules enforce serializability:
  1. If T tries to write X: find the version Xi with the highest write_TS(Xi) ≤ TS(T)
     If read_TS(Xi) > TS(T), then abort T and roll back
     Else create a new version Xk with read_TS(Xk) = write_TS(Xk) = TS(T)
  2. If T tries to read X: find the version Xi with the highest write_TS(Xi) ≤ TS(T)
     Return the value of Xi to T, update read_TS(Xi) ← max(read_TS(Xi), TS(T))
• Reads are always successful (rule 2)
• Writes may cause abort (rule 1) if T tries to write a version that should have been read by a later transaction
  – Rollback of T can cause cascading rollbacks
  – So T cannot commit until all T’ that wrote versions of X that T read have also committed
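The two rules can be sketched for a single item as follows. The assumptions here are that timestamps are positive integers, that an initial version exists with both timestamps set to 0, and that each transaction writes the item at most once. The class and exception names are illustrative, and commit dependencies and cascading rollback are not modelled.

```python
class Abort(Exception):
    """Raised when a write would invalidate a read by a later transaction."""


class MultiversionItem:
    def __init__(self, initial_value):
        # Each version is a list [value, read_TS, write_TS].
        self.versions = [[initial_value, 0, 0]]

    def _version_for(self, ts):
        """Version with the highest write_TS that is <= ts."""
        candidates = [v for v in self.versions if v[2] <= ts]
        return max(candidates, key=lambda v: v[2])

    def read(self, ts):
        v = self._version_for(ts)
        v[1] = max(v[1], ts)  # update read_TS of the version read
        return v[0]

    def write(self, ts, value):
        v = self._version_for(ts)
        if v[1] > ts:
            raise Abort("a later transaction already read the old version")
        self.versions.append([value, ts, ts])  # new version for this write


x = MultiversionItem(10)
x.write(5, 50)
print(x.read(3), x.read(7))  # the older reader still sees the old version
```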

Multiversion 2PL using certify locks

• Add an extra type of lock: certify
  – Modes are read-locked, write-locked, certify-locked and unlocked
• Allow transactions to read while T holds the write lock
  – Two versions of an item: one edited version and one committed
  – Transactions can read the committed version while T edits
  – T’s writes do not affect the committed version
• To commit the edited version, T must obtain certify lock
  – Certify is not compatible with read locks: they must be dropped
  – When certify is acquired, the new version replaces the old one
• Avoids cascading aborts as only committed versions can be read
  – Deadlock still possible, and can be handled by previous methods
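The interaction of the three lock modes can be summarised as a compatibility table. The sketch below follows the description above (reads are compatible with a write lock, certify is compatible with nothing). The remaining entries are filled in with the usual multiversion 2PL table, which is an assumption here, as are the names `COMPATIBLE` and `can_grant`.

```python
# COMPATIBLE[held][requested]: can `requested` be granted to one
# transaction while a different transaction holds `held` on the item?
COMPATIBLE = {
    "read":    {"read": True,  "write": True,  "certify": False},
    "write":   {"read": True,  "write": False, "certify": False},
    "certify": {"read": False, "write": False, "certify": False},
}


def can_grant(requested, held_by_others):
    """True if `requested` is compatible with every lock held by others."""
    return all(COMPATIBLE[held][requested] for held in held_by_others)


print(can_grant("read", ["write"]))    # readers see the committed version
print(can_grant("certify", ["read"]))  # writer must wait for readers
```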

Optimistic Concurrency Control (OCC)

• Methods discussed so far check before an operation is allowed
  – E.g. whether it is locked, or whether timestamps agree
  – This can represent a significant overhead during transactions
• Optimistic concurrency control has no checking during execution
  – Updates are applied to local copies of the data items
• A validation phase checks if any updates violate serializability
  – If OK, transaction commits and the database is updated from local copies
  – Else, transaction aborts and is restarted later
• Three phases for transaction T in this OCC protocol
  – Read phase: T reads committed data items, updates local copies
  – Validation phase: check T’s updates don’t violate serializability
  – Write phase: if successful, apply the updates to the database

Optimistic Concurrency Control

• OCC performs all checks together for more efficiency
  – Works well if transactions don’t overlap much in general
  – A lot of interference leads to a lot of aborts and restarts
  – “Optimistic” since we assume the former case holds
• Validation checks transaction Ti against other transactions
  – The checks require timestamps, read sets and write sets
  – For each Tk that is either committed or also in validation, one of the following must hold:
    1. Tk completes its write phase before Ti starts its read phase
    2. Ti starts its write phase after Tk completes its write phase, and read_set(Ti) ∩ write_set(Tk) = ∅
    3. (read_set(Ti) ∪ write_set(Ti)) ∩ write_set(Tk) = ∅, and Tk completes its read phase before Ti completes its read phase
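The validation test can be sketched as a function over read sets, write sets and phase times. This follows the three standard validation conditions given in Elmasri and Navathe. The `Txn` record, its field names and the use of integer times are assumptions for illustration. Since Ti has not yet started its write phase when it is validated, "Ti starts its write phase after Tk completes its write phase" is taken to mean that Tk has already finished writing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Txn:
    read_set: set
    write_set: set
    read_start: int                  # time the read phase started
    read_end: int                    # time the read phase ended
    write_end: Optional[int] = None  # None: write phase not yet completed


def validate(ti, others):
    """True if Ti passes validation against every Tk in `others`."""
    for tk in others:
        written = tk.write_end is not None
        # 1. Tk finished writing before Ti started reading.
        c1 = written and tk.write_end < ti.read_start
        # 2. Tk finished writing before Ti writes; Ti read nothing Tk wrote.
        c2 = written and not (ti.read_set & tk.write_set)
        # 3. No overlap with Tk's writes at all, and Tk finished reading first.
        c3 = (not ((ti.read_set | ti.write_set) & tk.write_set)
              and tk.read_end < ti.read_end)
        if not (c1 or c2 or c3):
            return False  # Ti must abort and restart
    return True


ti = Txn({"X"}, {"Y"}, read_start=10, read_end=20)
tk = Txn(set(), {"X"}, read_start=1, read_end=12, write_end=15)
print(validate(ti, [tk]))  # Tk wrote X while Ti was reading it
```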

Locks on indexes

• May want to apply locking to more complex database objects
  – Indexes are a good example: hierarchical, often changing
• Directly applying locking ideas doesn’t work well
  – Every update wants to lock the root: no concurrent access
• Make use of knowledge about index structure
  – Read locks for parent nodes can be dropped after the child is found
  – If insertion affects a non-full leaf node, only lock on leaf is needed
  – Drop locks on parents of non-full internal nodes
• Modify index data structures to make them more “lock-friendly”
  – E.g. the B-link tree adds more links between internal nodes
  – Links make it easier to find data if tree is updated during a search

Insertion, deletion and phantom records

• Insertion: when a new item is inserted into the database
  – The item is given a new unique name by the system
  – A lock is created (if needed), given to creating transaction
  – Or read and write timestamps are set to TS of creating transaction
• Deletion: a transaction tries to delete an item X
  – Locks: deleting transaction must hold exclusive (write) lock on X
  – Timestamps: ensure no later transaction has read or written X
• Phantom problem: when a new record X is created by T
  – X meets a condition that T’ is applying, but is missed by T’
  – E.g. T’ is accessing all employees with DNO=5, and X is in dept 5
  – Can be hard to detect: X appeared after T’ searched
  – Possible solution: lock index during T’ to delay insertion of X

• Concurrency control via locks
  – Two-phase locking, shared and exclusive locks
  – Conservative, strict, rigorous 2PL, multiple granularity locking
  – Detecting and preventing deadlock via wait-for graphs
• Concurrency control via timestamps
  – Wait-die and wound-wait protocols, cycle-freeness
  – Timestamp ordering and Thomas’s write rule
• Multiversion concurrency control via locks and timestamps
• Optimistic concurrency control: db.in.tum.de/sigmod15contest/
• Chapter: “Concurrency Control Techniques” in Elmasri and Navathe
