Concurrent execution of user programs is essential for good DBMS performance

Concurrent Execution

Concurrent execution of user programs is essential for good DBMS performance.

• Disk accesses are frequent and relatively slow.
• The DBMS wants to keep the CPU working on several user programs concurrently.

Challenges

• Concurrency control: How does the DBMS handle concurrent transactions?
• Crash recovery: How does the DBMS handle partial transactions caused by machine crashes or by users aborting transactions?

[Figure: programs P1, P2, and P3 issue reads and writes (R/W) to the DBMS, which reads and writes the DB.]

Transaction Management

Definition of Transaction: An execution of a user program in a DBMS.

• Executing the same program several times generates several transactions.
• From the DBMS’s point of view, a transaction is a sequence of reads and writes of database objects (e.g., pages, records).
• A user’s program may perform many operations on the data retrieved from the database, but the DBMS is concerned only with what data is read from or written to the database.

[Figure: program P1 issues reads and writes (R/W) to the DBMS, which reads and writes the DB.]

Notation

R_T(O): Transaction T reads object O into a program variable O in memory.
W_T(O): Transaction T writes object O to disk.

Each transaction ends with a final action, which is either commit or abort.

• Commit: The transaction is completed successfully.
• Abort: The transaction is terminated and all actions done so far are undone.

Abort_T denotes the action of T aborting. Commit_T denotes T committing.

    T1: R(A) W(A) R(B) W(B) Abort
    T2: R(A) W(A) R(B) W(B) Commit

• Users submit transactions and can think of each transaction as executing by itself.
– Concurrency is achieved by the DBMS, which interleaves the actions (reads and writes of DB objects) of various transactions.
– Each transaction must leave the database in a consistent state if the DB is consistent when the transaction begins.

[Figure: a transaction takes DB (in a consistent state) to DB’ (in a consistent state). Inconsistency is allowed in between.]

Properties of Transactions: ACID

• ATOMICITY: Either all actions in a transaction are carried out or none are.
• CONSISTENCY: Each transaction, run with no concurrent execution of other transactions, must preserve the consistency of the database. (Users have to ensure this.)
• ISOLATION: Transactions are isolated from the effects of other concurrently executing transactions.
• DURABILITY: Once a transaction has completed successfully, its effects should persist even if the system crashes before all its changes are reflected on disk.

• Schedule: A list of actions from a set of transactions in which any two actions of a transaction T appear in the same order as they appear in T.

    T1        T2
    R(A)
    W(A)
              R(B)
              W(B)
              Commit
    R(C)
    W(C)
    Commit
    (time runs downward)

R(A): Read object A into a variable A. W(B): Write object B to disk.

A complete schedule contains either an abort or a commit for each transaction in the schedule.

Not all schedules are “good” schedules!
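The two definitions above (program order and completeness) can be checked mechanically. The following is a minimal sketch, assuming a schedule is encoded as a list of (transaction, operation, object) tuples; that encoding and the function names are illustrative choices, not part of the slides.

```python
# Each action is (transaction, operation, object); object is None for Commit/Abort.

def respects_program_order(schedule, programs):
    """True if every transaction's actions appear in the order given by its program."""
    for name, program in programs.items():
        projected = [(op, obj) for txn, op, obj in schedule if txn == name]
        # The schedule may stop early, so compare against a prefix of the program.
        if projected != program[:len(projected)]:
            return False
    return True

def is_complete(schedule):
    """True if the last action of every transaction is Commit or Abort."""
    last_op = {}
    for txn, op, _obj in schedule:
        last_op[txn] = op
    return all(op in ("Commit", "Abort") for op in last_op.values())

# The two-transaction schedule shown above.
programs = {
    "T1": [("R", "A"), ("W", "A"), ("R", "C"), ("W", "C"), ("Commit", None)],
    "T2": [("R", "B"), ("W", "B"), ("Commit", None)],
}
schedule = [
    ("T1", "R", "A"), ("T1", "W", "A"),
    ("T2", "R", "B"), ("T2", "W", "B"), ("T2", "Commit", None),
    ("T1", "R", "C"), ("T1", "W", "C"), ("T1", "Commit", None),
]

print(respects_program_order(schedule, programs), is_complete(schedule))
```

Dropping the final Commit of T1 makes the schedule incomplete, and swapping T1's first two actions breaks program order.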

Scheduling Transactions

• Serial schedule: A schedule that does not interleave the actions of different transactions.
• There is no guarantee on the order in which transactions are executed. Given a set of n transactions, there are n! possible execution results.
• Serializable schedule: A schedule whose effect on any consistent database instance is identical to that of some complete serial schedule (this definition will be refined later on).
• The result must be equal to one of the n! results.

[Figure: DB0 → T1 → DB1 → T2 → DB2 → ... → Tn → DBn]

We know the requirement; the problem now is how to meet it!

Example of Concurrent Executions

    T1: BEGIN A=A-100, B=B+100 END
    T2: BEGIN A=1.5*A, B=1.5*B END

• T1 transfers $100 from account A to account B. T2 credits both accounts with a 50% interest payment.
• There is no guarantee that T1 will execute before T2 or vice versa if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.

• Consider the following interleaved schedule:

    T1            T2
    A = A-100
                  A = A*1.5
    B = B+100
                  B = B*1.5

• Serial schedules, starting from A=100, B=100:

    T1 then T2: A=0, B=300
    T2 then T1: A=50, B=250

The same interleaving written as reads and writes:

    T1                  T2
    R(A) W(A)
                        R(A) W(A)
    R(B) W(B) Commit
                        R(B) W(B) Commit

This schedule is OK: it produces A=0, B=300, the same result as the serial schedule T1 then T2.

    T1            T2
    A = A-100
                  A = A*1.5
                  B = B*1.5
    B = B+100

    T1                  T2
    R(A) W(A)
                        R(A) W(A)
                        R(B) W(B)
                        Commit
    R(B) W(B) Commit

This schedule is not OK: starting from A=100, B=100, it ends with A=0, B=250, which matches neither serial result.
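The arithmetic behind the four outcomes can be replayed directly. This is a small sketch, assuming each step of T1 and T2 is modeled as a function that updates a dictionary holding A and B; the helper names are made up for illustration.

```python
def run(steps, db):
    """Apply the steps in order to a copy of db and return the final state."""
    state = dict(db)
    for step in steps:
        step(state)
    return state

# T1: transfer 100 from A to B.
def t1_debit(s):
    s["A"] -= 100

def t1_credit(s):
    s["B"] += 100

# T2: credit 50% interest to both accounts.
def t2_interest_a(s):
    s["A"] *= 1.5

def t2_interest_b(s):
    s["B"] *= 1.5

start = {"A": 100, "B": 100}

serial_t1_t2 = run([t1_debit, t1_credit, t2_interest_a, t2_interest_b], start)
serial_t2_t1 = run([t2_interest_a, t2_interest_b, t1_debit, t1_credit], start)
ok_interleaving = run([t1_debit, t2_interest_a, t1_credit, t2_interest_b], start)
bad_interleaving = run([t1_debit, t2_interest_a, t2_interest_b, t1_credit], start)

serial_results = [serial_t1_t2, serial_t2_t1]
print(ok_interleaving in serial_results)   # matches T1 then T2
print(bad_interleaving in serial_results)  # matches neither serial order
```

The check "is the result one of the serial results" only tests this one starting state; serializability requires the same for every consistent starting state.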

What causes anomalies with interleaved execution?

1) Write operations (no abort in any transaction):
• RW conflicts
• WR conflicts
• WW conflicts

2) Abort/commit operations (some abort in some transaction).

Anomalies: Unrepeatable Reads (RW Conflicts)

A has value 5 initially; T1 increments A; T2 decrements A.

    T1        T2        Value of A
    R(A)                5 (T1’s view of A)
              R(A)      5 (T2’s view of A)
    W(A)                6 (T1’s view of A)
    Commit
              W(A)      4 (T2’s view of A)
              Commit

The right value of A is 5. The effect of this schedule is different from that of any serial schedule of T1 and T2.

• WR Conflicts (“dirty reads”)

Schedule I:

    T1            T2
    A = A-100
                  A = A*1.5
                  B = B*1.5
    B = B+100

    T1                  T2
    R(A) W(A)
                        R(A) W(A)
                        R(B) W(B)
                        Commit
    R(B) W(B) Commit

Starting from A=100, B=100, Schedule I ends with A=0, B=250. Wrong!

Correct values (serial schedules, starting from A=100, B=100):

    T1 then T2: A=0, B=300
    T2 then T1: A=50, B=250

Anomalies: WW Conflicts

• T1 sets A and B to 10; T2 sets A and B to 20.
– Consistency constraint: A and B must have the same value.

    T1        T2        Value written
    W(A)                A = 10
              W(A)      A = 20
              W(B)      B = 20
              Commit
    W(B)                B = 10
    Commit

The result is A=20 while B=10.

Blind write: A write performed without reading the value of the object.

Scheduling Involving Aborted Transactions

    T1            T2
    R(A) W(A)
                  R(A) W(A) Commit
    Abort

Unrecoverable schedule!

Problems

• If T2 has not been committed:
– Cascading abort: abort T2; other transactions that read data updated by T2 are also aborted.
• If T2 has been committed, T2 cannot be aborted:
– Unrecoverable: T2 cannot be aborted.
– Lost: Rolling back T2 undoes the effect of T2, but T2 will not be executed again.

A DBMS must ensure that only serializable and recoverable schedules are allowed.

Serializable schedule: A schedule S whose effect on any consistent database instance is identical to that of some complete serial schedule over the set of committed transactions in S.

Recoverable schedule: A schedule in which transactions commit only after all transactions whose changes they read have committed.

[Figure: over time, one transaction performs W(X) and commits; a second transaction performs R(X) and commits only after the first has committed.]
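The recoverability condition above can be tested by tracking which transaction each read takes its value from. The sketch below is a simplification under stated assumptions: schedules are lists of (transaction, operation, object) tuples, and an abort simply forgets the aborting transaction's writes rather than restoring older values. The function name is hypothetical.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after the transactions it read from."""
    last_writer = {}   # object -> transaction whose write is currently visible
    reads_from = {}    # reader -> set of not-yet-committed writers it read from
    committed = set()
    for txn, op, obj in schedule:
        if op == "W":
            last_writer[obj] = txn
        elif op == "R":
            writer = last_writer.get(obj)
            if writer is not None and writer != txn and writer not in committed:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "Commit":
            # Committing before a transaction we read from violates recoverability.
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False
            committed.add(txn)
        elif op == "Abort":
            # Simplification: drop the aborted transaction's visible writes.
            for o in [o for o, w in last_writer.items() if w == txn]:
                del last_writer[o]
    return True

# T2 reads T1's write and commits before T1 aborts: unrecoverable.
unrecoverable = [
    ("T1", "R", "A"), ("T1", "W", "A"),
    ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "Commit", None),
    ("T1", "Abort", None),
]
# The reader commits only after the writer: recoverable.
recoverable = [
    ("T1", "W", "X"), ("T2", "R", "X"),
    ("T1", "Commit", None), ("T2", "Commit", None),
]
print(is_recoverable(unrecoverable), is_recoverable(recoverable))
```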

Serial schedule: Once a transaction starts, no other transaction can be started until it either commits or aborts.

Strict schedule:
1) Once a transaction reads a value, no other transaction is allowed to write that value before the first transaction commits or aborts.
2) Once a transaction writes a value, no other transaction is allowed to read or write that value before the first transaction commits or aborts.

[Figure: timeline in which T performs W(X) and later commits or aborts; no R(X) or W(X) by other transactions is allowed in between.]

[Figure: timeline in which T performs R(X) and later commits or aborts; no W(X) by other transactions is allowed in between.]

Strict schedules are serializable and recoverable:
1. They avoid RW, WR, and WW conflicts.
2. They do not require cascading aborts, and the actions of an aborted transaction can be undone.

A serial schedule must be a strict schedule, but not vice versa.

S12 (a strict schedule, but not a serial schedule), with the actions of T1 and T2 interleaved:

    R(A), R(A), W(A), Commit, Commit

S13 (a serial schedule), with one transaction finishing before the other starts:

    R(A), Commit, R(A), W(A), Commit

Implementing Strict Schedule

Strict Two-Phase Locking (Strict 2PL) Protocol:

1. Each transaction must obtain an S (shared) lock on an object before reading it, and an X (exclusive) lock on an object before writing it. If a transaction holds an X lock on an object, no other transaction can get a lock (S or X) on that object.

2. All locks held by a transaction are released when the transaction completes.

Requests to acquire and release locks are automatically inserted into transactions by the DBMS.
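The two rules of the protocol can be sketched as a tiny in-memory lock table. This is an illustrative toy, not how any particular DBMS implements locking: the class name, the True/False return convention (instead of actually suspending the caller), and the absence of a wait queue are all assumptions.

```python
class LockTable:
    """Toy S/X lock table: acquire() reports whether a request could be granted."""

    def __init__(self):
        self.holders = {}  # object -> {transaction: "S" or "X"}

    def acquire(self, txn, obj, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self.holders.setdefault(obj, {})
        others = {t: m for t, m in held.items() if t != txn}
        if mode == "S":
            granted = all(m == "S" for m in others.values())
        else:  # mode == "X": no other transaction may hold any lock
            granted = not others
        if granted and held.get(txn) != "X":
            held[txn] = mode  # record the lock, upgrading S to X if needed
        return granted

    def release_all(self, txn):
        """Rule 2: all locks are released together when the transaction completes."""
        for held in self.holders.values():
            held.pop(txn, None)

table = LockTable()
print(table.acquire("T1", "A", "X"))  # T1 gets the exclusive lock
print(table.acquire("T2", "A", "X"))  # T2 would have to wait
table.release_all("T1")               # T1 commits or aborts
print(table.acquire("T2", "A", "X"))  # now T2 can proceed
```

Because locks are released only in release_all, a writer's changes stay invisible to other transactions until it completes.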

[Figure: transactions T1, T2, T3, and T4 send read (R), write (W), commit (C), and abort (A) requests to the DBMS, which keeps a lock table over the objects O1, ..., On.]

    OID   Lock status   Holders   Suspended
    O1    N
    O2    S             T1, T4    T2
    :     :             :         :
    On    X             T1        T2, T3

S_T(O): Shared lock on object O requested by transaction T
X_T(O): Exclusive lock on object O requested by transaction T

    T1: R(A) W(A)
    T2: R(A) W(A)

    T1                      T2
    X(A) R(A) W(A)
                            T2 tries to do X(A) and cannot!
                            T2 has to be suspended until T1 is done.

    T1                      T2
    X(A) R(A) W(A) Commit
    (all locks are released)
                            X(A) R(A) W(A) Commit

In this case, strict 2PL results in serial execution of the two transactions.

Example of strict 2PL with interleaved actions.

    T3: R(A) R(B) W(B) Commit
    T4: R(A) R(C) W(C) Commit

With lock requests inserted:

    T3: S(A) R(A) X(B) R(B) W(B) Commit
    T4: S(A) R(A) X(C) R(C) W(C) Commit

Schedule:

    T3                                  T4
                                        S(A) R(A)
    S(A) R(A) X(B) R(B) W(B) Commit
                                        X(C) R(C) W(C) Commit

Strict 2PL

Strict 2PL ensures strict schedules (why?)

Deadlocks

• Deadlock: A cycle of transactions waiting for locks to be released by each other.
• Two ways of dealing with deadlocks:
– Deadlock prevention
– Deadlock detection

    T1            T2
    X(A) W(A)
                  X(B) W(B)
    X(B)
                  X(A)

T1 waits for T2 to release B while T2 waits for T1 to release A.

Deadlock Detection

• The transaction manager maintains a waits-for graph:
– Nodes correspond to active transactions.
– Add an edge from Ti to Tj iff Ti is waiting for Tj to release a lock.
– Remove an edge when a lock request is granted.
• Periodically check for cycles in the waits-for graph.
• Use a timeout mechanism: If a transaction has been waiting for too long, abort the transaction.

    T1          T2          T3          T4
    S(A) R(A)
                X(B) W(B)
    S(B)                                            (T1 waits for B)
                            S(C) R(C)
                X(C)                                (T2 waits for C)
                                        X(B)        (T4 waits for B)
                            X(A)                    (T3 waits for A)

Waits-for graph: T1 → T2, T2 → T3, T4 → T2. Once T3 requests X(A), the edge T3 → T1 is added and the graph becomes cyclic: deadlock.
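Checking the waits-for graph for a cycle is a standard depth-first search. The sketch below assumes the graph is given as a dictionary from each waiting transaction to the set of transactions it waits for; the function name and that representation are illustrative choices.

```python
def has_deadlock(waits_for):
    """Return True if the waits-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished
    colour = {}

    def visit(txn):
        colour[txn] = GREY
        for other in waits_for.get(txn, ()):
            state = colour.get(other, WHITE)
            if state == GREY:          # back edge: we returned to the current path
                return True
            if state == WHITE and visit(other):
                return True
        colour[txn] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in list(waits_for))

# Edges from the four-transaction example: T1 -> T2, T2 -> T3, T4 -> T2.
graph = {"T1": {"T2"}, "T2": {"T3"}, "T4": {"T2"}}
print(has_deadlock(graph))   # no cycle yet

graph["T3"] = {"T1"}         # T3 requests X(A), which T1 holds
print(has_deadlock(graph))   # T1 -> T2 -> T3 -> T1 is a cycle
```

A real transaction manager would also have to pick a victim on the cycle and abort it; that choice is outside this sketch.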

Deadlock Prevention

• Assign priorities based on timestamps.
– The lower the timestamp, the higher the transaction’s priority.
• Assume Ti wants a lock that Tj holds.
– Wait-die (an older transaction waits for a younger one): If Ti has higher priority (is older), Ti waits for Tj; otherwise, abort Ti.
– Wound-wait (a younger transaction waits for an older one): If Ti has higher priority (is older), abort Tj; otherwise, Ti waits.
• If a transaction restarts (a younger transaction that was aborted), make sure it keeps its original timestamp so that no transaction is perennially aborted.
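Both policies reduce to a comparison of two timestamps. A minimal sketch, assuming smaller timestamps mean older (higher-priority) transactions as stated above; the function names and the string results are made up for illustration.

```python
def wait_die(ts_requester, ts_holder):
    """Ti (requester) wants a lock held by Tj (holder) under wait-die."""
    if ts_requester < ts_holder:      # requester is older: it may wait
        return "requester waits"
    return "abort requester"          # requester is younger: it dies

def wound_wait(ts_requester, ts_holder):
    """Ti (requester) wants a lock held by Tj (holder) under wound-wait."""
    if ts_requester < ts_holder:      # requester is older: it wounds the holder
        return "abort holder"
    return "requester waits"          # requester is younger: it waits

# An old transaction (timestamp 1) and a young one (timestamp 9).
print(wait_die(1, 9), "|", wait_die(9, 1))
print(wound_wait(1, 9), "|", wound_wait(9, 1))
```

In both schemes every wait goes in one direction of age (older waits for younger in wait-die, younger waits for older in wound-wait), so a cycle of waits cannot form.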

Performance of Locking

• Lock-based schemes resolve conflicts using blocking and aborting, both of which incur a performance penalty.
– Blocked transactions may hold locks that force other transactions to wait.
– Aborted transactions need to be rolled back and restarted.
• Increasing the number of transactions initially increases concurrency, but when the number of deadlocks rises to a certain level (i.e., thrashing), performance starts to degrade.

Relevant Questions with Lock-Based Concurrency Control

• Should we use deadlock prevention or deadlock detection?
• How frequently should we check for deadlocks?
• When deadlock occurs, which transaction should be aborted?

• Detection-based schemes work well in practice.
• Choice of deadlock victim to be aborted:
– Transaction with the fewest locks.
– Transaction that has done the least work.
– Transaction that is farthest from completion.
• There is a rich literature on this topic.

• A strict schedule is sufficient but not necessary for serializability and recoverability.
– Being too strict reduces concurrency.

[Figure: timeline in which T performs W(X) and later commits or aborts, with no R(X) or W(X) by other transactions allowed in between; another transaction performs R(X) or W(X) afterwards and commits. Strict and therefore serializable and recoverable.]

[Figure: timeline with W(X), R(X), Commit in one row and W(X), Commit in the other. Not strict but still serializable and recoverable.]

Conflict Equivalent Schedules

• Two schedules are conflict equivalent if:
– They involve the same actions of the same transactions.
– Every pair of conflicting actions of two committed transactions is ordered the same way.
o Two actions conflict if they operate on the same data object and at least one of them is a write.

Example: the following two schedules are conflict equivalent.

    T1                T2
    R1(A) W1(A)
                      R2(A) W2(A)
    R1(B) W1(B)

    T1                T2
    R1(A) W1(A)
    R1(B) W1(B)
                      R2(A) W2(A)

• If two schedules are conflict equivalent, they have the same effect on a database.
– The order of the conflicting actions determines the final state of a database.
– Swapping nonconflicting actions does not affect the final state of a database, which allows more concurrency.

Conflict Serializable Schedules

• Schedule S is conflict serializable if S is conflict equivalent to some serial schedule.
– A conflict serializable schedule must be serializable, assuming that the set of objects does not grow or shrink.
– A serializable schedule may not be conflict serializable.

Schedule I:

    T1        T2        T3
    R(A)
              W(A)
              Commit
    W(A)
    Commit
                        W(A)
                        Commit

Schedule II (serial schedule):

    T1        T2        T3
    R(A)
    W(A)
    Commit
              W(A)
              Commit
                        W(A)
                        Commit

Schedule I is serializable (its effect equals that of the serial schedule T1 T2 T3 or T2 T1 T3), but it is not conflict serializable (the conflicting pairs are in a different order).

• To determine that a schedule does not result in an anomaly, we just need to make sure it is conflict equivalent to some serial schedule.

• How can we know whether a schedule is conflict equivalent to some serial schedule?
– By using a precedence graph (serializability graph).

Precedence Graph (Serializability Graph)

The precedence graph for a schedule S contains:

• A node for each committed transaction in S.
• An arc from Ti to Tj if an action of Ti precedes and conflicts with one of Tj’s actions.

    T1        T2        T3
    R(A)
              W(A)
              Commit
    W(A)
    Commit
                        W(A)
                        Commit

[Figure: precedence graph with nodes T1, T2, and T3. T1’s R(A) precedes T2’s W(A), and T2’s W(A) precedes T1’s W(A), so there are arcs in both directions between T1 and T2.]

Cycle: not conflict serializable!
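Building the precedence graph and testing it for cycles follows directly from the definition above. This is a sketch under assumed conventions: actions are (transaction, operation, object) tuples, and the acyclicity test repeatedly removes nodes that have no incoming arc. The function names are illustrative.

```python
def precedence_graph(schedule):
    """Arcs (Ti, Tj) where an action of Ti precedes and conflicts with one of Tj."""
    committed = {t for t, op, _ in schedule if op == "Commit"}
    arcs = set()
    for i, (ti, op_i, obj_i) in enumerate(schedule):
        if ti not in committed or op_i not in ("R", "W"):
            continue
        for tj, op_j, obj_j in schedule[i + 1:]:
            same_object = obj_i == obj_j
            one_is_write = "W" in (op_i, op_j)
            if (tj in committed and tj != ti and op_j in ("R", "W")
                    and same_object and one_is_write):
                arcs.add((ti, tj))
    return arcs

def is_conflict_serializable(schedule):
    """True if the precedence graph has no cycle."""
    arcs = precedence_graph(schedule)
    nodes = {t for arc in arcs for t in arc}
    while nodes:
        # Nodes with no incoming arc from a remaining node can go first.
        sources = {n for n in nodes
                   if not any(v == n and u in nodes for u, v in arcs)}
        if not sources:
            return False   # every remaining node lies on or behind a cycle
        nodes -= sources
    return True

schedule_1 = [
    ("T1", "R", "A"), ("T2", "W", "A"), ("T2", "Commit", None),
    ("T1", "W", "A"), ("T1", "Commit", None),
    ("T3", "W", "A"), ("T3", "Commit", None),
]
print(sorted(precedence_graph(schedule_1)))
print(is_conflict_serializable(schedule_1))
```

For the three-transaction example, the arcs T1 → T2 and T2 → T1 form the cycle, so the test reports that the schedule is not conflict serializable.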

Theorem: A schedule is conflict serializable if and only if its precedence (dependency) graph is acyclic.

Strict 2PL ensures strict schedules and conflict serializable schedules (why?)

    T1: … W(A) …
    T2: … R(A) …      (the first conflicting pair)

Schedule 1: T1 obtains X(A) and performs W(A) first. No R(A) or W(A) by T2 is allowed until T1 commits or aborts; only then does T2 obtain S(A), perform R(A), and commit.

Schedule 2: T2 obtains S(A) and performs R(A) first. No W(A) by T1 is allowed until T2 commits or aborts; only then does T1 obtain X(A) and perform W(A).

Strict 2PL ensures that the precedence graph for any schedule that it allows is acyclic: the arrow direction is determined by the execution order of the first conflicting pair.

Two-Phase Locking (2PL)

1. Each transaction must obtain an S (shared) lock on an object before reading it, and an X (exclusive) lock on an object before writing it.
2. A transaction cannot request additional locks once it releases any lock.

• 2PL allows more concurrency, but is difficult to implement:
– Necessary locks may be identified during the compiling phase.
– At run time, the DBMS needs to know when the transaction has obtained all its locks.
– Some schedules may be unrecoverable. This is a major problem.

    T1            T2
    R(A) W(A)
                  R(A) W(A)
                  R(B) W(B)
                  Commit
    Abort

Using Strict 2PL, this schedule is not allowed:

    T1                  T2
    X(A) R(A) W(A)
                        X(A)   (not granted while T1 holds its lock)

Using 2PL, the following unrecoverable schedule is allowed:

    T1                  T2
    X(A) R(A) W(A)
    (X(A) is released)
                        X(A) R(A) X(B) R(B) W(B) Commit
                        (X(A) and X(B) are released)
    Abort

2PL vs. Strict 2PL

• 2PL allows only conflict serializable schedules.
– An equivalent serial order of transactions is given by the order in which transactions enter their shrinking phase.
• Strict 2PL allows only schedules that are both strict and conflict serializable.
– When a transaction T writes an object under Strict 2PL, it holds the exclusive lock until it commits or aborts. No other transaction can see or modify this object until T is complete.

[Figure: nested sets. The schedules allowed by Strict 2PL lie inside those allowed by 2PL, which lie inside the conflict serializable schedules; the Strict 2PL schedules are both conflict serializable and strict.]

[Figure: containment hierarchy. A database contains files, a file contains pages, and a page contains tuples. Example tree: DB has files f1, f2, f3; f1 has pages p11, ..., p1n; p11 has records r111, ..., r11j; p1n has records r1n1, ..., r1nj.]

• A Xact that uses most of the pages in a file should lock the entire file, to reduce the cost of lock management.
– But this blocks other transactions accessing only some pages of the same file.
• If a Xact accesses several records of the same page, the Xact should lock the entire page.

• At which granularity should the DBMS provide concurrency control?
• Coarse granularity means less concurrency.
• Fine granularity incurs more lock management overhead.

With multiple granularity locking, how can a lock manager efficiently ensure that an object is not locked by conflicting locks at a different granularity?

Naïve Approach

[Figure: containment hierarchy (database, files, pages, tuples) with DB at the root, files f1, f2, f3, pages p11, ..., p1n, and records r111, ..., r11j, r1n1, ..., r1nj.]

Case 1: T1 obtains an X lock on f1 at time 0. T2 requests an S lock at time 5.
– The DBMS can find the conflict efficiently and block T2.

Case 2: T2 obtains an S lock at time 0. T1 requests an X lock at time 5.
– The DBMS must traverse the subtree of f1 to check for conflicting locks.
– The DBMS finds the conflict; the later request must wait.

Multiple-Granularity Locking (MGL)

Add these lock types:

• Intention-shared (IS): indicates that shared lock(s) will be requested on some descendant node(s).
• Intention-exclusive (IX): indicates that exclusive lock(s) will be requested on some descendant node(s).
• Shared-intention-exclusive (SIX): indicates that the current node is locked in shared mode, but exclusive lock(s) will be requested on some descendant node(s).

NOTE: SIX is useful because a transaction commonly needs to read a whole file but modify only a few records in the file.

Lock compatibility matrix (OK = compatible, blank = conflict, -- = no lock):

           --    IS    IX    S     SIX   X
    --     OK    OK    OK    OK    OK    OK
    IS     OK    OK    OK    OK    OK
    IX     OK    OK    OK
    S      OK    OK          OK
    SIX    OK    OK
    X      OK

Multiple-Granularity Locking

• The lock compatibility matrix must be adhered to.

1. Locking starts from the root node.
2. A node N can be locked by a transaction T in S or IS mode only if the parent node of N is already locked by T in either IS or IX mode.
3. A node N can be locked by a transaction T in X, IX, or SIX mode only if the parent node of N is already locked by T in either IX or SIX mode.
4. A transaction T can lock a node only if it has not unlocked any node (to enforce the 2PL protocol).
5. A transaction T can unlock a node N only if none of the children of N are currently locked by T (i.e., unlocking proceeds bottom-up).

[Figure: a child locked in S or IS needs its parent locked in IS or IX; a child locked in X, IX, or SIX needs its parent locked in IX or SIX.]
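Rules 1 to 3 determine which locks a transaction requests on the way down to a target node. The sketch below assumes the simplest choice, IS on every ancestor for a read and IX on every ancestor for a write; the helper names and the node names in the example are hypothetical.

```python
# Rule 2 and rule 3: parent modes that permit each child mode.
PARENT_MODES = {
    "S":   {"IS", "IX"},
    "IS":  {"IS", "IX"},
    "X":   {"IX", "SIX"},
    "IX":  {"IX", "SIX"},
    "SIX": {"IX", "SIX"},
}

def lock_requests(path, mode):
    """Lock requests, root first, to lock path[-1] in mode "S" or "X"."""
    intention = "IS" if mode == "S" else "IX"
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]

def obeys_parent_rule(requests):
    """Check rules 2 and 3 on each consecutive (parent, child) pair."""
    pairs = zip(requests, requests[1:])
    return all(parent_mode in PARENT_MODES[child_mode]
               for (_, parent_mode), (_, child_mode) in pairs)

write_path = lock_requests(["db", "f1", "p11", "r111"], "X")
read_path = lock_requests(["db", "f1", "p11", "r11j"], "S")
print(write_path)
print(read_path)
print(obeys_parent_rule(write_path), obeys_parent_rule(read_path))
```

Unlocking would walk the same list in reverse, which is what rule 5 requires.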

[Figure: DB contains files f1 and f2. f1 contains pages p11, p12, ..., p1n; p11 holds records r111, ..., r11j and p12 holds records r121, ..., r12j. f2 contains pages p21, p22, ..., p2m; p21 holds records r211, ..., r21k and p22 holds records r221, ..., r22k.]

Three transactions are submitted concurrently:
• T1 updates r111 and r211.
• T2 updates all records in p12.
• T3 reads r11j and the entire f2.

Lock and unlock requests of each transaction:

    T1: IX(db), IX(f1), IX(p11), X(r111), W(r111), IX(f2), IX(p21), X(r211), W(r211),
        Unlock(r111), Unlock(p11), Unlock(f1), Unlock(r211), Unlock(p21), Unlock(f2), Unlock(db)
    T2: IX(db), IX(f1), X(p12), W(r121), …, W(r12j), Unlock(p12), Unlock(f1), Unlock(db)
    T3: IS(db), IS(f1), IS(p11), S(r11j), R(r11j), S(f2), R(f2),
        Unlock(r11j), Unlock(p11), Unlock(f1), Unlock(f2), Unlock(db), …

A Serializable Schedule

    T1: IX(db), IX(f1)
    T2: IX(db)
    T3: IS(db), IS(f1), IS(p11)
    T1: IX(p11), X(r111)
    T2: IX(f1), X(p12)
    T3: S(r11j)
    T1: IX(f2), IX(p21), X(r211), Unlock(r211), Unlock(p21), Unlock(f2)
    T3: S(f2)
    T2: Unlock(p12), Unlock(f1), Unlock(db)
    T1: Unlock(r111), Unlock(p11), Unlock(f1), Unlock(db)
    T3: Unlock(r11j), Unlock(p11), Unlock(f1), Unlock(f2), Unlock(db)

In this schedule the transactions do not block each other.

Locking in B+ Trees

• How can we efficiently lock a particular leaf node?
• One solution: Ignore the tree structure and just lock pages while traversing the tree, following 2PL.
• This has terrible performance!
– The root node (and many higher-level nodes) become bottlenecks because every tree access begins at the root.
• Can we simply use multiple granularity locking?

[Figure: B+ tree whose leaf level holds the data entries, which point to the data records.]

Two Useful Observations

• Higher levels of the tree only direct searches for leaf pages.
• For inserts, a node on a path from the root to the modified leaf must be locked (in X mode, of course) only if a split can propagate up to it from the modified leaf. (A similar point holds w.r.t. deletes.)

[Figure: B+ tree with nodes labeled ROOT and A to I; index keys 20, 35, 23, 38, 44; leaf entries 20*, 22*, 23*, 24*, 35*, 36*, 38*, 41*, 44*.]

Do:
1) Search 38*
2) Delete 38*
3) Insert 45*
4) Insert 25*

A Simple Tree Locking Algorithm

• Search: Start at the root and go down; repeatedly, S lock the child, then unlock the parent.
• Insert/Delete: Start at the root and go down, obtaining X locks as needed. Once a child is locked, check whether it is safe:
– If the child is safe, release all locks on its ancestors.
• Safe node: A node such that changes will not propagate up beyond it.
– Inserts: The node is not full.
– Deletes: The node is not half-empty.

[Figure: the same B+ tree and operations as on the previous slide (Search 38*, Delete 38*, Insert 45*, Insert 25*).]
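The descent rules of the simple tree locking algorithm can be traced without building a real B+ tree. The sketch below assumes each node on the root-to-leaf path is described only by a name, a key count, and a capacity; the Node class, the function names, and the example capacities are hypothetical and are not taken from the tree in the figure.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    num_keys: int
    capacity: int

def safe_for_insert(node):
    """A node is safe for an insert if it is not full, so a split stops here."""
    return node.num_keys < node.capacity

def x_locks_held_for_insert(path):
    """Names of nodes still X-locked when an insert reaches the leaf."""
    held = []
    for node in path:
        held.append(node.name)        # X lock the child
        if safe_for_insert(node):
            held = [node.name]        # safe: release all locks on ancestors
    return held

def search_lock_trace(path):
    """Sequence of S-lock and unlock steps for a search along the path."""
    trace = []
    parent = None
    for node in path:
        trace.append(("S-lock", node.name))
        if parent is not None:
            trace.append(("unlock", parent.name))
        parent = node
    return trace

# Hypothetical path: only B has room, so locks on B, C, and the leaf are kept.
path = [Node("A", 2, 2), Node("B", 1, 2), Node("C", 2, 2), Node("leaf", 2, 2)]
print(x_locks_held_for_insert(path))
print(search_lock_trace(path[:2]))
```

A delete would use the same descent with the safety test replaced by "the node is not half-empty".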