DATABASE MANAGEMENT SYSTEMS
UNIT IV- TRANSACTIONS
Objectives
To know the definition of transaction
To learn ACID properties
To explain various locking protocols
To define deadlock
To explain serializability
4.1 Transaction Concepts
A transaction is a unit of program execution that accesses and possibly updates various data items.
• A transaction must see a consistent database.
• During transaction execution the database may be temporarily inconsistent.
• When the transaction completes successfully (is committed), the database must be consistent.
• After a transaction commits, the changes it has made to the database persist, even if there are system
failures.
• Multiple transactions can execute in parallel.
• Two main issues to deal with:
– Failures of various kinds, such as hardware failures and system crashes
– Concurrent execution of multiple transactions
4.2 ACID Property
To preserve the integrity of data, the database system must ensure:
• Atomicity. Either all operations of the transaction are properly reflected in the database or none are.
• Consistency. Execution of a transaction in isolation preserves the consistency of the database.
• Isolation. Although multiple transactions may execute concurrently, each transaction must be unaware of
other concurrently executing transactions. Intermediate transaction results must be hidden from other
concurrently executed transactions.
– That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti
started, or Tj started execution after Ti finished.
• Durability. After a transaction completes successfully, the changes it has made to the database persist,
even if there are system failures.
Example of Fund Transfer
• Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A – 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
• Atomicity requirement — if the transaction fails after step 3 and before step 6, the system should ensure
that its updates are not reflected in the database, else an inconsistency will result.
• Consistency requirement – the sum of A and B is unchanged by the execution of the transaction.
• Isolation requirement — if between steps 3 and 6, another transaction is allowed to access the partially
updated database, it will see an inconsistent database (the sum A + B will be less than it should be).
• Isolation can be ensured trivially by running transactions serially, that is one after the other.
• However, executing multiple transactions concurrently has significant benefits, as we will see later.
• Durability requirement — once the user has been notified that the transaction has completed (i.e., the
transfer of the $50 has taken place), the updates to the database by the transaction must persist despite
failures.
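The transfer above can be sketched with Python's sqlite3 module. The table layout and account names here are illustrative assumptions, not part of the original example; the try/except shape shows how atomicity (rollback on failure) and durability (commit) map onto code:

```python
import sqlite3

# Manual transaction control: we issue BEGIN/COMMIT/ROLLBACK ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 100)")

def transfer(conn, src, dst, amount):
    """Atomically move `amount` from src to dst; roll back on any failure."""
    try:
        conn.execute("BEGIN")
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        conn.execute("COMMIT")    # durability: changes persist from here on
    except Exception:
        conn.execute("ROLLBACK")  # atomicity: none of the updates survive
        raise

transfer(conn, 'A', 'B', 50)
print(dict(conn.execute("SELECT name, balance FROM account ORDER BY name")))
# {'A': 50, 'B': 150} -- the sum A + B is unchanged (consistency)
```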
4.3 System Recovery
An Overview
After some failure, we must be able to restore the database to a state that is known to be correct.
A failure may be caused by:
Bugs in application programs, the operating system, the database system, ...
Hardware errors on devices, channels, CPU, or memory
Operator errors
External causes: power failure, fire, high temperature, lightning, ...
To ensure that the database can be recovered after a failure, the following actions are required during
normal operation:
Database backup
Periodically the entire database is copied to archival storage. This copy should be stored in
a safe place.
Database journalizing (logging)
Every time a change is made to the database, a record containing the new value (after-
image, redo part) and, possibly, also the old value (before-image, undo part) is written to a
special file called the journal. The journal is often kept in duplicate.
Checkpointing- is an operation that synchronizes the journal and the database, in the simplest case by
suspending all processing, and performing all the pending writing.
Types of failure and recovery action required:
Transaction-local failure only one transaction affected, database not damaged. Perform a
ROLLBACK, i.e. undo whatever changes the transaction has made to the database.
ROLLBACK issued either by the transaction or by the system.
System-wide failure, database not damaged.
All transactions in progress affected. Undo the changes made by any transaction in
progress at the time of failure. Redo every committed transaction for which it is not
known whether all its changes have physically been written to the database. Possibly,
restart the transactions that were rolled back.
System-wide failure, database damaged
Restore the database from the latest back-up copy and redo all committed transactions.
This may be a very slow process.
Incremental journal with immediate updates
Log structure:
Begin transaction record: transaction number, (input message)
Sequence of change records: transaction number, page address, old value, new value
End transaction record: transaction number, commit/rollback
Before a modified page is physically written to the database, the journal record (at least the undo
part) must be physically written to the journal (not just placed in the journal buffer). Before the buffer
manager writes a modified page to the database it must make sure that all journal records related to
that page have been written to the journal. The simplest solution is to use the journal buffer. Before
the transaction commits, all its journal records and the end-transaction record must be physically
written to the journal. Again the simplest solution is to use the log buffer.
Transaction UNDO
Scan backwards through the journal
For every record related to the transaction to be undone: undo the change by rewriting the old
value (before-image) from the change record until reaching the begin-transaction record.
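The backward scan above can be sketched over a toy in-memory journal. The record layout (kind, transaction, page, before-image, after-image) is an assumption chosen for illustration, not a real log format:

```python
# Toy journal: each change record carries (kind, txn, page, old, new).
journal = [
    ("begin", "T1", None, None, None),
    ("change", "T1", "p1", 80, 75),
    ("change", "T1", "p2", 40, 45),
]
pages = {"p1": 75, "p2": 45}   # database state after T1's writes were applied

def undo(txn, journal, pages):
    """Scan backwards, rewriting the before-image of every change by `txn`."""
    for rec in reversed(journal):
        kind, t = rec[0], rec[1]
        if t != txn:
            continue
        if kind == "begin":        # stop at the begin-transaction record
            break
        _, _, page, old, _new = rec
        pages[page] = old          # restore the before-image

undo("T1", journal, pages)
print(pages)  # {'p1': 80, 'p2': 40}
```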
Cascading rollbacks
When rollback of a transaction forces rollback of another transaction.
If some other transaction has been allowed to read or write a page modified by the transaction
being rolled back, then that transaction must also be rolled back.
May occur as soon as transactions are allowed to see uncommitted changes (dirty reads).
Note that this may force rollback of a transaction that has already committed, which is NOT
ACCEPTABLE.
To avoid cascading rollback, before a transaction commits, any data written by it are not
allowed to be read by other transactions.
Checkpointing
A checkpoint synchronizes the log and the contents of the database by making sure that all
writes have been physically performed.
Taking a checkpoint consists of the following steps:
1. Suspend transaction processing
2. Physically write out all log buffers
3. Physically write out all modified pages in the page buffer
4. Write a checkpoint record to the log: a list of active transactions and (possibly) pointers
to their most recent log records
5. Record the address of the checkpoint record in a "restart file"
6. Resume transaction processing
During recovery, only the following transactions need to be considered: transactions that started after
the most recent checkpoint, and transactions that were active at the time of the most recent
checkpoint.
Restart procedure
Scan the log forwards from the most recent checkpoint, identifying the type of each transaction
recorded
T1: started before the checkpoint and had an end-transaction record
T2: started after the checkpoint and had an end-transaction record
T3: started before the checkpoint and had no end-transaction record
T4: started after the checkpoint and had no end-transaction record
Scan backwards undoing all transactions of type T3 and T4.
Scan forwards from the most recent checkpoint redoing (rewriting the new values) all committed
transactions of type T1 and T2. Ignore all transactions that were rolled back.
Resume transaction processing, possibly restarting transactions of type T3 and T4.
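The classification into types T1..T4 can be sketched as a forward scan from the checkpoint. The log record format and transaction names are hypothetical; the point is which sets end up redone or undone:

```python
def restart_sets(active_at_checkpoint, log_after_checkpoint):
    """Classify transactions seen after the checkpoint into redo/undo sets."""
    started_after, ended = set(), {}
    for rec in log_after_checkpoint:           # forward scan from checkpoint
        kind, txn = rec[0], rec[1]
        if kind == "begin":
            started_after.add(txn)
        elif kind == "end":
            ended[txn] = rec[2]                # "commit" or "rollback"
    redo, undo = set(), set()
    for txn in active_at_checkpoint | started_after:
        if txn in ended:
            if ended[txn] == "commit":         # types T1 and T2: redo
                redo.add(txn)                  # rolled-back ones are ignored
        else:
            undo.add(txn)                      # types T3 and T4: undo
    return redo, undo

log = [("begin", "B"), ("end", "A", "commit"), ("end", "B", "rollback"),
       ("begin", "C")]
redo, undo_set = restart_sets({"A", "D"}, log)
print(sorted(redo), sorted(undo_set))  # ['A'] ['C', 'D']
```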
Incremental log with deferred updates
All writes to the database are deferred until the transaction commits.
Transaction rollback is now trivial: omit the database writes and write an end-transaction record
indicating rollback to the log.
Begin-transaction records, end-transaction records, and change records (omitting the old value)
are written to the log as before.
The end-transaction record can be written as soon as all change records have been forced out and
before the database writes have been performed.
During restart no transactions need to be undone. Committed transactions must still be redone
because the changes may not have been physically written at the time of failure.
4.4 Two Phase Commit
Two Phase Commit is the process by which a relational database ensures that distributed transactions are
performed in an orderly manner. In this system, transactions may be terminated by either committing them or
rolling them back.
A commit operation is, by definition, an all-or-nothing affair. If a series of operations bound as a transaction
cannot be completed, the rollback must restore the system (or cooperating systems) to the pre-transaction
state. In order to ensure that a transaction can be rolled back, a software system typically logs each operation,
including the commit operation itself. A transaction/recovery manager uses the log records to undo (and
possibly redo) a partially completed transaction. When a transaction involves multiple distributed resources,
for example, a database server on each of two different network hosts, the commit process is somewhat
complex because the transaction includes operations that span two distinct software systems, each with its
own resource manager, log records, and so on. (In this case, the distributed resources are the database
servers.)
Two-phase commit is a transaction protocol designed for the complications that arise with distributed
resource managers. With a two-phase commit protocol, the distributed transaction manager employs a
coordinator to manage the individual resource managers.
The commit process proceeds as follows:
Phase 1
o Each participating resource manager coordinates local operations and forces all log records
out:
o If successful, respond "OK"
o If unsuccessful, either allow a time-out or respond "OOPS"
Phase 2
o If all participants respond "OK":
Coordinator instructs participating resource managers to "COMMIT"
Participants complete operation writing the log record for the commit
o Otherwise:
Coordinator instructs participating resource managers to "ROLLBACK"
Participants complete their respective local undos
In order for the scheme to work reliably, both the coordinator and the participating resource managers
independently must be able to guarantee proper completion, including any necessary restart/redo operations.
The algorithms for guaranteeing success by handling failures at any stage are provided in advanced database
texts.
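The two phases above can be sketched as plain decision logic. The participants are simulated objects with a hypothetical `healthy` flag standing in for whether the prepare step succeeds; real systems must also log the coordinator's decision for restart:

```python
class Participant:
    """A simulated resource manager; `healthy` controls its Phase 1 vote."""
    def __init__(self, name, healthy=True):
        self.name, self.healthy, self.state = name, healthy, "active"
    def prepare(self):
        # Phase 1: force all log records out, then vote.
        return "OK" if self.healthy else "OOPS"
    def commit(self):
        self.state = "committed"    # Phase 2: write the commit log record
    def rollback(self):
        self.state = "rolled back"  # Phase 2: perform the local undo

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]   # Phase 1
    if all(v == "OK" for v in votes):             # Phase 2: unanimous OK
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:                        # any OOPS (or timeout)
        p.rollback()
    return "rolled back"

print(two_phase_commit([Participant("db1"), Participant("db2")]))
# committed
print(two_phase_commit([Participant("db1"), Participant("db2", healthy=False)]))
# rolled back
```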
4.5 Save Points
The SAVEPOINT statement sets a named transaction savepoint with a name of identifier. If the current
transaction has a savepoint with the same name, the old savepoint is deleted and a new one is set. The
ROLLBACK TO SAVEPOINT statement rolls back a transaction to the named savepoint without
terminating the transaction. (The SAVEPOINT keyword is optional as of MySQL 5.0.3.) Modifications that
the current transaction made to rows after the savepoint was set are undone in the rollback, but InnoDB does
not release the row locks that were stored in memory after the savepoint. (For a new inserted row, the lock
information is carried by the transaction ID stored in the row; the lock is not separately stored in memory. In
this case, the row lock is released in the undo.) Savepoints that were set at a later time than the named
savepoint are deleted.
If the ROLLBACK TO SAVEPOINT statement returns the following error, it means that no savepoint with
the specified name exists: The RELEASE SAVEPOINT statement removes the named savepoint from the set
of savepoints of the current transaction. No commit or rollback occurs. It is an error if the savepoint does not
exist. All savepoints of the current transaction are deleted if you execute a COMMIT, or a ROLLBACK that
does not name a savepoint.
Savepoints are useful in situations where errors are unlikely to occur. The use of a savepoint to roll back part
of a transaction in the case of an infrequent error can be more efficient than having each transaction test to
see if an update is valid before making the update. Updates and rollbacks are expensive operations, so
savepoints are effective only if the probability of encountering the error is low and the cost of checking the
validity of an update beforehand is relatively high.
4.6 SQL Facilities for recovery
SQL provides three analogs of BEGIN TRANSACTION, COMMIT and ROLLBACK, called START
TRANSACTION, COMMIT WORK and ROLLBACK WORK, respectively. Here is the syntax of START
TRANSACTION:
START TRANSACTION <option commalist>;
The <option commalist> specifies an access mode and an isolation level.
The access mode is either READ ONLY or READ WRITE. If neither is specified, READ
WRITE is assumed, unless the READ UNCOMMITTED isolation level is specified, in which case
READ ONLY is assumed. If READ WRITE is specified, the isolation level must not be
READ UNCOMMITTED.
The isolation level takes the form ISOLATION LEVEL <isolation>, where <isolation> is
READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, or
SERIALIZABLE.
SQL also supports savepoints. The statement
SAVEPOINT <savepoint name>;
creates a savepoint with the specified user-chosen name. The statement
ROLLBACK TO <savepoint name>;
undoes all updates done since the specified savepoint, and the statement
RELEASE <savepoint name>;
drops the specified savepoint, meaning it is no longer possible to execute a ROLLBACK to that savepoint.
All savepoints are automatically dropped at transaction termination.
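The savepoint statements can be exercised through Python's sqlite3 module, which supports the same SAVEPOINT / ROLLBACK TO / RELEASE syntax. The table name is an illustrative assumption:

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # manual control
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SAVEPOINT sp1")               # named savepoint inside the txn
conn.execute("INSERT INTO t VALUES (2)")
conn.execute("ROLLBACK TO SAVEPOINT sp1")   # undoes only the insert of 2
conn.execute("RELEASE SAVEPOINT sp1")       # sp1 can no longer be targeted
conn.execute("COMMIT")
print([row[0] for row in conn.execute("SELECT x FROM t")])  # [1]
```

Note that ROLLBACK TO leaves the surrounding transaction open; only the work since the savepoint is undone.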
4.7 Concurrency
Concurrency is the ability of the DBMS to process more than one transaction at a time. This section
briefly overviews several problems that can occur when concurrent transactions execute in an uncontrolled
manner. Concrete examples are given to illustrate the problems in detail. The related activities and learning
tasks that follow give you a chance to evaluate the extent of your understanding of the problems. An
important learning objective for this section of the unit is to understand the different types of problems of
concurrent executions in OLTP and appreciate the need for concurrency control.
4.8 Need for Concurrency
If transactions are executed serially, i.e., sequentially with no overlap in time, no transaction concurrency
exists. However, if concurrent transactions with interleaving operations are allowed in an uncontrolled
manner, some unexpected result may occur. Here are some typical examples:
1. The lost update problem: A second transaction writes a second value of a data-item (datum) on top of
a first value written by a first concurrent transaction, and the first value is lost to other transactions
running concurrently which need, by their precedence, to read the first value. The transactions that
have read the wrong value end with incorrect results.
2. The dirty read problem: Transactions read a value written by a transaction that has been later aborted.
This value disappears from the database upon abort, and should not have been read by any transaction
("dirty read"). The reading transactions end with incorrect results.
3. The incorrect summary problem: While one transaction takes a summary over values of a repeated
data-item, a second transaction updated some instances of that data-item. The resulting summary does
not reflect a correct result for any (usually needed for correctness) precedence order between the two
transactions (if one is executed before the other), but rather some random result, depending on the
timing of the updates, and whether a certain update result has been included in the summary or not.
We illustrate some of the problems by referring to a simple airline reservation database in which each record
is stored for each airline flight. Each record includes the number of reserved seats on that flight as a named
data item, among other information. Recall the two transactions T1 and T2 introduced previously.
Transaction T1 cancels N reservations from one flight whose number of reserved seats is stored in the
database item named X, and reserves the same number of seats on another flight whose number of reserved
seats is stored in the database item named Y. A simpler transaction T2 just reserves M seats on the first flight
referenced in transaction T1. To simplify the example, the additional portions of the transactions are not
shown, such as checking whether a flight has enough seats available before reserving additional seats.
When an airline reservation database program is written, it has the flight numbers, their dates, and the
number of seats available for booking as parameters; hence, the same program can be used to execute many
transactions, each with different flights and number of seats to be booked. For concurrency control purpose,
a transaction is a particular execution of a program on a specific date, flight, and number of seats. The
transactions T1 and T2 are specific executions of the programs that refer to the specific flights whose
numbers of seats are stored in data item X and Y in the database. Now let’s discuss the types of problems we
may encounter with these two transactions.
The Lost Update Problem
Lost update problem occurs when two transactions that access the same database items have their operations
interleaved in a way that makes the value of some database item incorrect. That is, interleaved use of the
same data item can cause problems when an update operation from one transaction overwrites an
update from a second transaction.
An example will explain the problem clearly. Suppose the two transactions T1 and T2 introduced previously
are submitted at approximately the same time. It is possible when two travel agency staff help customers to
book their flights at more or less the same time from different or same office. Suppose that their operations
are interleaved by the operating system as shown in the figure below
The above interleaved operation will lead to an incorrect value for data item X, because at time step 3, T2
reads in the original value of X, which is before T1 changes it in the database, and hence the updated value
resulting from T1 is lost. For example, if X = 80, originally there were 80 reservations on the flight, N = 5,
T1 cancels 5 seats on the flight corresponding to X and reserves them on the flight corresponding to Y, and
M = 4, T2 reserves 4 seats on X. The final result should be X = 80 – 5 + 4 = 79; but in the concurrent
operations of figure 9.5, it is X = 84 because the update that cancelled 5 seats in T1 was lost. The detailed
value updating in the flight reservation database taking the above example is shown below .
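The interleaving can be traced step by step in code. This sketch simulates the two transactions' local variables directly, using the same numbers as the example (X = 80, N = 5, M = 4):

```python
X = 80                  # 80 reservations on the flight initially
# Interleaved schedule:
t1_local = X            # T1: read(X)  -> 80
t2_local = X            # T2: read(X)  -> 80 (before T1 writes back)
t1_local -= 5           # T1: X := X - N
X = t1_local            # T1: write(X) -> X = 75
t2_local += 4           # T2: X := X + M (computed from the stale 80)
X = t2_local            # T2: write(X) -> X = 84; T1's update is lost
print(X)                # 84, but a serial execution gives 80 - 5 + 4 = 79
```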
Uncommitted Dependency (or Dirty Read / Temporary Update)
Uncommitted dependency occurs when a transaction is allowed to retrieve or (worse) update a record that
has been updated by another transaction but has not yet been committed by that other transaction. Because it
has not yet been committed, there is always a possibility that it will never be committed but it will be rolled
back instead, in which case, the first transaction will have used some data that is now incorrect, a dirty read
for the first transaction.
The figure below shows an example where T1 updates item X and then fails before completion, so the
system must change X back to its original value. Before it can do so, however, transaction T2 reads the
“temporary” value of X, which will not be recorded permanently in the database because of the failure of T1.
The value of item X that is read by T2 is called dirty data, because it has been created by a transaction that
has not completed and committed yet; hence this problem is also known as the dirty read problem. Since the
dirty data read in by T2 is only a temporary value of X, the problem is sometimes also called the temporary
update problem.
The rollback of transaction T1 may be due to a system crash, and transaction T2 may already have terminated
by that time, in which case the crash would not cause a ROLLBACK to be issued for T2. The following
situation is even more unacceptable.
In the above example, not only does transaction T2 become dependent on an uncommitted change at time
step 6 but it also loses an update at time step 7, because the ROLLBACK in T1 causes data item X to be
restored to its value before time step 1.
Inconsistent Analysis
Inconsistent analysis occurs when a transaction reads several values but a second transaction updates
some of these values during the execution of the first. This problem is significant, for example, if one
transaction is calculating an aggregate summary function on a number of records while other transactions are
updating some of these records, the aggregate function may calculate some values before they are updated
and others after they are updated. This causes an inconsistency.
For example, suppose that a transaction T3 is calculating the total number of reservations on all the flights;
meanwhile, transaction T1 is executing. If the interleaving of operations shown below occurs, the result of
T3 will be off by amount of N, because T3 reads the value of X after N seats have been subtracted from it
but reads the value of Y before those N seats have been added to it.
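The off-by-N result can be traced directly. The initial value of Y below is an assumption for illustration; only X = 80 and N = 5 come from the running example:

```python
db = {"X": 80, "Y": 40}   # Y's initial value is assumed for illustration
N = 5

total = 0
db["X"] -= N              # T1: subtracts N from X and writes it
total += db["X"]          # T3: reads X *after* the subtraction -> 75
total += db["Y"]          # T3: reads Y *before* the addition   -> 40
db["Y"] += N              # T1: adds N to Y and writes it
print(total)              # 115, off by N from the correct sum 120
```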
4.9 Locking Protocols
One of the fundamental properties of a transaction is isolation. When several transactions execute concurrently in
the database, however, the isolation property may no longer be preserved. To ensure that it is, the system
must control the interaction among the concurrent transactions; this control is achieved through one of a
variety of mechanisms called concurrency-control schemes. One way to ensure serializability is to require
that data items be accessed in a mutually exclusive manner; that is, while one transaction is accessing a data
item, no other transaction can modify that data item. The most common method used to implement this
requirement is to allow a transaction to access a data item only if it is currently holding a lock on that item.
Locks
There are various modes in which a data item may be locked. In this section, we restrict our attention to two
modes:
1. Shared. If a transaction Ti has obtained a shared-mode lock (denoted by S) on item Q, then Ti can read,
but cannot write, Q.
2. Exclusive. If a transaction Ti has obtained an exclusive-mode lock (denoted by X) on item Q, then Ti can
both read and write Q.
Granting of Locks
When a transaction requests a lock on a data item in a particular mode, and no other transaction has a lock on
the same data item in a conflicting mode, the lock can be granted. However, care must be taken to avoid the
following scenario. Suppose a transaction T2 has a shared-mode lock on a data item, and another transaction
T1 requests an exclusive-mode lock on the data item. Clearly, T1 has to wait for T2 to release the shared-
mode lock. Meanwhile, a transaction T3 may request a shared-mode lock on the same data item. The lock
request is compatible with the lock granted to T2, so T3 may be granted the shared-mode lock. At this point
T2 may release the lock, but still T1 has to wait for T3 to finish. But again, there may be a new transaction T4
that requests a shared-mode lock on the same data item, and is granted the lock before T3 releases it. In fact,
it is possible that there is a sequence of transactions that each requests a shared-mode lock on the data item,
and each transaction releases the lock a short while after it is granted, but T1 never gets the exclusive-mode
lock on the data item. The transaction T1 may never make progress, and is said to be starved. We can avoid
starvation of transactions by granting locks in the following manner: When a transaction Ti requests a lock
on a data item Q in a particular mode M, the concurrency-control manager grants the lock provided that
1. There is no other transaction holding a lock on Q in a mode that conflicts with M.
2. There is no other transaction that is waiting for a lock on Q, and that made its lock request before Ti.
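The two grant conditions can be sketched as a predicate over an S/X compatibility table. The function shape and queue representation are illustrative assumptions:

```python
# Compatibility of (held mode, requested mode) for shared and exclusive locks.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

def can_grant(mode, granted, waiting_queue):
    """granted: modes already held on Q; waiting_queue: earlier requests."""
    if any(not COMPATIBLE[(g, mode)] for g in granted):
        return False        # rule 1: a conflicting lock is held
    if waiting_queue:
        return False        # rule 2: never bypass an earlier waiter
    return True

print(can_grant("S", ["S"], []))     # True: S is compatible with S
print(can_grant("S", ["S"], ["X"]))  # False: an earlier X request is waiting
```

Rule 2 is what prevents the starvation scenario above: once T1's exclusive request is queued, later shared requests such as T3's must wait behind it.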
4.10 Two Phase Locking
One protocol that ensures serializability is the two-phase locking protocol. This protocol requires that each
transaction issue lock and unlock requests in two phases:
1. Growing phase. A transaction may obtain locks, but may not release any lock.
2. Shrinking phase. A transaction may release locks, but may not obtain any new locks.
Initially, a transaction is in the growing phase. The transaction acquires locks as needed. Once the transaction
releases a lock, it enters the shrinking phase, and it can issue no more lock requests. For example,
transactions T3 and T4 are two phase. On the other hand, transactions T1 and T2 are not two phase. Note that
the unlock instructions do not need to appear at the end of the transaction. For example, in the case of
transaction T3, we could move the unlock(B) instruction to just after the lock-X(A) instruction, and still
retain the two-phase locking property. Cascading rollbacks can be avoided by a modification of two-phase
locking called the strict two-phase locking protocol. This protocol requires not only that locking be two
phase, but also that all exclusive-mode locks taken by a transaction be held until that transaction commits.
This requirement ensures that any data written by an uncommitted transaction are locked in exclusive mode
until the transaction commits, preventing any other transaction from reading the data. Another variant of
two-phase locking is the rigorous two-phase locking protocol, which requires that all locks be held until
the transaction commits.
Strict two-phase locking and rigorous two-phase locking (with lock conversions) are used extensively in
commercial database systems. A simple but widely used scheme automatically generates the appropriate lock
and unlock instructions for a transaction, on the basis of read and write requests from the transaction:
• When a transaction Ti issues a read(Q) operation, the system issues a lock- S(Q) instruction followed by the
read(Q) instruction.
• When Ti issues a write(Q) operation, the system checks to see whether Ti already holds a shared lock on Q.
If it does, then the system issues an upgrade( Q) instruction, followed by the write(Q) instruction. Otherwise,
the system issues a lock-X(Q) instruction, followed by the write(Q) instruction.
• All locks obtained by a transaction are unlocked after that transaction commits or aborts.
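The growing/shrinking discipline can be sketched as a small guard class. This is an illustrative enforcement check, not how a lock manager is actually implemented:

```python
class TwoPhaseTxn:
    """Rejects any lock request issued after the first unlock (2PL rule)."""
    def __init__(self):
        self.locks, self.shrinking = set(), False
    def lock(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: lock after unlock")
        self.locks.add(item)            # growing phase
    def unlock(self, item):
        self.shrinking = True           # the growing phase ends here
        self.locks.discard(item)

t = TwoPhaseTxn()
t.lock("A"); t.lock("B"); t.unlock("B")
try:
    t.lock("C")                         # illegal: shrinking phase has begun
except RuntimeError as e:
    print(e)                            # 2PL violation: lock after unlock
```

Strict 2PL corresponds to never calling unlock on exclusive locks until commit; rigorous 2PL holds every lock until commit.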
Implementation of Locking
A Lock manager can be implemented as a separate process to which transactions send lock and unlock
requests
The lock manager replies to a lock request by sending a lock grant message (or a message asking the
transaction to roll back, in case of a deadlock)
The requesting transaction waits until its request is answered
The lock manager maintains a data structure called a lock table to record granted locks and pending
requests
The lock table is usually implemented as an in-memory hash table indexed on the name of the data item
being locked
Lock Table
Black rectangles indicate granted locks, white ones indicate waiting requests
Lock table also records the type of lock granted or requested
New request is added to the end of the queue of requests for the data item, and granted if it is compatible
with all earlier locks
Unlock requests result in the request being deleted, and later requests are checked to see if they can now
be granted. If a transaction aborts, all waiting or granted requests of the transaction are deleted. The lock
manager may keep a list of locks held by each transaction, to implement this efficiently
Timestamp-Based Protocols
Each transaction is issued a timestamp when it enters the system. If an old transaction Ti has time-stamp
TS(Ti), a new transaction Tj is assigned time-stamp TS(Tj) such that TS(Ti) <TS(Tj).
The protocol manages concurrent execution such that the time-stamps determine the serializability order.
In order to assure such behavior, the protocol maintains for each data Q two timestamp values:
W-timestamp(Q) is the largest time-stamp of any transaction that executed write(Q) successfully.
R-timestamp(Q) is the largest time-stamp of any transaction that executed read(Q) successfully.
The timestamp ordering protocol ensures that any conflicting read and write operations are executed in
timestamp order.
Suppose a transaction Ti issues a read(Q).
1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten. Hence, the
read operation is rejected, and Ti is rolled back.
2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to the
maximum of R-timestamp(Q) and TS(Ti).
Suppose that transaction Ti issues write(Q).
If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously, and the
system assumed that that value would never be produced. Hence, the write operation is rejected, and Ti is
rolled back.
If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, this write
operation is rejected, and Ti is rolled back.
Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti).
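The read and write checks can be sketched as two small functions over per-item R- and W-timestamps. Representing an item as a dict is an illustrative assumption:

```python
def ts_read(ts_i, item):
    """Timestamp-ordering check for read(Q)."""
    if ts_i < item["W"]:
        return "rollback"    # Q was already overwritten by a newer txn
    item["R"] = max(item["R"], ts_i)
    return "read"

def ts_write(ts_i, item):
    """Timestamp-ordering check for write(Q)."""
    if ts_i < item["R"] or ts_i < item["W"]:
        return "rollback"    # a newer txn already read or wrote Q
    item["W"] = ts_i
    return "write"

q = {"R": 0, "W": 0}
print(ts_write(5, q))   # write    (W-timestamp(Q) becomes 5)
print(ts_read(3, q))    # rollback (TS(Ti)=3 < W-timestamp(Q)=5)
print(ts_read(7, q))    # read     (R-timestamp(Q) becomes 7)
```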
4.11 Intent Locks
The Database Engine uses intent locks to protect placing a shared (S) lock or exclusive (X) lock on a
resource lower in the lock hierarchy. Intent locks are named intent locks because they are acquired before a
lock at the lower level, and therefore signal intent to place locks at a lower level.
Intent locks serve two purposes:
To prevent other transactions from modifying the higher-level resource in a way that would
invalidate the lock at the lower level.
To improve the efficiency of the Database Engine in detecting lock conflicts at the higher level of
granularity.
For example, a shared intent lock is requested at the table level before shared (S) locks are requested on
pages or rows within that table. Setting an intent lock at the table level prevents another transaction from
subsequently acquiring an exclusive (X) lock on the table containing that page. Intent locks improve
performance because the Database Engine examines intent locks only at the table level to determine if a
transaction can safely acquire a lock on that table. This removes the requirement to examine every row or
page lock on the table to determine if a transaction can lock the entire table.
Intent locks include intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX).
Lock mode Description
Intent shared (IS)
Protects requested or acquired shared locks on some (but not all)
resources lower in the hierarchy.
Intent exclusive (IX)
Protects requested or acquired exclusive locks on some (but not all)
resources lower in the hierarchy. IX is a superset of IS, and it also
protects requesting shared locks on lower level resources.
Shared with intent exclusive
(SIX)
Protects requested or acquired shared locks on all resources lower in the
hierarchy and intent exclusive locks on some (but not all) of the lower
level resources. Concurrent IS locks at the top-level resource are allowed.
For example, acquiring a SIX lock on a table also acquires intent
exclusive locks on the pages being modified and exclusive locks on the
modified rows. There can be only one SIX lock per resource at one time,
preventing updates to the resource made by other transactions, although
other transactions can read resources lower in the hierarchy by obtaining
IS locks at the table level.
Intent update (IU)
Protects requested or acquired update locks on all resources lower in the
hierarchy. IU locks are used only on page resources. IU locks are
converted to IX locks if an update operation takes place.
Shared intent update (SIU)
A combination of S and IU locks, as a result of acquiring these locks
separately and simultaneously holding both locks. For example, a
transaction executes a query with the PAGLOCK hint and then executes
an update operation. The query with the PAGLOCK hint acquires the S
lock, and the update operation acquires the IU lock.
Update intent exclusive (UIX) A combination of U and IX locks, as a result of acquiring these locks
separately and simultaneously holding both locks.
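The compatibility rules behind these modes can be captured in a small matrix. The following Python sketch uses the standard textbook multigranularity matrix for IS, IX, S, SIX, and X (the function name and data layout are illustrative, not part of any real lock manager):

```python
# Which modes each held mode is compatible with (textbook multigranularity matrix).
COMPATIBLE = {
    "IS":  {"IS", "IX", "S", "SIX"},
    "IX":  {"IS", "IX"},
    "S":   {"IS", "S"},
    "SIX": {"IS"},
    "X":   set(),
}

def can_grant(requested, held_modes):
    """Grant `requested` only if it is compatible with every mode already held."""
    return all(requested in COMPATIBLE[h] for h in held_modes)

print(can_grant("X", {"IS"}))    # False: a table-level IS lock blocks an X lock
print(can_grant("IS", {"SIX"}))  # True: concurrent IS locks are allowed under SIX
```

This is exactly the check described above: the engine consults only the table-level intent modes instead of scanning every row or page lock.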
4.12 Deadlock
A system is in a deadlock state if there exists a set of transactions such that every transaction in the set is
waiting for another transaction in the set. More precisely, there exists a set of waiting transactions {T0, T1, . .
., Tn} such that T0 is waiting for a data item that T1 holds, and T1 is waiting for a data item that T2 holds,
and . . ., and Tn−1 is waiting for a data item that Tn holds, and Tn is waiting for a data item that T0 holds.
None of the transactions can make progress in such a situation. The only remedy to this undesirable situation
is for the system to invoke some drastic action, such as rolling back some of the transactions involved in the
deadlock. Rollback of a transaction may be partial: That is, a transaction may be rolled back to the point
where it obtained a lock whose release resolves the deadlock. There are two principal methods for dealing
with the deadlock problem. We can use a deadlock prevention protocol to ensure that the system will never
enter a deadlock state. Alternatively, we can allow the system to enter a deadlock state, and then try to
recover by using a deadlock detection and deadlock recovery scheme. As we shall see, both methods may
result in transaction rollback. Prevention is commonly used if the probability that the system would enter a
deadlock state is relatively high; otherwise, detection and recovery are more efficient. Note that a detection
and recovery scheme requires overhead that includes not only the run-time cost of maintaining the necessary
information and of executing the detection algorithm, but also the potential losses inherent in recovery from
a deadlock.
Deadlock prevention protocols ensure that the system will never enter a deadlock state. Some prevention strategies:
• Require that each transaction locks all its data items before it begins execution (predeclaration).
• Impose a partial ordering on all data items and require that a transaction lock data items only in the order specified by the partial order (graph-based protocol).
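The ordering strategy can be sketched in Python: if every transaction acquires its locks in one agreed global order, no cyclic wait can form. The item names and ranks below are hypothetical:

```python
import threading

# Hypothetical data items; ORDER gives the fixed global rank of each item.
locks = {"A": threading.Lock(), "B": threading.Lock()}
ORDER = {"A": 0, "B": 1}

def lock_in_order(items):
    """Acquire locks in the globally agreed order, so no cyclic wait can arise."""
    ordered = sorted(items, key=ORDER.get)
    for name in ordered:
        locks[name].acquire()
    return ordered

def unlock(items):
    for name in reversed(items):
        locks[name].release()

held = lock_in_order({"B", "A"})  # every transaction takes A before B
unlock(held)
```

Because both transactions would request A before B, the circular wait condition for deadlock can never hold.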
More Deadlock Prevention Strategies
The following schemes use transaction timestamps for the sake of deadlock prevention alone.
(i) wait-die scheme — non-preemptive
An older transaction may wait for a younger one to release a data item. Younger transactions never wait for older ones; they are rolled back instead.
A transaction may die several times before acquiring a needed data item.
(ii) wound-wait scheme — preemptive
An older transaction wounds (forces rollback of) a younger transaction instead of waiting for it. Younger transactions may wait for older ones.
There may be fewer rollbacks than in the wait-die scheme.
In both the wait-die and wound-wait schemes, a rolled-back transaction is restarted with its original timestamp. Older transactions thus have precedence over newer ones, and starvation is hence avoided.
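Both decisions reduce to a timestamp comparison (a smaller timestamp means an older transaction). A minimal sketch, with illustrative function names and return strings:

```python
def wait_die(ts_requester, ts_holder):
    """Non-preemptive: only an older requester (smaller timestamp) may wait."""
    return "wait" if ts_requester < ts_holder else "rollback requester"

def wound_wait(ts_requester, ts_holder):
    """Preemptive: an older requester wounds (rolls back) the younger holder."""
    return "rollback holder" if ts_requester < ts_holder else "wait"

print(wait_die(5, 10))    # older requester waits
print(wait_die(10, 5))    # younger requester dies (is rolled back)
print(wound_wait(5, 10))  # older requester wounds the younger holder
```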
(iii) timeout-based schemes
A transaction waits for a lock only for a specified amount of time. After that, the wait times out and the transaction is rolled back, so deadlocks are not possible.
These schemes are simple to implement, but starvation is possible, and it is difficult to determine a good value for the timeout interval.
Deadlock Detection
Deadlocks can be described by a wait-for graph, which consists of a pair G = (V, E), where
V is a set of vertices (all the transactions in the system), and
E is a set of edges; each element is an ordered pair Ti → Tj.
If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying that Ti is waiting for Tj to release a data item.
When Ti requests a data item currently being held by Tj, the edge Ti → Tj is inserted in the wait-for graph. This edge is removed only when Tj is no longer holding a data item needed by Ti.
The system is in a deadlock state if and only if the wait-for graph has a cycle. Must invoke a deadlock-
detection algorithm periodically to look for cycles.
[Figure: a wait-for graph without a cycle (left) and one with a cycle (right)]
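The periodic detection step is just a cycle search over the wait-for graph. A minimal depth-first-search sketch (transaction names and the dictionary representation are illustrative):

```python
def has_cycle(wait_for):
    """Detect a cycle in a wait-for graph given as {txn: set of txns it waits for}."""
    WHITE, GRAY, BLACK = 0, 1, 2
    nodes = set(wait_for) | {u for vs in wait_for.values() for u in vs}
    color = {t: WHITE for t in nodes}

    def visit(t):
        color[t] = GRAY                    # t is on the current DFS path
        for u in wait_for.get(t, ()):
            if color[u] == GRAY:           # back edge: Ti -> ... -> Ti
                return True
            if color[u] == WHITE and visit(u):
                return True
        color[t] = BLACK                   # fully explored, no cycle through t
        return False

    return any(color[t] == WHITE and visit(t) for t in nodes)

print(has_cycle({"T1": {"T2"}, "T2": {"T3"}, "T3": set()}))   # False: no deadlock
print(has_cycle({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}))  # True: deadlock
```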
Deadlock Recovery
When a deadlock is detected:
Some transaction will have to be rolled back (made a victim) to break the deadlock. Select as victim the transaction that will incur the minimum cost.
Rollback: determine how far to roll back the transaction.
Total rollback: abort the transaction and then restart it.
It is more effective to roll back the transaction only as far as necessary to break the deadlock.
Starvation happens if the same transaction is always chosen as victim. Include the number of rollbacks in the cost factor to avoid starvation.
Insert and Delete Operations
If two-phase locking is used :
A delete operation may be performed only if the transaction deleting the tuple has an exclusive lock on the
tuple to be deleted.
A transaction that inserts a new tuple into the database is given an X-mode lock on the tuple
Insertions and deletions can lead to the phantom phenomenon.
A transaction that scans a relation (e.g., find all accounts in Perryridge) and a transaction that inserts a
tuple in the relation (e.g., insert a new account at Perryridge) may conflict in spite of not accessing any tuple
in common.
If only tuple locks are used, non-serializable schedules can result: the scan transaction may not see the
new account, yet may be serialized before the insert transaction.
The transaction scanning the relation is reading information that indicates what tuples the relation
contains, while a transaction inserting a tuple updates the same information.
The information should be locked.
One solution:
Associate a data item with the relation, to represent the information about what tuples the relation
contains.
Transactions scanning the relation acquire a shared lock on the data item.
Transactions inserting or deleting a tuple acquire an exclusive lock on the data item. (Note: locks on the
data item do not conflict with locks on individual tuples.)
The above protocol provides very low concurrency for insertions and deletions.
Index locking protocols provide higher concurrency while preventing the phantom phenomenon, by
requiring locks on certain index buckets.
4.13 Serializability
Serializability is the classical concurrency scheme. It ensures that a schedule for executing concurrent
transactions is equivalent to one that executes the transactions serially in some order. It assumes that all
accesses to the database are done using read and write operations. A schedule is called "correct" if we can find a serial schedule that is "equivalent" to it. Given a set of transactions T1...Tn, two schedules S1 and S2 of these transactions are equivalent if the following conditions are satisfied:
Read-Write Synchronization: If a transaction reads a value written by another transaction in one schedule,
then it also does so in the other schedule.
Write-Write Synchronization: If a transaction overwrites the value of another transaction in one schedule, it
also does so in the other schedule.
These two properties ensure that there can be no difference in the effects of the two schedules. As an
example, consider the schedule in Figure 1. It is equivalent to a schedule in which T2 is executed after T1.
There are several approaches to enforcing serializability.
Conflict Serializability
Let us consider a schedule S in which there are two consecutive instructions Ii and Ij of transactions Ti and Tj, respectively (i ≠ j). If Ii and Ij refer to different data items, then we can swap Ii and Ij without affecting the results of any instruction in the schedule. However, if Ii and Ij refer to the same data item Q, then the order of the two steps may matter. Since we are dealing with only read and write instructions, there are four cases that we need to consider:
1. Ii = read(Q), Ij = read(Q). The order of Ii and Ij does not matter, since the same value of Q is read by Ti
and Tj , regardless of the order.
2. Ii = read(Q), Ij = write(Q). If Ii comes before Ij, then Ti does not read the value of Q that is written by Tj
in instruction Ij. If Ij comes before Ii, then Ti reads the value of Q that is written by Tj. Thus, the order of Ii
and Ij matters.
3. Ii = write(Q), Ij = read(Q). The order of Ii and Ij matters for reasons similar to those of the previous case.
4. Ii = write(Q), Ij = write(Q). Since both instructions are write operations, the order of these instructions
does not affect either Ti or Tj . However, the value obtained by the next read(Q) instruction of S is affected,
since the result of only the latter of the two write instructions is preserved in the database. If there is no other
write(Q) instruction after Ii and Ij in S, then the order of Ii and Ij directly affects the final value of Q in the
database state that results from schedule S.
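The four cases above are exactly the conflicts used to build a precedence graph: a schedule is conflict serializable if and only if that graph is acyclic. A Python sketch (the schedule representation as (transaction, operation, item) triples is an illustrative choice):

```python
def precedence_graph(schedule):
    """Edges Ti -> Tj for every conflict: same data item, different transactions,
    and at least one of the two operations is a write."""
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges

def conflict_serializable(schedule):
    """A schedule is conflict serializable iff its precedence graph is acyclic."""
    graph = {}
    for a, b in precedence_graph(schedule):
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    on_path, done = set(), set()

    def acyclic(n):
        if n in on_path:
            return False               # back edge: cycle found
        if n in done:
            return True
        on_path.add(n)
        ok = all(acyclic(m) for m in graph[n])
        on_path.discard(n)
        if ok:
            done.add(n)
        return ok

    return all(acyclic(n) for n in graph)

# T1 runs entirely before T2 on item Q: equivalent to the serial order T1, T2.
s1 = [("T1", "R", "Q"), ("T1", "W", "Q"), ("T2", "R", "Q"), ("T2", "W", "Q")]
# Writes interleave in both directions, so the precedence graph has a cycle.
s2 = [("T1", "R", "Q"), ("T2", "W", "Q"), ("T1", "W", "Q"), ("T2", "W", "Q")]
print(conflict_serializable(s1), conflict_serializable(s2))  # True False
```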
View Serializability
In this section, we consider a form of equivalence that is less stringent than conflict equivalence, but that,
like conflict equivalence, is based on only the read and write operations of transactions.
Consider two schedules S and S′, where the same set of transactions participates in both schedules. The schedules S and S′ are said to be view equivalent if three conditions are met:
1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must, in schedule S′, also read the initial value of Q.
2. For each data item Q, if transaction Ti executes read(Q) in schedule S, and if that value was produced by a write(Q) operation executed by transaction Tj, then the read(Q) operation of transaction Ti must, in schedule S′, also read the value of Q that was produced by the same write(Q) operation of transaction Tj.
3. For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule S must perform the final write(Q) operation in schedule S′.
4.14 Isolation Levels
The isolation level used during the execution of SQL statements determines the degree to which the activation group is isolated from concurrently executing activation groups. Thus, when activation group P executes an SQL statement, the isolation level determines:
* The degree to which rows retrieved by P and database changes made by P are available to other
concurrently executing activation groups.
* The degree to which database changes made by concurrently executing activation groups can affect P.
There are many advantages to separating read and write workloads: read-intensive applications typically want more index structures, data redundancies, and even other views of data, while transaction processing systems want the best write throughput while incurring only minimal overhead. The access patterns of readers and writers
typically differ: Readers are more prone to larger analysis types of queries, and writers are more prone to
singleton inserts, updates, and deletes. When these activities are separated, the administrator can focus on
recovery strategies for a smaller, more manageable transaction processing system. OLTP databases tend to
be much smaller than data redundant decision-support or analysis-oriented databases.
There are four isolation levels:
1. READ UNCOMMITTED
2. READ COMMITTED
3. REPEATABLE READ
4. SERIALIZABLE
1. Read uncommitted
When this level is used, SQL Server does not issue shared locks while reading data, so a transaction can read uncommitted data that might be rolled back later (a dirty read). This is the lowest isolation level; it ensures only that physically corrupt data will not be read.
2. Read committed
This is the default isolation level in SQL Server. When it is used, SQL Server holds shared locks while reading data. It ensures that physically corrupt data will not be read and that data another application has changed but not yet committed will never be read, but it does not ensure that the data will not be changed before the end of the transaction.
3. Repeatable read
When this level is used, dirty reads and nonrepeatable reads cannot occur: locks are placed on all data that is used in a query, and other transactions cannot update that data.
4. Serializable
This is the most restrictive isolation level. When it is used, phantom reads cannot occur. It prevents other users from updating or inserting rows into the data set until the transaction is complete.
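The difference between the first two levels can be illustrated with a toy in-memory buffer. This is a simulation of the semantics, not real SQL Server behaviour; the variable names are illustrative:

```python
# Toy illustration of a dirty read: under READ UNCOMMITTED-like behaviour a
# reader sees T1's uncommitted write; if T1 rolls back, the reader used bad data.
committed = {"balance": 100}
uncommitted = {}          # T1's in-flight write, not yet committed

uncommitted["balance"] = 150                      # T1: write(balance)
dirty = {**committed, **uncommitted}["balance"]   # T2: dirty read sees 150
uncommitted.clear()                               # T1: rollback
clean = committed["balance"]                      # READ COMMITTED reader sees 100

print(dirty, clean)  # 150 100
```

The dirty reader acted on a value (150) that never became part of the database state, which is exactly the anomaly READ COMMITTED and stronger levels prevent.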
Summary
A transaction is a unit of program execution that accesses and possibly updates various data items.
Understanding the concept of a transaction is critical for understanding and implementing updates of
data in a database, in such a way that concurrent executions and failures of various forms do not
result in the database becoming inconsistent.
Transactions are required to have the ACID properties: atomicity, consistency, isolation, and
durability.
Atomicity ensures that either all the effects of a transaction are reflected in the database, or none are;
a failure cannot leave the database in a state where a transaction is partially executed.
Consistency ensures that, if the database is initially consistent, the execution of the transaction (by
itself) leaves the database in a consistent state. Isolation ensures that concurrently executing
transactions are isolated from one another, so that each has the impression that no other transaction is
executing concurrently with it. Durability ensures that, once a transaction has been committed, that
transaction’s updates do not get lost, even if there is a system failure.
Concurrent execution of transactions improves throughput of transactions and system utilization, and
also reduces waiting time of transactions. When several transactions execute concurrently in the
database, the consistency of data may no longer be preserved. It is therefore necessary for the system
to control the interaction among the concurrent transactions. Since a transaction is a unit that
preserves consistency, a serial execution of transactions guarantees that consistency is preserved.
A schedule captures the key actions of transactions that affect concurrent execution, such as read and
write operations, while abstracting away internal details of the execution of the transaction. We
require that any schedule produced by concurrent processing of a set of transactions will have an
effect equivalent to a schedule produced when these transactions are run serially in some order.
A system that guarantees this property is said to ensure serializability. There are several different
notions of equivalence leading to the concepts of conflict serializability and view serializability.
Serializability of schedules generated by concurrently executing transactions can be ensured through one of a variety of mechanisms called concurrency-control schemes.
Schedules must be recoverable, to make sure that if transaction a sees the effects of transaction b, and
b then aborts, then a also gets aborted. Schedules should preferably be cascadeless, so that the abort
of a transaction does not result in cascading aborts of other transactions. Cascadelessness is ensured
by allowing transactions to only read committed data.
Key Terms>>Transaction >>ACID properties >>Atomicity
>>Isolation >>Durability >>Concurrent executions
>>Serial execution >>Schedules >>Conflict of operations
>>Conflict equivalence >>Consistency >>Conflict serializability
>>View equivalence >>View serializability
Key Term Quiz
1. A --------------- is a unit of program execution that accesses and possibly updates various data items.
2. --------------- ensures that either all the effects of a transaction are reflected in the database, or none
are.
3. --------------- ensures that after a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.
4. --------------- is the ability of the DBMS to process more than one transaction at a time.
Objective Type Questions
1. A collection of actions that preserves consistency is
a. Concurrency b. Transaction c. Recovery d. None of the above
2. A list of actions from a set of transactions is
a. Recovery b. Schedule c. Concurrency d. None of the above
3. Which of the following captures potential conflicts between transactions in a schedule?
a. Precedence Graph b. Serializability Graph
c. Only a d. Only b e. Both a and b
4. The lock manager keeps track of lock and unlock requests.
a. True b. False
5. Which of the following implements the failure-tolerance of transactions ?
a. Recovery b. Schedule c. Concurrency d. None of the above
6. A schedule is conflict serializable if and only if its precedence graph is cyclic
a. True b. False
7. In ACID property, 'C' stands for
a. Crash Recovery b. Concurrency c. Consistency d. Commit
8. One way of specifying the transaction boundaries is by specifying which type of statements given
below.
a. explicit begin transaction and end transaction
b. explicit before transaction and after transaction
c. explicit start transaction and stop transaction
d. explicit start transaction and end transaction
9. A schedule that does not interleave the actions of different transactions is a
a. Serializable Schedule b. Equivalent Schedule
c. Serial Schedule d. None of the above
10. Indexed sequential files are composed of an unordered set of sequential records.
a. True b. False
Review Questions
Two Marks Questions
1. Define transaction.
2. What are the properties of a transaction?
3. What is a serial schedule?
4. Define serializability.
5. What is conflict serializability?
6. What is view serializability?
7. How will you test conflict serializability?
8. What is a timestamp?
9. What timestamp values are associated with a data item?
10. What are the primitives of a transaction?
11. What are the two types of lock modes?
12. List out the concurrency control techniques?
13. What are the different types of failure?
14. What is redo and undo operation?
15. What is the difference between immediate and deferred update recovery schemes?
16. What is rollback?
Big Questions
1. Explain the four important properties of a transaction that a DBMS must ensure to preserve the integrity of data.
2. (i) What is concurrency control? How is it implemented in a DBMS? Explain.
(ii) Explain various recovery techniques used during transactions in detail.
3. (i) Explain the different forms of serializability.
(ii) What are the different types of schedules that are acceptable for recoverability?
4. (i) Explain the two-phase locking protocol and the timestamp-based protocol.
(ii) Write short notes on log-based recovery.
------------- END OF FOURTH UNIT ----------------