
DATABASE MANAGEMENT SYSTEMS

UNIT IV- TRANSACTIONS

Objectives

To know the definition of transaction

To learn ACID properties

To explain various locking protocols

To define deadlock

To explain serializability

4.1 Transaction Concepts

A transaction is a unit of program execution that accesses and possibly updates various data items.

• A transaction must see a consistent database.

• During transaction execution the database may be temporarily inconsistent.

• When the transaction completes successfully (is committed), the database must be consistent.

• After a transaction commits, the changes it has made to the database persist, even if there are system

failures.

• Multiple transactions can execute in parallel.

• Two main issues to deal with:

– Failures of various kinds, such as hardware failures and system crashes

– Concurrent execution of multiple transactions

4.2 ACID Properties

To preserve the integrity of data the database system must ensure:

• Atomicity. Either all operations of the transaction are properly reflected in the database or none are.

• Consistency. Execution of a transaction in isolation preserves the consistency of the database.

• Isolation. Although multiple transactions may execute concurrently, each transaction must be unaware of

other concurrently executing transactions. Intermediate transaction results must be hidden from other

concurrently executed transactions.

– That is, for every pair of transactions Ti and Tj, it appears to Ti that either Tj finished execution before Ti

started, or Tj started execution after Ti finished.

• Durability. After a transaction completes successfully, the changes it has made to the database persist,

even if there are system failures.

Example of Fund Transfer

• Transaction to transfer $50 from account A to account B:

1. read(A)

2. A := A – 50

3. write(A)

4. read(B)

5. B := B + 50

6. write(B)

• Atomicity requirement — if the transaction fails after step 3 and before step 6, the system should ensure

that its updates are not reflected in the database, else an inconsistency will result.

• Consistency requirement – the sum of A and B is unchanged by the execution of the transaction.

• Isolation requirement — if between steps 3 and 6, another transaction is allowed to access the partially

updated database, it will see an inconsistent database (the sum A + B will be less than it should be).

• Isolation can be ensured trivially by running transactions serially, that is one after the other.

• However, executing multiple transactions concurrently has significant benefits, as we will see later.

• Durability requirement — once the user has been notified that the transaction has completed (i.e., the

transfer of the $50 has taken place), the updates to the database by the transaction must persist despite

failures.
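The four requirements in the transfer example can be sketched with a SQLite transaction: all writes commit together or not at all. The accounts table, starting balances, and the transfer helper are illustrative assumptions, not part of the original example.

```python
import sqlite3

# Minimal sketch of the $50 transfer; table name and balances are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('A', 100), ('B', 200)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Run steps 1-6 as one atomic unit: commit all writes or none."""
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))   # steps 1-3: read(A), A := A - 50, write(A)
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))   # steps 4-6: read(B), B := B + 50, write(B)
        conn.commit()                 # durability: the updates persist
    except Exception:
        conn.rollback()               # atomicity: undo any partial update
        raise

transfer(conn, 'A', 'B', 50)
total = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]
print(total)   # consistency: the sum A + B is unchanged (300)
```

If the second UPDATE raised an error, the rollback would discard the first one as well, which is exactly the atomicity requirement between steps 3 and 6.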

4.3 System Recovery

An Overview

After some failure, we must be able to restore the database to a state that is known to be correct.

A failure may be caused by

Bugs in application programs, operating system, database system, ...

Hardware errors on devices, channels, CPU, or memory

Operator errors

External causes: power failure, fire, high temperature, lightning, ...

To ensure that the database can be recovered after a failure, the following actions are required during

normal operation:

Database backup

Periodically the entire database is copied to archival storage. This copy should be stored in

a safe place.

Database journalizing (logging)

Every time a change is made to the database, a record containing the new value (after-

image, redo part) and, possibly, also the old value (before-image, undo part) is written to a

special file called the journal. The journal is often kept in duplicate.

Checkpointing is an operation that synchronizes the journal and the database, in the simplest case by

suspending all processing, and performing all the pending writing.

Types of failure and recovery action required:

Transaction-local failure: only one transaction affected, database not damaged. Perform a

ROLLBACK, i.e. undo whatever changes the transaction has made to the database.

ROLLBACK issued either by the transaction or by the system.

System-wide failure, database not damaged.

All transactions in progress affected. Undo the changes made by any transaction in

progress at the time of failure. Redo every committed transaction for which it is not

known whether all its changes have physically been written to the database. Possibly,

restart the transactions that were rolled back.

System-wide failure, database damaged

Restore the database from the latest back-up copy and redo all committed transactions.

This may be a very slow process.

Incremental journal with immediate updates

Log structure:

Begin transaction record: transaction number, (input message)

Sequence of change records: transaction number, page address, old value, new value

End transaction record: transaction number, commit/rollback

Before a modified page is physically written to the database, the journal record (at least the undo

part) must be physically written to the journal (not just placed in the journal buffer). Before the buffer

manager writes a modified page to the database it must make sure that all journal records related to

that page have been written to the journal. The simplest solution is to use the journal buffer. Before

the transaction commits, all its journal records and the end-transaction record must be physically

written to the journal. Again the simplest solution is to use the log buffer.

Transaction UNDO

Scan backwards through the journal

For every record related to the transaction to be undone: undo the change by rewriting the old

value (before-image) from the change record until reaching the begin-transaction record.
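The backward scan above can be sketched over an in-memory journal. The record layout (tid, page, old value, new value) follows the change-record structure described earlier; the concrete tuples are illustrative assumptions.

```python
# Transaction UNDO: scan the journal backwards, rewriting before-images
# for the given transaction until its begin-transaction record is reached.
def undo(journal, tid, database):
    for record in reversed(journal):
        if record[0] != tid:
            continue                    # record belongs to another transaction
        if record[1] == "BEGIN":
            break                       # begin-transaction record: stop
        _, page, old, _new = record     # change record: page, old, new
        database[page] = old            # rewrite the before-image

db = {"p1": 10, "p2": 20}
journal = [
    ("T1", "BEGIN"),
    ("T1", "p1", 10, 15),
    ("T1", "p2", 20, 25),
]
db["p1"], db["p2"] = 15, 25             # updates were applied immediately
undo(journal, "T1", db)
print(db)                               # both pages restored to before-images
```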

Cascading rollbacks

When rollback of a transaction forces rollback of another transaction.

If some other transaction has been allowed to read or write a page modified by the transaction

being rolled back, then that transaction must also be rolled back.

May occur as soon as transactions are allowed to see uncommitted changes (dirty reads).

Note that this may force rollback of a transaction that has already committed. NOT

ACCEPTABLE.

To avoid cascading rollback, before a transaction commits, any data written by it are not

allowed to be read by other transactions.

Checkpointing

A checkpoint synchronizes the log and the contents of the database by making sure that all

writes have been physically performed.

Taking a checkpoint consist of the following steps:

1. Suspend transaction processing

2. Physically write out all log buffers

3. Physically write out all modified pages in the page buffer

4. Write a checkpoint record to the log: a list of active transactions and, possibly, pointers

to their most recent log records

5. Record the address of the checkpoint record in a "restart file"

6. Resume transaction processing

During recovery, only the following transactions need to be considered: transactions that started after

the most recent checkpoint, and transactions that were active at the time of the most recent

checkpoint.

Restart procedure

Scan the log forwards from the most recent checkpoint, identifying the type of each transaction

recorded

T1: started before the checkpoint and had an end-transaction record

T2: started after the checkpoint and had an end-transaction record

T3: started before the checkpoint and had no end-transaction record

T4: started after the checkpoint and had no end-transaction record

Scan backwards undoing all transactions of type T3 and T4.

Scan forwards from the most recent checkpoint redoing (rewriting the new values) all committed

transactions of type T1 and T2. Ignore all transactions that were rolled back.

Resume transaction processing, possibly restarting transactions of type T3 and T4.
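The T1-T4 classification can be sketched as a small function: given the transactions active at the most recent checkpoint and the log records written since it, it decides what to undo and what to redo. The record format ("begin"/"commit", tid) is an illustrative assumption, and rolled-back transactions are omitted for brevity.

```python
# Restart classification: T3/T4 (no end-transaction record) are undone,
# committed T1/T2 are redone.
def classify(checkpoint_active, since_checkpoint):
    started_after = {tid for act, tid in since_checkpoint if act == "begin"}
    committed = {tid for act, tid in since_checkpoint if act == "commit"}
    candidates = checkpoint_active | started_after   # all to be considered
    return candidates - committed, candidates & committed

# T1, T3 were active at the checkpoint; T2, T4 started after it;
# T1 and T2 committed before the failure.
since = [("begin", "T2"), ("commit", "T1"), ("commit", "T2"), ("begin", "T4")]
undo, redo = classify({"T1", "T3"}, since)
print(sorted(undo), sorted(redo))   # ['T3', 'T4'] ['T1', 'T2']
```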

Incremental log with deferred updates

All writes to the database are deferred until the transaction commits.

Transaction rollback is now trivial: omit the database writes and write an end-transaction record

indicating rollback to the log.

Begin-transaction records, end-transaction records, and change records, but omitting the old value,

are written to the log as before.

The end-transaction record can be written as soon as all change records have been forced out and

before the database writes have been performed.

During restart no transactions need to be undone. Committed transactions must still be redone

because the changes may not have been physically written at the time of failure.

4.4 Two Phase Commit

Two Phase Commit is the process by which a relational database ensures that distributed transactions are

performed in an orderly manner. In this system, transactions may be terminated by either committing them or

rolling them back.

A commit operation is, by definition, an all-or-nothing affair. If a series of operations bound as a transaction

cannot be completed, the rollback must restore the system (or cooperating systems) to the pre-transaction

state. In order to ensure that a transaction can be rolled back, a software system typically logs each operation,

including the commit operation itself. A transaction/recovery manager uses the log records to undo (and

possibly redo) a partially completed transaction. When a transaction involves multiple distributed resources,

for example, a database server on each of two different network hosts, the commit process is somewhat

complex because the transaction includes operations that span two distinct software systems, each with its

own resource manager, log records, and so on. (In this case, the distributed resources are the database

servers.)

Two-phase commit is a transaction protocol designed for the complications that arise with distributed

resource managers. With a two-phase commit protocol, the distributed transaction manager employs a

coordinator to manage the individual resource managers.

The commit process proceeds as follows:

Phase 1

o Each participating resource manager coordinates local operations and forces all log records

out:

o If successful, respond "OK"

o If unsuccessful, either allow a time-out or respond "OOPS"

Phase 2

o If all participants respond "OK":

Coordinator instructs participating resource managers to "COMMIT"

Participants complete operation writing the log record for the commit

o Otherwise:

Coordinator instructs participating resource managers to "ROLLBACK"

Participants complete their respective local undos

In order for the scheme to work reliably, both the coordinator and the participating resource managers

independently must be able to guarantee proper completion, including any necessary restart/redo operations.

The algorithms for guaranteeing success by handling failures at any stage are provided in advanced database

texts.
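The two phases can be sketched with a coordinator polling participants. The Participant class and its prepare/commit/rollback methods are hypothetical stand-ins for "force all log records out" and the local COMMIT/ROLLBACK actions described above; real failure handling and restart logic are omitted.

```python
# Minimal two-phase commit sketch: phase 1 collects votes, phase 2 decides.
class Participant:
    def __init__(self, name, healthy=True):
        self.name, self.healthy, self.state = name, healthy, "active"

    def prepare(self):
        # Phase 1: coordinate local operations and force log records out.
        return "OK" if self.healthy else "OOPS"

    def commit(self):
        self.state = "committed"      # write the log record for the commit

    def rollback(self):
        self.state = "rolled back"    # perform the local undo

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]       # phase 1
    if all(v == "OK" for v in votes):                 # phase 2
        for p in participants:
            p.commit()
        return "COMMIT"
    for p in participants:
        p.rollback()
    return "ROLLBACK"

ok = two_phase_commit([Participant("db1"), Participant("db2")])
bad = two_phase_commit([Participant("db1"), Participant("db2", healthy=False)])
print(ok, bad)   # COMMIT ROLLBACK
```

A single "OOPS" (or time-out) in phase 1 forces every participant to roll back, which is the all-or-nothing property the protocol exists to guarantee.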

4.5 Save Points

The SAVEPOINT statement sets a named transaction savepoint with a name of identifier. If the current

transaction has a savepoint with the same name, the old savepoint is deleted and a new one is set. The

ROLLBACK TO SAVEPOINT statement rolls back a transaction to the named savepoint without

terminating the transaction. (The SAVEPOINT keyword is optional as of MySQL 5.0.3.) Modifications that

the current transaction made to rows after the savepoint was set are undone in the rollback, but InnoDB does

not release the row locks that were stored in memory after the savepoint. (For a newly inserted row, the lock

information is carried by the transaction ID stored in the row; the lock is not separately stored in memory. In

this case, the row lock is released in the undo.) Savepoints that were set at a later time than the named

savepoint are deleted.

If the ROLLBACK TO SAVEPOINT statement returns the following error, it means that no savepoint with

the specified name exists: The RELEASE SAVEPOINT statement removes the named savepoint from the set

of savepoints of the current transaction. No commit or rollback occurs. It is an error if the savepoint does not

exist. All savepoints of the current transaction are deleted if you execute a COMMIT, or a ROLLBACK that

does not name a savepoint.

Savepoints are useful in situations where errors are unlikely to occur. The use of a savepoint to roll back part

of a transaction in the case of an infrequent error can be more efficient than having each transaction test to

see if an update is valid before making the update. Updates and rollbacks are expensive operations, so

savepoints are effective only if the probability of encountering the error is low and the cost of checking the

validity of an update beforehand is relatively high.
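The SAVEPOINT, ROLLBACK TO SAVEPOINT, and RELEASE SAVEPOINT statements can be exercised with SQLite, which supports the same syntax; the table and values are illustrative assumptions.

```python
import sqlite3

# Savepoint demo: only the work done after the savepoint is undone.
conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
conn.execute("CREATE TABLE t (x INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SAVEPOINT sp1")                 # set a named savepoint
conn.execute("INSERT INTO t VALUES (2)")
conn.execute("ROLLBACK TO SAVEPOINT sp1")     # undo updates made since sp1
conn.execute("RELEASE SAVEPOINT sp1")         # drop sp1; no commit occurs
conn.execute("COMMIT")

rows = [r[0] for r in conn.execute("SELECT x FROM t")]
print(rows)   # the insert made before the savepoint survives
```

The transaction itself is not terminated by ROLLBACK TO SAVEPOINT: the surrounding BEGIN ... COMMIT still decides the fate of the work done before the savepoint.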

4.6 SQL Facilities for recovery

SQL provides three analogs of BEGIN TRANSACTION, COMMIT, and ROLLBACK, called START

TRANSACTION, COMMIT WORK, and ROLLBACK WORK, respectively. Here is the syntax of START

TRANSACTION:

START TRANSACTION <option commalist>;

The <option commalist> specifies an access mode and an isolation level.

The access mode is either READ ONLY or READ WRITE. If neither is specified, READ

WRITE is assumed, unless the READ UNCOMMITTED isolation level is specified, in which case

READ ONLY is assumed. If READ WRITE is specified, the isolation level must not be

READ UNCOMMITTED.

The isolation level takes the form ISOLATION LEVEL <isolation>, where <isolation> is

READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, or

SERIALIZABLE.

SQL also supports savepoints. The statement

SAVEPOINT <savepoint name>;

creates a savepoint with the specified user chosen name. The statement

ROLLBACK TO <savepoint name>;

undoes all updates made since the specified savepoint, and the statement

RELEASE <savepoint name>;

drops the specified savepoint, meaning it is no longer possible to execute a ROLLBACK to that savepoint.

All savepoints are automatically dropped at transaction termination.

4.7 Concurrency

Concurrency is the ability of the DBMS to process more than one transaction at a time. This section

overviews briefly several problems that can occur when concurrent transactions execute in an uncontrolled

manner. Concrete examples are given to illustrate the problems in detail. The related activities and learning

tasks that follow give you a chance to evaluate the extent of your understanding of the problems. An

important learning objective for this section of the unit is to understand the different types of problems of

concurrent executions in OLTP and appreciate the need for concurrency control.

4.8 Need for Concurrency

If transactions are executed serially, i.e., sequentially with no overlap in time, no transaction concurrency

exists. However, if concurrent transactions with interleaving operations are allowed in an uncontrolled

manner, some unexpected result may occur. Here are some typical examples:

1. The lost update problem: A second transaction writes a second value of a data-item (datum) on top of

a first value written by a first concurrent transaction, and the first value is lost to other transactions

running concurrently which need, by their precedence, to read the first value. The transactions that

have read the wrong value end with incorrect results.

2. The dirty read problem: Transactions read a value written by a transaction that has been later aborted.

This value disappears from the database upon abort, and should not have been read by any transaction

("dirty read"). The reading transactions end with incorrect results.

3. The incorrect summary problem: While one transaction takes a summary over values of a repeated

data-item, a second transaction updated some instances of that data-item. The resulting summary does

not reflect a correct result for any (usually needed for correctness) precedence order between the two

transactions (if one is executed before the other), but rather some random result, depending on the

timing of the updates, and whether a certain update result has been included in the summary or not.

We illustrate some of the problems by referring to a simple airline reservation database in which a record

is stored for each airline flight. Each record includes the number of reserved seats on that flight as a named

data item, among other information. Recall the two transactions T1 and T2 introduced previously.

Transaction T1 cancels N reservations from one flight whose number of reserved seats is stored in the

database item named X, and reserves the same number of seats on another flight whose number of reserved

seats is stored in the database item named Y. A simpler transaction T2 just reserves M seats on the first flight

referenced in transaction T1. To simplify the example, the additional portions of the transactions are not

shown, such as checking whether a flight has enough seats available before reserving additional seats.

When an airline reservation database program is written, it has the flight numbers, their dates, and the

number of seats available for booking as parameters; hence, the same program can be used to execute many

transactions, each with different flights and numbers of seats to be booked. For concurrency control purposes,

a transaction is a particular execution of a program on a specific date, flight, and number of seats. The

transactions T1 and T2 are specific executions of the programs that refer to the specific flights whose

numbers of seats are stored in data items X and Y in the database. Now let’s discuss the types of problems we

may encounter with these two transactions.

The Lost Update Problem

The lost update problem occurs when two transactions that access the same database items have their operations

interleaved in a way that makes the value of some database item incorrect. That is, interleaved use of the same

data item would cause some problems when an update operation from one transaction overwrites another

update from a second transaction.

An example will explain the problem clearly. Suppose the two transactions T1 and T2 introduced previously

are submitted at approximately the same time. This is possible when two travel agency staff members help customers

book their flights at more or less the same time from the same or different offices. Suppose that their operations

are interleaved by the operating system as shown in the figure below

The above interleaved operation will lead to an incorrect value for data item X, because at time step 3, T2

reads in the original value of X which is before T1 changes it in the database, and hence the updated value

resulting from T1 is lost. For example, if X = 80, originally there were 80 reservations on the flight, N = 5,

T1 cancels 5 seats on the flight corresponding to X and reserves them on the flight corresponding to Y, and

M = 4, T2 reserves 4 seats on X. The final result should be X = 80 – 5 + 4 = 79; but in the concurrent

operations of figure 9.5, it is X = 84 because the update that cancelled 5 seats in T1 was lost. The detailed

value updating in the flight reservation database taking the above example is shown below .
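The arithmetic above can be replayed in a few lines; the variable names mirror X, N, and M from the example.

```python
# Replaying the lost-update schedule with X = 80, N = 5, M = 4.
X, N, M = 80, 5, 4

# Interleaved schedule: T2 reads X before T1 writes its update back.
t1_local = X              # T1: read(X)
t2_local = X              # T2: read(X), still the original value
t1_local -= N             # T1: X := X - N
X = t1_local              # T1: write(X)
t2_local += M             # T2: X := X + M, based on the stale read
X = t2_local              # T2: write(X) overwrites T1's update
lost_result = X
print(lost_result)        # 84: the cancellation of 5 seats is lost

# Serial schedule (T1 then T2) gives the correct value.
serial_result = 80 - N + M
print(serial_result)      # 79
```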

Uncommitted Dependency (or Dirty Read / Temporary Update)

Uncommitted dependency occurs when a transaction is allowed to retrieve or (worse) update a record that

has been updated by another transaction but has not yet been committed by that other transaction. Because it

has not yet been committed, there is always a possibility that it will never be committed but it will be rolled

back instead, in which case, the first transaction will have used some data that is now incorrect, a dirty read

for the first transaction.

The figure below shows an example where T1 updates item X and then fails before completion, so the

system must change X back to its original value. Before it can do so, however, transaction T2 reads the

“temporary” value of X, which will not be recorded permanently in the database because of the failure of T1.

The value of item X that is read by T2 is called dirty data, because it has been created by a transaction that

has not completed and committed yet; hence this problem is also known as the dirty read problem. Since the

dirty data read in by T2 is only a temporary value of X, the problem is sometimes called temporary update

too.

The rollback of transaction T1 may be due to a system crash, and transaction T2 may already have terminated

by that time, in which case the crash would not cause a ROLLBACK to be issued for T2. The following

situation is even more unacceptable.

In the above example, not only does transaction T2 become dependent on an uncommitted change at time

step 6 but it also loses an update at time step 7, because the ROLLBACK in T1 causes data item X to be

restored to its value before time step 1.

Inconsistent Analysis

Inconsistent analysis occurs when a transaction reads several values while a second transaction updates

some of these values during the execution of the first. This problem is significant: for example, if one

transaction is calculating an aggregate summary function on a number of records while other transactions are

updating some of these records, the aggregate function may calculate some values before they are updated

and others after they are updated. This causes an inconsistency.

For example, suppose that a transaction T3 is calculating the total number of reservations on all the flights;

meanwhile, transaction T1 is executing. If the interleaving of operations shown below occurs, the result of

T3 will be off by amount of N, because T3 reads the value of X after N seats have been subtracted from it

but reads the value of Y before those N seats have been added to it.
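The schedule described above can be replayed directly: T3 sums X and Y while T1 moves N seats from X to Y. The starting values are illustrative assumptions.

```python
# Inconsistent analysis: T3's summary lands between T1's two writes.
X, Y, N = 80, 60, 5

total = 0
X = X - N            # T1: cancel N seats on flight X and write it back
total += X           # T3: reads X after the subtraction
total += Y           # T3: reads Y before T1 adds the seats back
Y = Y + N            # T1: add N seats to flight Y, too late for T3's sum

print(total)         # 135: off by N from the true total of 140
```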

4.9 Locking Protocols

One of the fundamental properties of a transaction is isolation. When several transactions execute concurrently in

the database, however, the isolation property may no longer be preserved. To ensure that it is, the system

must control the interaction among the concurrent transactions; this control is achieved through one of a

variety of mechanisms called concurrency-control schemes. One way to ensure serializability is to require

that data items be accessed in a mutually exclusive manner; that is, while one transaction is accessing a data

item, no other transaction can modify that data item. The most common method used to implement this

requirement is to allow a transaction to access a data item only if it is currently holding a lock on that item.

Locks

There are various modes in which a data item may be locked. In this section, we restrict our attention to two

modes:

1. Shared. If a transaction Ti has obtained a shared-mode lock (denoted by S) on item Q, then Ti can read,

but cannot write, Q.

2. Exclusive. If a transaction Ti has obtained an exclusive-mode lock (denoted by X) on item Q, then Ti can

both read and write Q.

Granting of Locks

When a transaction requests a lock on a data item in a particular mode, and no other transaction has a lock on

the same data item in a conflicting mode, the lock can be granted. However, care must be taken to avoid the

following scenario. Suppose a transaction T2 has a shared-mode lock on a data item, and another transaction

T1 requests an exclusive-mode lock on the data item. Clearly, T1 has to wait for T2 to release the shared-

mode lock. Meanwhile, a transaction T3 may request a shared-mode lock on the same data item. The lock

request is compatible with the lock granted to T2, so T3 may be granted the shared-mode lock. At this point

T2 may release the lock, but still T1 has to wait for T3 to finish. But again, there may be a new transaction T4

that requests a shared-mode lock on the same data item, and is granted the lock before T3 releases it. In fact,

it is possible that there is a sequence of transactions that each requests a shared-mode lock on the data item,

and each transaction releases the lock a short while after it is granted, but T1 never gets the exclusive-mode

lock on the data item. The transaction T1 may never make progress, and is said to be starved. We can avoid

starvation of transactions by granting locks in the following manner: When a transaction Ti requests a lock

on a data item Q in a particular mode M, the concurrency-control manager grants the lock provided that

1. There is no other transaction holding a lock on Q in a mode that conflicts with M.

2. There is no other transaction that is waiting for a lock on Q, and that made its lock request before Ti.

4.10 Two Phase Locking

One protocol that ensures serializability is the two-phase locking protocol. This protocol requires that each

transaction issue lock and unlock requests in two phases:

1. Growing phase. A transaction may obtain locks, but may not release any lock.

2. Shrinking phase. A transaction may release locks, but may not obtain any new locks.

Initially, a transaction is in the growing phase. The transaction acquires locks as needed. Once the transaction

releases a lock, it enters the shrinking phase, and it can issue no more lock requests. For example,

transactions T3 and T4 are two phase. On the other hand, transactions T1 and T2 are not two phase. Note that

the unlock instructions do not need to appear at the end of the transaction. For example, in the case of

transaction T3, we could move the unlock(B) instruction to just after the lock-X(A) instruction, and still

retain the two-phase locking property. Cascading rollbacks can be avoided by a modification of two-phase

locking called the strict two-phase locking protocol. This protocol requires not only that locking be two

phase, but also that all exclusive-mode locks taken by a transaction be held until that transaction commits.

This requirement ensures that any data written by an uncommitted transaction are locked in exclusive mode

until the transaction commits, preventing any other transaction from reading the data. Another variant of

two-phase locking is the rigorous two-phase locking protocol, which requires that all locks be held until

the transaction commits.

Strict two-phase locking and rigorous two-phase locking (with lock conversions) are used extensively in

commercial database systems. A simple but widely used scheme automatically generates the appropriate lock

and unlock instructions for a transaction, on the basis of read and write requests from the transaction:

• When a transaction Ti issues a read(Q) operation, the system issues a lock- S(Q) instruction followed by the

read(Q) instruction.

• When Ti issues a write(Q) operation, the system checks to see whether Ti already holds a shared lock on Q.

If it does, then the system issues an upgrade( Q) instruction, followed by the write(Q) instruction. Otherwise,

the system issues a lock-X(Q) instruction, followed by the write(Q) instruction.

• All locks obtained by a transaction are unlocked after that transaction commits or aborts.
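The two-phase property of a lock/unlock sequence can be checked mechanically: once the first unlock appears, no further lock request is allowed. This is a sketch; the operation format is an illustrative assumption, and the schedules mirror the T3 (two phase) and T1 (not two phase) examples mentioned above.

```python
# Check whether one transaction's lock/unlock sequence obeys two-phase
# locking: no lock request may follow the first unlock.
def is_two_phase(ops):
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True         # the shrinking phase has begun
        elif shrinking:
            return False             # a lock request after an unlock
    return True

t3 = [("lock", "B"), ("lock", "A"), ("unlock", "B"), ("unlock", "A")]
t1 = [("lock", "B"), ("unlock", "B"), ("lock", "A"), ("unlock", "A")]
print(is_two_phase(t3), is_two_phase(t1))   # True False
```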

Implementation of Locking

A Lock manager can be implemented as a separate process to which transactions send lock and unlock

requests

The lock manager replies to a lock request by sending a lock grant message (or a message asking the

transaction to roll back, in case of a deadlock)

The requesting transaction waits until its request is answered

The lock manager maintains a data structure called a lock table to record granted locks and pending

requests

The lock table is usually implemented as an in-memory hash table indexed on the name of the data item

being locked

Lock Table

Black rectangles indicate granted locks, white ones indicate waiting requests

Lock table also records the type of lock granted or requested

New request is added to the end of the queue of requests for the data item, and granted if it is compatible

with all earlier locks

Unlock requests result in the request being deleted, and later requests are checked to see if they can now

be granted. If a transaction aborts, all waiting or granted requests of the transaction are deleted; to implement this

efficiently, the lock manager may keep a list of locks held by each transaction.
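A lock-table entry and the starvation-free granting rule from the earlier section can be sketched together: each data item keeps a FIFO queue of requests, and a new request is granted only if it is compatible with every earlier request in the queue, granted or waiting. The class and queue layout are illustrative assumptions.

```python
# S/X compatibility: only shared locks are mutually compatible.
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class LockTable:
    def __init__(self):
        self.queues = {}                       # data item -> request queue

    def request(self, tid, item, mode):
        queue = self.queues.setdefault(item, [])
        # Grant only if compatible with every earlier request, so a
        # waiting request blocks later ones and cannot be starved.
        granted = all(COMPATIBLE[(m, mode)] for _tid, m, _g in queue)
        queue.append((tid, mode, granted))
        return granted

lt = LockTable()
g1 = lt.request("T2", "Q", "S")   # granted: the queue is empty
g2 = lt.request("T1", "Q", "X")   # waits: conflicts with T2's S lock
g3 = lt.request("T3", "Q", "S")   # waits behind T1, so T1 is not starved
print(g1, g2, g3)                 # True False False
```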

Timestamp-Based Protocols

Each transaction is issued a timestamp when it enters the system. If an old transaction Ti has time-stamp

TS(Ti), a new transaction Tj is assigned time-stamp TS(Tj) such that TS(Ti) <TS(Tj).

The protocol manages concurrent execution such that the time-stamps determine the serializability order.

In order to assure such behavior, the protocol maintains for each data Q two timestamp values:

W-timestamp(Q) is the largest time-stamp of any transaction that executed write(Q) successfully.

R-timestamp(Q) is the largest time-stamp of any transaction that executed read(Q) successfully.

The timestamp ordering protocol ensures that any conflicting read and write operations are executed in

timestamp order.

Suppose a transaction Ti issues a read(Q)

1. If TS(Ti) < W-timestamp(Q), then Ti needs to read a value of Q that was already overwritten. Hence, the

read operation is rejected, and Ti is rolled back.

2. If TS(Ti) ≥ W-timestamp(Q), then the read operation is executed, and R-timestamp(Q) is set to the

maximum of R-timestamp(Q) and TS(Ti).

Suppose that transaction Ti issues write(Q).

If TS(Ti) < R-timestamp(Q), then the value of Q that Ti is producing was needed previously, and the

system assumed that that value would never be produced. Hence, the write operation is rejected, and Ti is

rolled back.

If TS(Ti) < W-timestamp(Q), then Ti is attempting to write an obsolete value of Q. Hence, this write

operation is rejected, and Ti is rolled back.

Otherwise, the write operation is executed, and W-timestamp(Q) is set to TS(Ti).
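The read and write rules above can be sketched as two small functions over a data item carrying its R-timestamp and W-timestamp. The class and function names are illustrative assumptions; restarting a rolled-back transaction with a new timestamp is omitted.

```python
# Timestamp-ordering sketch: out-of-order conflicting operations roll back.
class Item:
    def __init__(self):
        self.r_ts = 0    # largest timestamp of a successful read(Q)
        self.w_ts = 0    # largest timestamp of a successful write(Q)

def read(q, ts):
    if ts < q.w_ts:                  # Q was already overwritten
        return "rollback"
    q.r_ts = max(q.r_ts, ts)
    return "ok"

def write(q, ts):
    if ts < q.r_ts or ts < q.w_ts:   # a later transaction read or wrote Q
        return "rollback"
    q.w_ts = ts
    return "ok"

q = Item()
results = [write(q, 5), read(q, 3), read(q, 7), write(q, 6)]
print(results)   # ['ok', 'rollback', 'ok', 'rollback']
```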

4.11 Intent Locks

The Database Engine uses intent locks to protect placing a shared (S) lock or exclusive (X) lock on a

resource lower in the lock hierarchy. Intent locks are named intent locks because they are acquired before a

lock at the lower level, and therefore signal intent to place locks at a lower level.

Intent locks serve two purposes:

To prevent other transactions from modifying the higher-level resource in a way that would

invalidate the lock at the lower level.

To improve the efficiency of the Database Engine in detecting lock conflicts at the higher level of

granularity.

For example, a shared intent lock is requested at the table level before shared (S) locks are requested on

pages or rows within that table. Setting an intent lock at the table level prevents another transaction from

subsequently acquiring an exclusive (X) lock on the table containing that page. Intent locks improve

performance because the Database Engine examines intent locks only at the table level to determine if a

transaction can safely acquire a lock on that table. This removes the requirement to examine every row or

page lock on the table to determine if a transaction can lock the entire table.

Intent locks include intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX).

Lock mode: Description

Intent shared (IS): Protects requested or acquired shared locks on some (but not all) resources lower in the hierarchy.

Intent exclusive (IX): Protects requested or acquired exclusive locks on some (but not all) resources lower in the hierarchy. IX is a superset of IS, and it also protects requesting shared locks on lower-level resources.

Shared with intent exclusive (SIX): Protects requested or acquired shared locks on all resources lower in the hierarchy and intent exclusive locks on some (but not all) of the lower-level resources. Concurrent IS locks at the top-level resource are allowed. For example, acquiring a SIX lock on a table also acquires intent exclusive locks on the pages being modified and exclusive locks on the modified rows. There can be only one SIX lock per resource at one time, preventing updates to the resource by other transactions, although other transactions can read resources lower in the hierarchy by obtaining IS locks at the table level.

Intent update (IU): Protects requested or acquired update locks on all resources lower in the hierarchy. IU locks are used only on page resources. IU locks are converted to IX locks if an update operation takes place.

Shared intent update (SIU): A combination of S and IU locks, as a result of acquiring these locks separately and simultaneously holding both locks. For example, a transaction executes a query with the PAGLOCK hint and then executes an update operation. The query with the PAGLOCK hint acquires the S lock, and the update operation acquires the IU lock.

Update intent exclusive (UIX): A combination of U and IX locks, as a result of acquiring these locks separately and simultaneously holding both locks.
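The interaction between these modes is usually summarized as a compatibility matrix. The sketch below encodes the standard multigranularity compatibility rules for the five core modes (IS, IX, S, SIX, X); the `COMPAT` table and `compatible` helper are illustrative names, not part of any database API.

```python
# Standard multigranularity lock compatibility: True means the two
# modes can be held concurrently on the same resource. The matrix is
# symmetric, so each unordered pair is stored once.
COMPAT = {
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,
    ("IS", "SIX"): True, ("IS", "X"): False,
    ("IX", "IX"): True,  ("IX", "S"): False,  ("IX", "SIX"): False,
    ("IX", "X"): False,
    ("S", "S"): True,    ("S", "SIX"): False, ("S", "X"): False,
    ("SIX", "SIX"): False, ("SIX", "X"): False,
    ("X", "X"): False,
}

def compatible(held, requested):
    # Look up either ordering of the pair.
    return COMPAT.get((held, requested), COMPAT.get((requested, held)))
```

This reflects the behavior described above: an SIX lock on a table admits concurrent IS requests but blocks S, IX, SIX, and X requests on that table.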

4.12 Deadlock

A system is in a deadlock state if there exists a set of transactions such that every transaction in the set is

waiting for another transaction in the set. More precisely, there exists a set of waiting transactions {T0, T1, . .

., Tn} such that T0 is waiting for a data item that T1 holds, and T1 is waiting for a data item that T2 holds,

and . . ., and Tn−1 is waiting for a data item that Tn holds, and Tn is waiting for a data item that T0 holds.

None of the transactions can make progress in such a situation. The only remedy to this undesirable situation

is for the system to invoke some drastic action, such as rolling back some of the transactions involved in the

deadlock. Rollback of a transaction may be partial: That is, a transaction may be rolled back to the point

where it obtained a lock whose release resolves the deadlock. There are two principal methods for dealing

with the deadlock problem. We can use a deadlock prevention protocol to ensure that the system will never

enter a deadlock state. Alternatively, we can allow the system to enter a deadlock state, and then try to

recover by using a deadlock detection and deadlock recovery scheme. As we shall see, both methods may

result in transaction rollback. Prevention is commonly used if the probability that the system would enter a

deadlock state is relatively high; otherwise, detection and recovery are more efficient. Note that a detection

and recovery scheme requires overhead that includes not only the run-time cost of maintaining the necessary

information and of executing the detection algorithm, but also the potential losses inherent in recovery from

a deadlock.


Deadlock prevention protocols ensure that the system will never enter into a deadlock state. Some

prevention strategies :

Require that each transaction locks all its data items before it begins execution (predeclaration).

Impose partial ordering of all data items and require that a transaction can lock data items only in the order

specified by the partial order (graph-based protocol).

More Deadlock Prevention Strategies

Following schemes use transaction timestamps for the sake of deadlock prevention alone.

(i) wait-die scheme — non-preemptive

older transaction may wait for younger one to release data item. Younger transactions never wait for older

ones; they are rolled back instead.

a transaction may die several times before acquiring needed data item

(ii) wound-wait scheme — preemptive

older transaction wounds (forces rollback) of younger transaction instead of waiting for it. Younger

transactions may wait for older ones.

may be fewer rollbacks than wait-die scheme.

In both the wait-die and the wound-wait scheme, a rolled-back transaction is restarted with its original timestamp. Older transactions thus have precedence over newer ones, and starvation is hence avoided.
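The two rules can be stated compactly in code. In this sketch a smaller timestamp means an older transaction; the function names and return strings are invented for illustration.

```python
# Wait-die (non-preemptive): an older requester waits; a younger
# requester "dies" (is rolled back) instead of waiting.
def wait_die(ts_requester, ts_holder):
    return "wait" if ts_requester < ts_holder else "die"

# Wound-wait (preemptive): an older requester "wounds" the younger
# holder (forces its rollback); a younger requester waits.
def wound_wait(ts_requester, ts_holder):
    return "wound" if ts_requester < ts_holder else "wait"
```

Note that in both schemes the decision depends only on the relative ages of the two transactions, which is why restarting a victim with its original timestamp guarantees it eventually becomes the oldest and cannot starve.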

(iii) Timeout-Based Schemes :

a transaction waits for a lock only for a specified amount of time. After that, the wait times out and the

transaction is rolled back. Thus deadlocks are not possible.

simple to implement; but starvation is possible. Also difficult to determine good value of the timeout

interval.

Deadlock Detection

Deadlocks can be described as a wait-for graph, which consists of a pair G = (V,E),

V is a set of vertices (all the transactions in the system)

E is a set of edges; each element is an ordered pair Ti → Tj.

If Ti → Tj is in E, then there is a directed edge from Ti to Tj, implying that Ti is waiting for Tj to release a data item.

When Ti requests a data item currently being held by Tj, then the edge Ti → Tj is inserted in the wait-for graph. This edge is removed only when Tj is no longer holding a data item needed by Ti.

The system is in a deadlock state if and only if the wait-for graph has a cycle. Must invoke a deadlock-

detection algorithm periodically to look for cycles.

[Figure: a wait-for graph without a cycle (no deadlock) and a wait-for graph with a cycle (deadlock)]
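A minimal deadlock detector is just a cycle search over the wait-for graph. The sketch below, with an invented `has_deadlock` helper, represents the graph as a dictionary mapping each transaction to the set of transactions it waits for, and uses depth-first search to look for a back edge.

```python
# Detect a cycle in the wait-for graph via depth-first search.
# wait_for: {Ti: set of Tj such that Ti is waiting for Tj}.

def has_deadlock(wait_for):
    visited, on_stack = set(), set()

    def dfs(t):
        visited.add(t)
        on_stack.add(t)          # t is on the current DFS path
        for u in wait_for.get(t, ()):
            if u in on_stack:    # back edge: a cycle exists
                return True
            if u not in visited and dfs(u):
                return True
        on_stack.discard(t)
        return False

    return any(dfs(t) for t in wait_for if t not in visited)
```

With T1 waiting for T2, T2 for T3, and T3 for T1, the detector reports a deadlock; dropping any one edge breaks the cycle and the report disappears.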

Deadlock Recovery

When deadlock is detected :

Some transaction will have to be rolled back (made a victim) to break the deadlock. Select as victim the transaction that will incur the minimum cost.

Rollback -- determine how far to roll back transaction

Total rollback: Abort the transaction and then restart it.

More effective to roll back transaction only as far as necessary to break deadlock.

Starvation happens if same transaction is always chosen as victim. Include the number of rollbacks in the

cost factor to avoid starvation.

Insert and Delete Operations

If two-phase locking is used :

A delete operation may be performed only if the transaction deleting the tuple has an exclusive lock on the

tuple to be deleted.

A transaction that inserts a new tuple into the database is given an X-mode lock on the tuple

Insertions and deletions can lead to the phantom phenomenon.

A transaction that scans a relation (e.g., find all accounts in Perryridge) and a transaction that inserts a

tuple in the relation (e.g., insert a new account at Perryridge) may conflict in spite of not accessing any tuple

in common.

If only tuple locks are used, non-serializable schedules can result: the scan transaction may not see the

new account, yet may be serialized before the insert transaction.

The transaction scanning the relation is reading information that indicates what tuples the relation

contains, while a transaction inserting a tuple updates the same information.

The information should be locked.

One solution:

Associate a data item with the relation, to represent the information about what tuples the relation

contains.

Transactions scanning the relation acquire a shared lock on the data item,

Transactions inserting or deleting a tuple acquire an exclusive lock on the data item. (Note: locks on the

data item do not conflict with locks on individual tuples.)

Above protocol provides very low concurrency for insertions/deletions.

Index locking protocols provide higher concurrency while preventing the phantom phenomenon, by

requiring locks on certain index buckets.
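The data-item-per-relation solution above can be sketched as a tiny shared/exclusive lock: scanners take it in shared mode, inserters and deleters in exclusive mode, so a scan and an insert on the same relation conflict even though they touch no tuple in common. The `RelationLock` class is invented for illustration and omits queueing and fairness.

```python
# One lock per relation, protecting the information about which tuples
# the relation contains (the fix for the phantom phenomenon above).

class RelationLock:
    def __init__(self):
        self.sharers = 0        # number of scanning transactions
        self.exclusive = False  # an insert/delete in progress

    def acquire_shared(self):
        """Scanner: succeeds unless an insert/delete holds the lock."""
        if self.exclusive:
            return False
        self.sharers += 1
        return True

    def acquire_exclusive(self):
        """Inserter/deleter: succeeds only if no one else holds the lock."""
        if self.exclusive or self.sharers:
            return False
        self.exclusive = True
        return True
```

As the text notes, this blocks every insert or delete while any scan is running, which is why index-locking protocols, which lock only the affected index buckets, give better concurrency.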

4.13 Serializability

Serializability is the classical concurrency scheme. It ensures that a schedule for executing concurrent

transactions is equivalent to one that executes the transactions serially in some order. It assumes that all

accesses to the database are done using read and write operations. A schedule is called "correct" if we can find a serial schedule that is "equivalent" to it. Given a set of transactions T1...Tn, two schedules S1 and S2

of these transactions are equivalent if the following conditions are satisfied:

Read-Write Synchronization: If a transaction reads a value written by another transaction in one schedule,

then it also does so in the other schedule.

Write-Write Synchronization: If a transaction overwrites the value of another transaction in one schedule, it

also does so in the other schedule.

These two properties ensure that there can be no difference in the effects of the two schedules. As an

example, consider the schedule in Figure 1. It is equivalent to a schedule in which T2 is executed after T1.

There are several approaches to enforcing serializability.

Conflict Serializability

Let us consider a schedule S in which there are two consecutive instructions Ii and Ij of transactions Ti and Tj, respectively (i ≠ j). If Ii and Ij refer to different data items, then we can swap Ii and Ij without affecting the results of any instruction in the schedule. However, if Ii and Ij refer to the same data item Q, then the order of the two steps may matter. Since we are dealing with only read and write instructions, there are four cases that we need to consider:

1. Ii = read(Q), Ij = read(Q). The order of Ii and Ij does not matter, since the same value of Q is read by Ti

and Tj , regardless of the order.

2. Ii = read(Q), Ij = write(Q). If Ii comes before Ij, then Ti does not read the value of Q that is written by Tj

in instruction Ij. If Ij comes before Ii, then Ti reads the value of Q that is written by Tj. Thus, the order of Ii

and Ij matters.

3. Ii = write(Q), Ij = read(Q). The order of Ii and Ij matters for reasons similar to those of the previous case.

4. Ii = write(Q), Ij = write(Q). Since both instructions are write operations, the order of these instructions

does not affect either Ti or Tj . However, the value obtained by the next read(Q) instruction of S is affected,

since the result of only the latter of the two write instructions is preserved in the database. If there is no other

write(Q) instruction after Ii and Ij in S, then the order of Ii and Ij directly affects the final value of Q in the

database state that results from schedule S.
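The four cases above define exactly which operation pairs conflict, which makes conflict serializability mechanically testable: build a precedence graph with an edge Ti → Tj for each conflicting pair in which Ti's operation comes first, and check that the graph is acyclic. This is a hedged sketch with invented names; a schedule is a list of (transaction, operation, item) triples.

```python
# Conflict-serializability test via the precedence graph.
# Edge (ti, tj) means ti's conflicting operation precedes tj's,
# so ti must come before tj in any equivalent serial order.

def conflict_serializable(schedule):
    edges = set()
    for i, (ti, op_i, q_i) in enumerate(schedule):
        for tj, op_j, q_j in schedule[i + 1:]:
            # Conflict: same item, different transactions, and at
            # least one write (cases 2-4 above; case 1 never conflicts).
            if q_i == q_j and ti != tj and "write" in (op_i, op_j):
                edges.add((ti, tj))
    # The schedule is conflict serializable iff the graph is acyclic:
    # repeatedly remove nodes with no incoming edge from live nodes.
    nodes = {t for edge in edges for t in edge}
    while nodes:
        sources = {t for t in nodes
                   if not any(tj == t and ti in nodes for ti, tj in edges)}
        if not sources:
            return False  # every remaining node is on a cycle
        nodes -= sources
    return True
```

For instance, read(A) by T1 followed by write(A) by T2 and then write(A) by T1 yields edges T1 → T2 and T2 → T1, a cycle, so that schedule is not conflict serializable.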

View Serializability

In this section, we consider a form of equivalence that is less stringent than conflict equivalence, but that,

like conflict equivalence, is based on only the read and write operations of transactions.

Consider two schedules S and S′, where the same set of transactions participates in both schedules. The schedules S and S′ are said to be view equivalent if three conditions are met:

1. For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must, in schedule S′, also read the initial value of Q.

2. For each data item Q, if transaction Ti executes read(Q) in schedule S, and if that value was produced by a write(Q) operation executed by transaction Tj, then the read(Q) operation of transaction Ti must, in schedule S′, also read the value of Q that was produced by the same write(Q) operation of transaction Tj.

3. For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule S must perform the final write(Q) operation in schedule S′.

4.14 Transaction Isolation Levels

The isolation level used during the execution of SQL statements determines the degree to which the

activation group is isolated from concurrently executing activation groups. Thus, when activation group P

executes an SQL statement, the isolation level determines:

* The degree to which rows retrieved by P and database changes made by P are available to other

concurrently executing activation groups.

* The degree to which database changes made by concurrently executing activation groups can affect P.

There are many advantages to this approach: read-intensive applications typically want more index

structures, data redundancies, and even other views of data. Transaction processing systems want the best

write throughput while incurring only the most minimal overhead. The access patterns of readers and writers

typically differ: Readers are more prone to larger analysis types of queries, and writers are more prone to

singleton inserts, updates, and deletes. When these activities are separated, the administrator can focus on

recovery strategies for a smaller, more manageable transaction processing system. OLTP databases tend to

be much smaller than data redundant decision-support or analysis-oriented databases.

There are four isolation levels:

1. READ UNCOMMITTED

2. READ COMMITTED

3. REPEATABLE READ

4. SERIALIZABLE

1. Read uncommitted

When it is used, SQL Server does not issue shared locks while reading data. So, a transaction can read uncommitted data that might get rolled back later. This phenomenon is also called a dirty read. This is the lowest isolation level; it ensures only that physically corrupt data will not be read.

2.Read committed

This is the default isolation level in SQL Server. When it is used, SQL Server holds shared locks while reading data. It ensures that physically corrupt data will not be read and that data changed but not yet committed by another application will never be read, but it does not ensure that the data will not be changed before the end of the transaction.

3.Repeatable read

When it is used, dirty reads and nonrepeatable reads cannot occur. Locks are placed on all data that is used in a query, and other transactions cannot update that data.

4.Serializable

This is the most restrictive isolation level. When it is used, phantom reads cannot occur. It prevents other users from updating or inserting rows into the data set until the transaction is complete.
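The four levels above differ in which read anomalies they still permit. The sketch below encodes the standard mapping from the SQL standard; specific engines may be stricter, and the `ANOMALIES` table and `permits` helper are illustrative names only.

```python
# Read anomalies each isolation level still allows (SQL-standard view):
# dirty read, nonrepeatable read, phantom.
ANOMALIES = {
    "READ UNCOMMITTED": {"dirty read", "nonrepeatable read", "phantom"},
    "READ COMMITTED":   {"nonrepeatable read", "phantom"},
    "REPEATABLE READ":  {"phantom"},
    "SERIALIZABLE":     set(),  # no read anomalies permitted
}

def permits(level, anomaly):
    return anomaly in ANOMALIES[level]
```

Reading the table top to bottom, each level rules out one more anomaly than the level above it, which matches the descriptions of the four levels given in the text.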

Summary

A transaction is a unit of program execution that accesses and possibly updates various data items.

Understanding the concept of a transaction is critical for understanding and implementing updates of

data in a database, in such a way that concurrent executions and failures of various forms do not

result in the database becoming inconsistent.

Transactions are required to have the ACID properties: atomicity, consistency, isolation, and

durability.

Atomicity ensures that either all the effects of a transaction are reflected in the database, or none are;

a failure cannot leave the database in a state where a transaction is partially executed.

Consistency ensures that, if the database is initially consistent, the execution of the transaction (by

itself) leaves the database in a consistent state. Isolation ensures that concurrently executing

transactions are isolated from one another, so that each has the impression that no other transaction is

executing concurrently with it. Durability ensures that, once a transaction has been committed, that

transaction’s updates do not get lost, even if there is a system failure.

Concurrent execution of transactions improves throughput of transactions and system utilization, and

also reduces waiting time of transactions. When several transactions execute concurrently in the

database, the consistency of data may no longer be preserved. It is therefore necessary for the system

to control the interaction among the concurrent transactions. Since a transaction is a unit that

preserves consistency, a serial execution of transactions guarantees that consistency is preserved.

A schedule captures the key actions of transactions that affect concurrent execution, such as read and

write operations, while abstracting away internal details of the execution of the transaction. We

require that any schedule produced by concurrent processing of a set of transactions will have an

effect equivalent to a schedule produced when these transactions are run serially in some order.

A system that guarantees this property is said to ensure serializability. There are several different

notions of equivalence leading to the concepts of conflict serializability and view serializability.

Serializability of schedules generated by concurrently executing transactions can be ensured through

one of a variety of mechanisms called concurrency control schemes.

Schedules must be recoverable, to make sure that if transaction a sees the effects of transaction b, and

b then aborts, then a also gets aborted. Schedules should preferably be cascadeless, so that the abort

of a transaction does not result in cascading aborts of other transactions. Cascadelessness is ensured

by allowing transactions to only read committed data.

Key Terms>>Transaction >>ACID properties >>Atomicity

>>Isolation >>Durability >>Concurrent executions

>>Serial execution >>Schedules >>Conflict of operations

>>Conflict equivalence >>Consistency >>Conflict serializability

>>View equivalence >>View serializability

Key Term Quiz
1. A --------------- is a unit of program execution that accesses and possibly updates various data items.

2. --------------- ensures that either all the effects of a transaction are reflected in the database, or none

are.

3. In the --------------- property, after a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.

4. --------------- is the ability of the DBMS to process more than one transaction at a time.

Objective Type Questions
1. A collection of actions that preserves consistency is

a. Concurrency b. Transaction c. Recovery d. None of the above

2. A list of actions from a set of transactions is

a. Recovery b. Schedule c. Concurrency d. None of the above

3. Which of the following captures potential conflicts between transactions in a schedule?

a. Precedence Graph b. Serializability Graph

c. Only a d. Only b e. Both a and b

4. The lock manager tracks lock and unlock requests

a. True b. False

5. Which of the following implements the failure-tolerance of transactions ?

a. Recovery b. Schedule c. Concurrency d. None of the above

6. A schedule is conflict serializable if and only if its precedence graph is cyclic

a. True b. False

7. In ACID property, 'C' stands for

a. Crash Recovery b. Concurrency c. Consistency d. Commit

8. One way of specifying the transaction boundaries is by specifying which type of statements given

below.

a. explicit begin transaction and end transaction

b. explicit before transaction and after transaction

c. explicit start transaction and stop transaction

d. explicit start transaction and end transaction

9. Schedule that does not interleave the actions of different transactions

a. Serializable Schedule b. Equivalent Schedule

c. Serial Schedule d. None of the above

10. Indexed sequential files are composed of an unordered set of sequential records.

a. True b. False

Review Questions
Two Marks Questions

1.Define Transaction?

2.What are the properties of transaction?

3.What is serial schedule?

4.Define serializability?

5.What is conflict serializability?

6.What is view serializability?

7.How will you test conflict serializaility?

8.What is timestamp?

9.What timestamp values are associated with a data item?

10. What are the primitives of transaction?

11. What are the two types of lock modes?

12. List out the concurrency control techniques?

13. What are the different types of failure?

14. What is redo and undo operation?

15. Difference between immediate & deferred update recovery schemes?

16. What is rollback?

Big Questions

1. Explain the four important properties of a transaction that a DBMS must ensure to maintain the database.

2. (i) What is concurrency control? How is it implemented in a DBMS? Explain.

(ii) Explain various recovery techniques during a transaction in detail.

3. (i) Explain the different forms of serializability.

(ii) What are the different types of schedules that are acceptable for recoverability?

4. (i) Explain the two-phase locking protocol and the timestamp-based protocol.

(ii) Write short notes on log-based recovery.

------------- END OF FOURTH UNIT ----------------