Chapter 16 Recovery Yonsei University 1st Semester, 2015 Sanghyun Park

Outline

Failure Classification
Recovery and Atomicity
Log-Based Recovery
Shadow Paging
Recovery with Concurrent Transactions
Buffer Management

Failure Classification

Transaction failure

  Logical errors: the transaction cannot complete due to some logical error condition

  System errors: the database system must terminate an active transaction due to an error condition (e.g., deadlock)

System crash: a power failure or other hardware or software failure causes the system to crash

Disk failure: a head crash or similar disk failure destroys all or part of disk storage

Recovery Algorithms

Recovery algorithms are techniques to ensure DB consistency, transaction atomicity, and durability despite failures

Recovery algorithms have two parts

  Actions taken during normal transaction processing to ensure enough information exists to recover from failures

  Actions taken after a failure to recover the database contents to a state that ensures atomicity, consistency, and durability

Data Access (1/2)

Physical blocks are those blocks residing on the disk

Buffer blocks are those blocks residing temporarily in main memory

Block movements between disk and main memory are initiated through input(B) and output(B) operations

Each transaction Ti has its private work-area in which local copies of all data items accessed and updated by Ti are kept

We assume, for simplicity, that each data item fits in a single block

Data Access (2/2)

A transaction transfers data items between buffer blocks and its private work-area using read(X) and write(X) operations

Transactions

  Perform read(X) when accessing X for the first time

  All subsequent accesses are to the local copy

  After the last access, the transaction executes write(X)

Output(BX) does not need to follow write(X) immediately
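
The four operations can be sketched in Python as follows. This is only an illustrative model under stated assumptions: the dictionaries disk, buffer, and work_area and the block name "BX" are hypothetical and do not come from the slides.

```python
# Hypothetical in-memory model; all names are illustrative assumptions.
disk = {"BX": {"X": 100}}   # physical blocks residing on disk
buffer = {}                 # buffer blocks residing in main memory
work_area = {}              # private work area of one transaction

def input_block(b):
    """input(B): bring physical block B into the buffer."""
    buffer[b] = dict(disk[b])

def output_block(b):
    """output(B): write buffer block B back to disk."""
    disk[b] = dict(buffer[b])

def read(x, b):
    """read(X): copy X from buffer block b into the work area."""
    if b not in buffer:
        input_block(b)
    work_area[x] = buffer[b][x]

def write(x, b):
    """write(X): copy the local value of X into buffer block b."""
    if b not in buffer:
        input_block(b)
    buffer[b][x] = work_area[x]

read("X", "BX")                  # first access to X
work_area["X"] -= 50             # later accesses use the local copy
write("X", "BX")                 # after the last access
on_disk_after_write = disk["BX"]["X"]    # disk is unchanged so far
output_block("BX")               # may happen at any later time
on_disk_after_output = disk["BX"]["X"]
```

The disk copy changes only at output_block, which is why a crash between write(X) and output(BX) can lose the update.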

Example of Data Access

[Figure: disk blocks A and B; main-memory buffer holding Buffer Block A and Buffer Block B, transferred by input(A) and output(B); work areas of T1 and T2 holding local copies x1, y1, and x2, accessed via read(X) and write(Y)]

Recovery and Atomicity (1/2)

Modifying the database without ensuring that the transaction will commit may leave the database in an inconsistent state

Consider transaction Ti that transfers $50 from account A to account B; the goal is either to perform all database modifications made by Ti or none at all

Several output operations may be required for Ti. A failure may occur after one of these modifications has been made but before all of them are made

Recovery and Atomicity (2/2)

To ensure atomicity despite failures, we first output information describing the modifications to stable storage without modifying the database itself

We study two approaches

  Log-based recovery

  Shadow paging

We assume (initially) that transactions run serially, that is, one after the other

Log-Based Recovery (1/2)

The log is a sequence of log records, recording all the update activities in the database

When transaction Ti starts, it registers itself by writing a <Ti start> log record

Before Ti executes write(X), a log record <Ti, X, V1, V2> is written, where V1 is an old value and V2 is a new value

Log-Based Recovery (2/2)

When Ti finishes its last statement, the log record <Ti commit> is written

We assume for now that log records are written directly to stable storage (that is, they are not buffered)

Two approaches using logs: deferred or immediate database modification

Deferred Database Modification (1/4)

The deferred database modification scheme records all database modifications in the log, but defers all the writes until the transaction partially commits

Transaction starts by writing <Ti start> record to log

A write(X) operation by Ti results in the writing of a new record to the log

The write is not performed on X at this time, but deferred

Deferred Database Modification (2/4)

When Ti partially commits, a record <Ti commit> is written to the log

Finally, the log records are read and used to actually execute the previously deferred writes

During recovery after a crash, a transaction Ti needs to be redone if and only if both <Ti start> and <Ti commit> are in the log

Redoing a transaction Ti (redo Ti) sets the value of all data items updated by the transaction to the new values

Deferred Database Modification (3/4)

Crashes can occur while

  The transaction is executing the original updates

  Recovery action is being taken

Example transactions T0 and T1 (T0 executes before T1)

T0: read(A)
    A := A - 50
    write(A)
    read(B)
    B := B + 50
    write(B)

T1: read(C)
    C := C - 100
    write(C)

Deferred Database Modification (4/4)

Consider the log as it appears at three instants in time: (a) before <T0 commit> is written, (b) after <T0 commit> but before <T1 commit>, and (c) after <T1 commit>

If the log on stable storage at the time of the crash is as in case:

  (a) No redo actions need to be taken

  (b) Redo(T0) must be performed since <T0 commit> is present

  (c) Redo(T0) must be performed followed by redo(T1) since both <T0 commit> and <T1 commit> are present
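
Case (b) can be replayed with a small Python sketch. The tuple encoding of log records, the function name recover_deferred, and the initial values A = 1000, B = 2000, C = 700 are assumptions made for illustration (the values match those used in the immediate-modification example); update records carry only the new value, since the deferred scheme never needs to undo.

```python
def recover_deferred(log, db):
    """Redo every transaction that has both <start> and <commit> in the log."""
    started = {rec[1] for rec in log if rec[0] == "start"}
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    to_redo = started & committed
    for rec in log:                      # forward scan preserves log order
        if rec[0] == "update" and rec[1] in to_redo:
            _, _, item, new_value = rec
            db[item] = new_value         # setting a value is idempotent
    return db

# Log at the time of the crash in case (b): T0 committed, T1 did not.
log_b = [
    ("start", "T0"),
    ("update", "T0", "A", 950),
    ("update", "T0", "B", 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("update", "T1", "C", 600),
]
db = {"A": 1000, "B": 2000, "C": 700}    # no deferred write reached the database
recover_deferred(log_b, db)
first_pass = dict(db)
recover_deferred(log_b, db)              # a crash during recovery just repeats it
```

T1 is simply ignored: none of its writes were applied, so there is nothing to undo.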

Immediate Database Modification (1/4)

The immediate modification scheme allows database modifications to be output to the database while the transaction is still in the active stage

Update log record must be written before database item is written

Output of updated buffer blocks can take place at any time before or after transaction commit

Order in which blocks are output can be different from the order in which they are written

Immediate Database Modification (2/4)

Example (BX denotes the block containing X)

Log                      Write        Output

<T0 start>
<T0, A, 1000, 950>
<T0, B, 2000, 2050>
                         A = 950
                         B = 2050
<T0 commit>
<T1 start>
<T1, C, 700, 600>
                         C = 600
                                      BB, BC
<T1 commit>
                                      BA

Immediate Database Modification (3/4)

The recovery procedure has two operations instead of one:

  Undo(Ti) restores the value of all data items updated by Ti to their old values, going backward from the last log record for Ti

  Redo(Ti) sets the value of all data items updated by Ti to the new values, going forward from the first log record for Ti

Both operations must be idempotent

  That is, even if the operation is executed multiple times, the effect is the same as if it is executed once

Immediate Database Modification (4/4)

When recovering after failure:

  Transaction Ti needs to be undone if the log contains the record <Ti start>, but does not contain the record <Ti commit>

  Transaction Ti needs to be redone if the log contains both the record <Ti start> and the record <Ti commit>

Undo operations are performed first, then redo operations

Immediate Database Modification Recovery Example

Consider the log as it appears at three instants in time: (a) T0 has logged its updates but not <T0 commit>, (b) T0 has committed and T1 has logged its update but not <T1 commit>, and (c) both T0 and T1 have committed

Recovery actions in each case are:

  (a) Undo(T0): B is restored to 2000 and A to 1000

  (b) Undo(T1) and redo(T0): C is restored to 700, and then A and B are set to 950 and 2050 respectively

  (c) Redo(T0) and redo(T1): A and B are set to 950 and 2050 respectively, then C is set to 600
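
A minimal Python sketch of case (b), assuming a hypothetical tuple encoding ("update", Ti, X, V1, V2) for update records. The database state at the time of the crash is also an assumption: here BA and BC had been output but BB had not.

```python
def undo(log, db, t):
    """Restore old values of t, going backward from its last log record."""
    for rec in reversed(log):
        if rec[0] == "update" and rec[1] == t:
            db[rec[2]] = rec[3]          # V1, the old value

def redo(log, db, t):
    """Set new values of t, going forward from its first log record."""
    for rec in log:
        if rec[0] == "update" and rec[1] == t:
            db[rec[2]] = rec[4]          # V2, the new value

def recover_immediate(log, db):
    started = [rec[1] for rec in log if rec[0] == "start"]
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for t in reversed(started):          # undo operations come first
        if t not in committed:
            undo(log, db, t)
    for t in started:                    # then redo operations
        if t in committed:
            redo(log, db, t)
    return db

log_b = [
    ("start", "T0"),
    ("update", "T0", "A", 1000, 950),
    ("update", "T0", "B", 2000, 2050),
    ("commit", "T0"),
    ("start", "T1"),
    ("update", "T1", "C", 700, 600),
]
db = {"A": 950, "B": 2000, "C": 600}     # assumed on-disk state at the crash
recover_immediate(log_b, db)
first_pass = dict(db)
recover_immediate(log_b, db)             # idempotent: second run changes nothing
```

Because undo and redo only assign logged values, running recovery again after a crash during recovery yields the same state.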

Checkpoints (1/2)

Problems in the recovery procedure discussed earlier:

  Searching the entire log is time-consuming

  Most of the transactions that need to be redone have already written their updates into the database

To reduce these types of overhead, the system periodically performs checkpoints, which require the following sequence of actions to take place

  Output onto stable storage all log records currently residing in main memory

  Output to the disk all modified buffer blocks

  Output onto stable storage a log record <checkpoint>
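
The three checkpoint actions can be sketched in Python as below. The containers stable_log, log_buffer, dirty_blocks, and disk are hypothetical stand-ins for stable storage, the in-memory log, the modified buffer blocks, and the database on disk.

```python
# Hypothetical state just before a checkpoint; names are illustrative.
stable_log = []                                   # log records on stable storage
log_buffer = [("start", "T1"),
              ("update", "T1", "A", 1000, 950)]   # log records still in memory
dirty_blocks = {"BA": {"A": 950}}                 # modified buffer blocks
disk = {"BA": {"A": 1000}}                        # database blocks on disk

def checkpoint():
    # 1. Output all log records residing in main memory to stable storage.
    stable_log.extend(log_buffer)
    log_buffer.clear()
    # 2. Output all modified buffer blocks to disk.
    for name, block in dirty_blocks.items():
        disk[name] = dict(block)
    dirty_blocks.clear()
    # 3. Output a <checkpoint> record to stable storage.
    stable_log.append(("checkpoint",))

checkpoint()
```

The order matters: the log records reach stable storage before the blocks they describe, and the <checkpoint> record is written only after both.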

Checkpoints (2/2)

During recovery we need to consider only the most recent transaction Ti that started before the checkpoint, and transactions that started after Ti

  Scan backward from the end of the log to find the most recent <checkpoint> record

  Continue scanning backward until a record <Ti start> is found

  Only the part of the log from this start record onward needs to be considered

  For all transactions starting from Ti or later with no <Ti commit>, execute undo(Ti)

  Scanning forward in the log, for all transactions starting from Ti or later with <Ti commit>, execute redo(Ti)

Example of Checkpoints

[Figure: timeline of transactions T1 to T4 relative to the checkpoint at time Tc and the system failure at time Tf]

T1 can be ignored (updates already output to disk due to checkpoint)

T2 and T3 redone

T4 undone
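
The classification of T1 to T4 can be reproduced with a short Python sketch. The log below is an assumed serial ordering consistent with the example (T2 starts before the checkpoint and commits after it; T4 is still active at the failure); the function name and tuple encoding are hypothetical.

```python
def transactions_to_recover(log):
    """Return (undo_list, redo_list) for the serial checkpoint scheme."""
    # Scan backward from the end for the most recent <checkpoint> record.
    i = len(log) - 1
    while i >= 0 and log[i][0] != "checkpoint":
        i -= 1
    # Continue backward until a <Ti start> record is found.
    j = i
    while j >= 0 and log[j][0] != "start":
        j -= 1
    tail = log[max(j, 0):]               # only this part of the log matters
    started = [rec[1] for rec in tail if rec[0] == "start"]
    committed = {rec[1] for rec in tail if rec[0] == "commit"}
    undo_list = [t for t in started if t not in committed]
    redo_list = [t for t in started if t in committed]
    return undo_list, redo_list

log = [
    ("start", "T1"), ("commit", "T1"),
    ("start", "T2"),
    ("checkpoint",),
    ("commit", "T2"),
    ("start", "T3"), ("commit", "T3"),
    ("start", "T4"),                     # system failure happens here
]
undo_list, redo_list = transactions_to_recover(log)
```

T1 appears in neither list because its records lie entirely before the start record found by the backward scan.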

Page And Page Table

The database is partitioned into some number of fixed-length blocks, which are referred to as pages

The pages do not need to be stored in any particular order on disk

We use a page table to find the ith page of the database for a given i

Sample Page Table

Shadow Paging (1/6)

Shadow paging is an alternative to log-based recovery; this scheme is useful if transactions execute serially

The key idea is to maintain two page tables during the lifetime of a transaction: the current page table and the shadow page table

Both page tables are identical when the transaction starts

The shadow page table is never changed over the duration of the transaction; the current page table may be changed when a transaction performs a write operation

All input and output operations use the current page table to locate database pages on disk

Shadow Paging (2/6)

Suppose that the transaction Tj performs a write(X), and that X resides on the ith page

The system executes the write operation as follows:

If the ith page is not already in main memory, then the system issues input(X)

Shadow Paging (3/6)

If this is the first write performed on the ith page by this transaction, then the system modifies the current page table as follows:

  It finds an unused page on disk

  It copies the contents of the ith page to the page found in the above step

  It modifies the current page table so that the ith entry points to the page found in the above step

It assigns the value of xj to X in the buffer page

Shadow Paging (4/6)

Example: shadow and current page tables after write to page 4

Shadow Paging (5/6)

To commit a transaction:

  Flush all modified pages in main memory to disk

  Output the current page table to disk

  Make the current page table the new shadow page table

Once the pointer to the shadow page table has been written, the transaction is committed

No recovery is needed after a crash: new transactions can start right away, using the shadow page table

Pages not pointed to from the current/shadow page table should be freed (garbage collected)
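
The write and commit steps of shadow paging can be sketched in Python. This is a simplified model under stated assumptions: each page holds a single value, the buffer is omitted, and replacing the list shadow_table stands in for atomically writing the pointer to the shadow page table. All names are hypothetical.

```python
pages = {0: "a0", 1: "b0", 2: "c0"}   # disk pages, keyed by disk address
shadow_table = [0, 1, 2]              # entry i gives the disk address of page i
current_table = list(shadow_table)    # identical when the transaction starts
copied = set()                        # pages already copied by this transaction

def write_page(i, value):
    if i not in copied:               # first write to page i by this transaction
        new_addr = max(pages) + 1     # find an unused page on disk
        pages[new_addr] = pages[current_table[i]]   # copy the old contents
        current_table[i] = new_addr   # only the current page table changes
        copied.add(i)
    pages[current_table[i]] = value

def commit():
    global shadow_table
    shadow_table = list(current_table)   # stands in for the pointer write
    copied.clear()

write_page(1, "b1")
seen_via_shadow_before = pages[shadow_table[1]]   # old version is untouched
seen_via_current = pages[current_table[1]]
commit()
seen_via_shadow_after = pages[shadow_table[1]]
```

If a crash happens before commit, the shadow table still points at the old pages, so no recovery work is needed; the copied page simply becomes garbage.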

Shadow Paging (6/6)

Advantages of shadow paging over log-based schemes

  No overhead of writing log records

  Recovery is trivial

Disadvantages

  Copying the entire page table is very expensive

  Commit overhead is high

  Data gets fragmented

  After every transaction completion, the database pages containing old versions of modified data need to be garbage collected

  Hard to extend the algorithm to allow transactions to run concurrently

Recovery With Concurrent Transactions (1/3)

We modify the log-based recovery schemes to allow multiple transactions to execute concurrently; they share a single buffer and a single log

Logging is done as described earlier

The checkpoint technique and actions taken on recovery have to be changed

Recovery With Concurrent Transactions (2/3)

Checkpoints are performed as before, except that the checkpoint log record is now of the form <checkpoint L>, where L is the list of transactions active at the time of the checkpoint

When the system recovers, it first does the following:

  Initialize undo-list and redo-list to empty

  Scan the log backward from the end, stopping when the first <checkpoint L> record is found

  For each record found during the backward scan:

    If the record is <Ti commit>, add Ti to redo-list

    If the record is <Ti start> and Ti is not in redo-list, add Ti to undo-list

  For every Ti in L, if Ti is not in redo-list, add Ti to undo-list

Recovery With Concurrent Transactions (3/3)

At this point, undo-list consists of incomplete transactions that must be undone, and redo-list consists of finished transactions that must be redone

Recovery now continues as follows:

  Scan the log backward from the end, and perform undo for each log record that belongs to a transaction Ti on undo-list, stopping when <Ti start> records have been found for every Ti in undo-list

  Locate the most recent <checkpoint L> record

  Scan the log forward from the <checkpoint L> record, and perform redo for each log record that belongs to a transaction Ti on redo-list

It is important to undo the transactions in undo-list before redoing transactions in redo-list (example in textbook)
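
The whole procedure (building the two lists, the undo pass, and the redo pass) can be sketched in Python. The tuple encoding of log records, the sample log, and the assumed on-disk state at the crash are all hypothetical illustrations, not taken from the slides.

```python
def recover(log, db):
    undo_list, redo_list = set(), set()
    cp = None
    # Backward scan, stopping at the most recent <checkpoint L> record.
    for idx in range(len(log) - 1, -1, -1):
        rec = log[idx]
        if rec[0] == "commit":
            redo_list.add(rec[1])
        elif rec[0] == "start" and rec[1] not in redo_list:
            undo_list.add(rec[1])
        elif rec[0] == "checkpoint":
            cp = idx
            undo_list |= set(rec[1]) - redo_list   # every Ti in L not in redo-list
            break
    # Undo pass: backward until <Ti start> is seen for every Ti in undo-list.
    pending = set(undo_list)
    for rec in reversed(log):
        if not pending:
            break
        if rec[0] == "update" and rec[1] in undo_list:
            db[rec[2]] = rec[3]                    # restore the old value
        elif rec[0] == "start":
            pending.discard(rec[1])
    # Redo pass: forward from the <checkpoint L> record.
    for rec in log[cp if cp is not None else 0:]:
        if rec[0] == "update" and rec[1] in redo_list:
            db[rec[2]] = rec[4]                    # set the new value
    return undo_list, redo_list

log = [
    ("start", "T1"),
    ("update", "T1", "A", 100, 90),
    ("start", "T2"),
    ("update", "T2", "B", 200, 210),
    ("checkpoint", ("T1", "T2")),
    ("update", "T1", "C", 300, 310),
    ("commit", "T1"),
    ("start", "T3"),
    ("update", "T3", "D", 400, 390),
]
db = {"A": 90, "B": 210, "C": 300, "D": 390}       # assumed state at the crash
undo_list, redo_list = recover(log, db)
```

T2 lands in undo-list only through the list L in the checkpoint record, because its <T2 start> record lies before the checkpoint and is never reached by the first backward scan.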

Log Record Buffering (1/2)

Log record buffering: log records are buffered in main memory, instead of being output directly to stable storage

  Log records are output to stable storage when a block of log records in the buffer is full, or a log force operation is executed

Log force is performed to commit a transaction by forcing all its log records to stable storage

Several log records can thus be output using a single output operation, reducing the I/O cost

Log Record Buffering (2/2)

The rules below must be followed if log records are buffered

  Log records are output to stable storage in the order in which they are created

  Transaction Ti enters the commit state only when the log record <Ti commit> has been output to stable storage

  Before a block of data in main memory is output to the database, all log records pertaining to data in that block must have been output to stable storage

    This rule is called the write-ahead logging or WAL rule
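
The three rules can be illustrated with a small Python sketch. It is a deliberately conservative simplification: output_block forces the entire log buffer rather than only the records pertaining to that block. The names stable_log, log_buffer, log_force, and output_block are hypothetical.

```python
stable_log = []     # log records already on stable storage
log_buffer = []     # log records still buffered in main memory
disk = {}           # database blocks on disk

def log_append(rec):
    log_buffer.append(rec)

def log_force():
    """Output buffered records to stable storage in creation order."""
    stable_log.extend(log_buffer)
    log_buffer.clear()

def commit(t):
    log_append(("commit", t))
    log_force()     # Ti is committed only once <Ti commit> is on stable storage

def output_block(name, block):
    # WAL rule: log records for this block must reach stable storage first.
    # Simplification: force the whole buffer instead of selecting records.
    log_force()
    disk[name] = dict(block)

log_append(("start", "T0"))
log_append(("update", "T0", "A", 1000, 950))
output_block("BA", {"A": 950})
log_after_output = list(stable_log)
commit("T0")
```

Forcing the whole buffer trivially preserves creation order; a finer-grained implementation would still have to output every record up to the last one that concerns the block.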