Database Backup Recovery

B.H.GARDI COLLEGE OF ENGINEERING & TECHNOLOGY (MCA Dept.)

Database Backup & Recovery Database Management System - 2

Prepared By :- Ajay A. Ardeshana

Email :- [email protected]

Mobile :- 9558820298

Page # 1

Introduction :-

• Concurrency control and database recovery are both part of transaction management.

• Recovery is required to protect the database from data inconsistencies and data loss.

• It ensures the atomicity and durability properties of transactions.

• This characteristic of the DBMS helps it to recover from a failure and restore the database to a consistent state.

Database Recovery Concepts :-

• Database recovery is the process of restoring the database to a correct state in the event of a failure.

• It restores the database to the most recent consistent state that existed shortly before the time of the system failure.

• The failure may be the result of a system crash due to hardware or software errors, a media failure such as a head crash, or a software error in the application, such as a logical error in a program that is accessing the database.

• A number of recovery techniques are used, and they are based on the atomicity property of transactions.

• A transaction is considered a single unit of work in which all operations must be applied and completed to produce a consistent database.

• If, for some reason, any transaction operation cannot be completed, the transaction must be aborted and any changes to the database must be rolled back (undone).

• Thus, transaction recovery reverses all the changes that the transaction has made to the database before it was aborted.

• If the entire database needs to be recovered to a consistent state, the recovery uses the most recent backup copy of the database in a known consistent state.

• The backup copy is then rolled forward to restore all subsequent transactions by using the transaction log information.

• If the database needs to be recovered but the committed portion of the database is still usable, the recovery process uses the transaction log to undo all the transactions that were not committed.
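The abort-and-roll-back behaviour described above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the table name `account`, the rows A and B, and the helper names `make_db`, `transfer` and `balances` are assumptions made for illustration and do not come from these notes.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical balances."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [("A", 70000), ("B", 80000)])
    conn.commit()
    return conn


def transfer(conn, amount, fail_midway=False):
    """Add `amount` to A and subtract it from B as one unit of work."""
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'A'",
                     (amount,))
        if fail_midway:
            # Simulated failure: only one of the two operations was applied.
            raise RuntimeError("simulated failure between the two updates")
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'B'",
                     (amount,))
        conn.commit()        # both operations become permanent together
        return True
    except RuntimeError:
        conn.rollback()      # undo the partial change made by this transaction
        return False


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account").fetchall())
```

Calling `transfer(conn, 20000, fail_midway=True)` leaves both balances at their original values, while a call without the simulated failure applies both updates together.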

Types of Database Failure :-

• There are many types of failure that can affect database processing.

• Some failures affect the main memory only, while others involve secondary storage.






1. Hardware Failure :-

• Hardware failure may include memory errors, disk crashes, bad disk sectors, disk full error and so on.

• Hardware failure can also be attributed to design errors, poor quality control during fabrication, overloading and wear out of mechanical parts.

2. Software Failure :-

• Software failure may include failures related to software such as operating systems, DBMS software, application programs and so on.

3. System Crash :-

• System crashes are due to hardware or software errors, resulting in the loss of main memory.

• This could be a situation in which the system has entered an undesirable state, such as deadlock, which prevents the program from continuing with normal processing.

4. Network Failure :-

• Network failure can occur while using a client-server configuration or a distributed database system where multiple database servers are connected by a common network.

• Network failure such as communication software failure or aborted asynchronous connections will interrupt the normal operation of the database system.

5. Media Failure :-

• Such failures are due to head crashes or unreadable media, resulting in the loss of parts of secondary storage.

• They are the most dangerous failures.

6. Application Software Error :-

• These are logical errors in the program that is accessing the database, which cause one or more transactions to fail.

7. Natural Physical Disasters :-

• These are failures such as fire, floods, earthquakes or power failure.

8. Carelessness :-

• This is failure due to unintentional destruction of data or facilities by operators or users.






9. Sabotages :-

• These are failures due to intentional corruption or destruction of data, hardware or software facilities.

Types of Database Recovery :-

• In case of any type of failure, a transaction must be either aborted or committed to maintain data integrity.

• The transaction log plays an important role in database recovery and in bringing the database to a consistent state.

• During recovery from failure, the recovery manager ensures that either all the effects of a given transaction are permanently recorded in the database or none of them are recorded.

• A transaction begins with the successful execution of a BEGIN TRANSACTION statement and ends with the successful execution of a COMMIT statement.

• The following two types of transaction recovery are used :
    • Forward Recovery
    • Backward Recovery

1. Forward Recovery :-

• Forward recovery is the recovery procedure used in case of physical damage, for example failure of secondary storage, failure while writing data to the database buffers, or failure while transferring the buffers to secondary storage.

• The intermediate results of the transaction are written to the database buffers. The database buffers occupy an area in main memory. From these buffers, the data are transferred to the secondary storage of the database.

• The update operation is regarded as permanent only when the buffers are flushed to the secondary storage of the database.

• The flushing operation can be triggered by the COMMIT operation of the transaction, or automatically when the buffers become full.

2. Backward Recovery :-

• Backward recovery is the recovery procedure used when an error occurs in the middle of normal operation on the database.

• If the transaction had not committed at the time of failure, it will cause an inconsistency in the database, because other programs may have read the incorrect data and made use of it.






• Then the recovery manager must undo (roll back) any effects of the transaction on the database.

• Backward recovery guarantees the atomicity property of transactions.

• In backward recovery, the recovery starts with the database in its current state and the transaction log positioned at the last entry that was made in it.

• Then a program reads ‘backward’ through the log, resetting each updated data value in the database to its previous value as recorded in the transaction log, until it reaches the point where the error was made.

• Thus the program undoes each transaction in the reverse order from that in which it was made.
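The backward scan described above can be sketched in a few lines. This is an illustration only: the log layout (a list of `(item, old_value, new_value)` tuples), the dictionary standing in for the database, and the name `backward_recover` are all assumptions, not the format of any particular DBMS.

```python
def backward_recover(db, log, error_index):
    """Undo log entries from the last one back to `error_index` (inclusive).

    db          -- dict standing in for the database (item -> value)
    log         -- list of (item, old_value, new_value) in the order written
    error_index -- position in the log where the error was made
    """
    # Start at the last log entry and walk toward the point of the error.
    for i in range(len(log) - 1, error_index - 1, -1):
        item, old_value, _new_value = log[i]
        db[item] = old_value   # reset the item to its previous value
    return db
```

Because the scan runs from the newest entry to the oldest, an item that was updated several times ends up with the value it had before the earliest undone update.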






• ts : Starting time of transaction
• tc : Time of disk crash
• tf : Time of transaction failure

• In this example all the transactions T1, T2, … T6 are executing concurrently.

• Let us assume that the data for transactions T2 and T4 have already been written to the disk before the failure at time tf.

• It can be observed that transactions T1 and T6 had not committed at the point of the disk crash. Therefore the recovery manager must undo transactions T1 and T6 at restart time.

• However, it is not clear to what extent the changes made by the other, already committed transactions T2, T3, T4 and T5 have been propagated to the database on secondary storage.

• This uncertainty arises because the buffers may or may not have been flushed to secondary storage.

• Thus, the recovery manager would be forced to redo transactions T2, T3, T4 and T5.

Recovery Techniques :-

• The database recovery techniques depend on the type and extent of damage that has occurred to the database.

• These techniques are based on the atomicity property of transactions.

• The following two types of damage can occur to the database.

Physical Damage :-

• If the database has been physically damaged, for example a disk crash has occurred, then the last backup copy of the database is restored and the update operations of committed transactions are reapplied using the transaction log file.

• Note that restoration in this case is possible only if the transaction log has not been damaged.
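Restoring a backup and reapplying committed updates from the log might look like the following sketch. The record shapes `("BEGIN", txn)`, `("WRITE", txn, item, new_value)` and `("COMMIT", txn)`, and the function name `roll_forward`, are assumptions chosen for this illustration.

```python
def roll_forward(backup, log):
    """Rebuild the database from a backup copy plus the transaction log.

    backup -- dict holding the last backup copy (item -> value)
    log    -- list of records: ("BEGIN", txn), ("WRITE", txn, item, value),
              ("COMMIT", txn), in the order they were written
    """
    # Only transactions with a COMMIT record are reapplied.
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}

    db = dict(backup)  # restore the backup copy; the backup itself is untouched
    for rec in log:
        if rec[0] == "WRITE" and rec[1] in committed:
            _kind, _txn, item, new_value = rec
            db[item] = new_value  # reapply the committed update
    return db
```

Writes that belong to a transaction without a COMMIT record are skipped, so the rebuilt database reflects committed work only.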

Non-Physical or Transaction Failure :-

• If the database has become inconsistent due to a system crash during the execution of transactions, then the changes that caused the inconsistency are rolled back (undone).

• It may also be necessary to roll forward (redo) some transactions to ensure that the updates performed by them have reached secondary storage.






• In this case the database is restored to a consistent state using the before-images and after-images held in the transaction log file.

• This technique is also known as the log-based recovery technique.

• For this, the following two techniques are used :
    • Deferred Update
    • Immediate Update

1. Deferred Update :-

• In the deferred update technique, updates are not written to the database until after a transaction has reached its COMMIT point. In other words, the updates to the database are deferred (postponed) until the transaction completes its execution successfully and reaches its commit point.

• During transaction execution, the updates are recorded only in the transaction log and in the cache buffers.

• After the transaction reaches its commit point and the transaction log is force-written to disk, the updates are recorded in the database.

• If a transaction fails before it reaches this point, it will not have modified the database, so no undoing of changes will be necessary. However, it may be necessary to redo the updates of committed transactions, as their effects may not have reached the database.

• In deferred update, the transaction log file is used in the following ways :

    • When a transaction T begins, a transaction begin record <T, BEGIN> is written to the transaction log.

    • During the execution of transaction T, a new log record is written for each write operation. E.g. a new value ai for attribute A is written as “<WRITE(A,ai)>”. Each record consists of the transaction name T, the attribute name A and the new value ai for attribute A.

    • When all the operations comprising transaction T have completed successfully, we say that transaction T partially commits and the record “<T,COMMIT>” is written to the transaction log. After transaction T partially commits, the records associated with T in the transaction log are used to execute the actual updates by writing to the appropriate records in the database. If transaction T aborts, its transaction log records are ignored and the writes are not performed.






• Example :-

• Consider a transaction that updates an attribute called the employee’s loan balance (EMP_LOAN_BAL) in table EMPLOYEE.

• Assume that the current balances are EMP_LOAN_BAL = 70000 and CUR_LOAN_CASH_BAL = 80000. In the steps below, A denotes EMP_LOAN_BAL and B denotes CUR_LOAN_CASH_BAL.

• Now a transaction takes place for making a loan payment of 20000 to the employee.

Time        Transaction          Action
Time – 1    READ (A,a1)          Read Current Loan Balance
Time – 2    a1 := a1 + 20000     Increase Loan Balance by 20000
Time – 3    WRITE (A,a1)         Write New (Updated) Loan Balance
Time – 4    READ (B,b1)          Read Current Loan Cash Balance
Time – 5    b1 := b1 – 20000     Reduce Loan Cash Balance by 20000
Time – 6    WRITE (B,b1)         Write New (Updated) Loan Cash Balance

• After a failure has occurred, the DBMS examines the transaction log to determine which transactions need to be redone.

• If the transaction log contains both the start record <T,BEGIN> and the commit record <T,COMMIT> for transaction T, the transaction T must be redone.

• That means the database may have been corrupted, but the transaction execution was completed and the new values for the relevant data items are contained in the transaction log.

• Therefore the transaction needs to be reprocessed.

• Redo sets the value of all data items updated by transaction T to the new values that are recorded in the transaction log.

Time                           Log Entry          Database Stored Value
Before Start of Transaction                       A = 70000, B = 80000
Time – 1                       <T, BEGIN>
Time – 2                       <T, A, 90000>
Time – 3                       <T, B, 60000>
Time – 4                       <T, COMMIT>
After Transaction                                 A = 90000, B = 60000

Database Update Log Entries for Transaction T
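The deferred update rule (log first, touch the database only after COMMIT, redo committed work on restart) can be modelled with a small class. This is a sketch under assumptions: the class name `DeferredUpdateDB`, its methods, and the `crash_before_apply` flag used to simulate a failure are invented for illustration, and the log is assumed to survive the crash.

```python
class DeferredUpdateDB:
    """Toy model of deferred update; not how any real DBMS is implemented."""

    def __init__(self, data):
        self.data = dict(data)  # stands in for the database on secondary storage
        self.log = []           # transaction log, assumed to survive a crash

    def begin(self, txn):
        self.log.append((txn, "BEGIN"))

    def write(self, txn, item, new_value):
        # Deferred update: record the new value in the log only.
        self.log.append((txn, item, new_value))

    def commit(self, txn, crash_before_apply=False):
        self.log.append((txn, "COMMIT"))
        if not crash_before_apply:
            self._apply(txn)    # now the logged updates reach the database

    def _apply(self, txn):
        # Write records are the 3-tuples (txn, item, new_value).
        for rec in self.log:
            if len(rec) == 3 and rec[0] == txn:
                self.data[rec[1]] = rec[2]

    def recover(self):
        # REDO every transaction that has a COMMIT record; ignore the rest.
        committed = [rec[0] for rec in self.log if rec[1:] == ("COMMIT",)]
        for txn in committed:
            self._apply(txn)
```

With the log entries from the table above, a crash right after COMMIT leaves A and B at 70000 and 80000 until `recover()` redoes T, after which they become 90000 and 60000. A transaction with no COMMIT record is simply ignored, so no undo step is needed.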

• Now let us assume that a database failure occurs under one of the following conditions :

    • Just after the COMMIT record is entered in the transaction log and before the updated records are written to the database.

    • Just before the execution of the WRITE operation.






Time                           Log Entry          Database Stored Value
Before Start of Transaction                       A = 70000, B = 80000
Time – 1                       <T, BEGIN>
Time – 2                       <T, A, 90000>
Time – 3                       <T, B, 60000>
Time – 4                       <T, COMMIT>

Failure occurs just after the COMMIT record is entered and before the updated records are written into the database.

• If the failure occurs just after the <T, COMMIT> record is entered into the transaction log and before the updated records are written into the database, then when the system comes back up the log contains both the <T, BEGIN> and <T, COMMIT> records for transaction T.

• The REDO operation is therefore executed, resulting in the values 90000 and 60000 being written to the database as the updated values of A and B.

• If the failure occurs just before the execution of the WRITE operation, then when the system comes back up no action is necessary, because no COMMIT record for transaction T appears in the transaction log.

• So the values of A and B in the database remain 70000 and 80000.

• In this case the transaction must be restarted.

2. Immediate Update :-

• In the immediate update technique, all updates to the database are applied immediately as they occur, without waiting to reach the COMMIT point, and a record of all changes is kept in the transaction log.

• In this technique, when the transaction begins, a record <T,BEGIN> is written to the transaction log, and each update operation is written to the transaction log on disk before it is applied to the database.

• This type of recovery method requires two procedures, namely :
    • Redoing transaction T, REDO(T), and
    • Undoing transaction T, UNDO(T).

• The first procedure redoes the operations in the same way as in deferred update.

• The second one restores the values of all attributes updated by transaction T to their old values.

Time                           Log Entry                 Database Stored Value
Before Start of Transaction                              A = 70000, B = 80000
Time – 1                       <T, BEGIN>
Time – 2                       <T, A, 70000, 90000>      A = 90000
Time – 3                       <T, B, 80000, 60000>      B = 60000
Time – 4                       <T, COMMIT>

Immediate Update Log Entries for Transaction T






• In immediate update, the transaction log file is used in the following way :

    • When a transaction T begins, <T, BEGIN> is written to the log file.

    • When a write operation is performed, a record containing the necessary data is written to the transaction log file.

    • Once the transaction log is written, the update is written to the database buffers.

    • The updates to the database itself are written when the buffers are next flushed to secondary storage.

    • When the transaction T commits, a <T,COMMIT> record is written to the transaction log file.

    • If the transaction log contains the record <T, BEGIN> but does not contain <T,COMMIT>, transaction T is undone. The old values of the affected data items are restored and transaction T is restarted. If the transaction log contains both records, T is redone.

• Now suppose that a database failure occurs under one of the following conditions :

    • Just before the WRITE action “WRITE (B, b1)”.

    • Just after “<T, COMMIT>” is written to the transaction log but before the new values are written to the database.

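Time                           Log Entry                 Database Stored Value
Before Start of Transaction                              A = 70000, B = 80000
Time – 1                       <T, BEGIN>
Time – 2                       <T, A, 70000, 90000>      A = 90000

Transaction T fails before the WRITE action to the database in Immediate Update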

• When the failure occurs just before the execution of the WRITE operation, the system comes back up and finds the record <T, BEGIN> but no corresponding <T, COMMIT>.

• This means that transaction T must be undone. Thus an UNDO(T) operation is executed. This restores the value of A to 70000, and the transaction can be restarted.






Time                           Log Entry                 Database Stored Value
Before Start of Transaction                              A = 70000, B = 80000
Time – 1                       <T, BEGIN>
Time – 2                       <T, A, 70000, 90000>      A = 90000
Time – 3                       <T, B, 80000, 60000>      B = 60000
Time – 4                       <T, COMMIT>

Immediate Update Log Entries for T when failure occurs after COMMIT action

• The table above shows the transaction log when a failure occurs just after <T, COMMIT> is written to the transaction log but before the new values are written to the database.

• When the system comes back up, a scan of the transaction log shows corresponding <T, BEGIN> and <T, COMMIT> records.

• Thus a REDO(T) operation is executed.

• This results in the values of A and B being 90000 and 60000 respectively.
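The two immediate update scenarios can be reproduced with a short recovery routine that uses the before-images for UNDO and the after-images for REDO. This is a sketch only: the tuple layout `(txn, item, old_value, new_value)` mirrors the log entries shown in the tables, while the function name `recover_immediate` and the dictionary used as the database are assumptions.

```python
def recover_immediate(db, log):
    """Recover a database that uses immediate update.

    db  -- dict standing in for the database as found after the crash
    log -- list of (txn, "BEGIN"), (txn, item, old, new) and (txn, "COMMIT")
    """
    committed = {rec[0] for rec in log if rec[1] == "COMMIT"}

    # UNDO: scan backward and restore the old value of every item written
    # by a transaction that has no COMMIT record.
    for rec in reversed(log):
        if len(rec) == 4 and rec[0] not in committed:
            _txn, item, old_value, _new_value = rec
            db[item] = old_value

    # REDO: scan forward and reapply the new value of every item written
    # by a committed transaction.
    for rec in log:
        if len(rec) == 4 and rec[0] in committed:
            _txn, item, _old_value, new_value = rec
            db[item] = new_value
    return db
```

In the first scenario the log holds only <T, BEGIN> and the update of A, so A goes back to 70000. In the second scenario the log also holds the update of B and <T, COMMIT>, so both new values are reapplied.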

3. Shadow Paging :-

• The shadow paging technique does not require the use of a transaction log in a single-user environment.

• However, in a multiuser environment a transaction log may be needed for the concurrency control method.

• In the shadow paging scheme, the database is considered to be made up of logical units of storage called fixed-size disk pages (or blocks).

• The pages are mapped into physical blocks of storage by means of a page table, with one entry for each logical page of the database.

• This entry contains the block number of the physical storage where this page is stored.

• Thus, the shadow paging scheme is one possible form of indirect page allocation.

• The shadow paging scheme is similar to the one used by the operating system for virtual memory management.

• In virtual memory management, the memory is divided into pages that are assumed to be of a certain size.

• The virtual or logical pages are mapped onto physical memory blocks of the same size as the pages.






• The mapping is provided by means of a table known as the page table.

• The page table contains one entry for each logical page of the process’s virtual address space.

• The shadow paging technique maintains two page tables during the life of a transaction that is going to modify the database, namely the current page table and the shadow page table.

• The shadow page table is the original page table, and the transaction addresses the database using the current page table.

• At the start of the transaction the two tables are the same and both point to the same blocks of physical storage.

• The shadow page table is never changed thereafter, and is used to restore the database in the event of a system failure.

• However, the current page table entries may change during the execution of the transaction.






• The current page table is used to record all updates to the database. When the transaction completes, the current page table becomes the shadow page table.

• The pages that are affected by the transaction are copied to new blocks of physical storage, and these blocks, along with the blocks not modified, are accessible to the transaction via the current page table.

• The old versions of the changed pages remain unchanged, and these pages continue to be accessible via the shadow page table.

• The shadow page table contains the entries that existed in the page table before the start of the transaction and points to the blocks that were never changed by the transaction.

• The shadow page table remains unaltered by the transaction and is used for undoing the transaction.

• Advantages :-
    • The overhead of maintaining the transaction log file is eliminated.
    • Since there is no need for UNDO or REDO operations, recovery is significantly faster.

• Disadvantages :-
    • Data fragmentation or scattering.
    • Need for periodic garbage collection to reclaim inaccessible blocks.
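The interplay of the current and shadow page tables can be illustrated with a toy in-memory model. This is a hedged sketch: the class name `ShadowPagedDB`, its methods, and the use of a Python list as "physical blocks" are assumptions made for illustration only.

```python
class ShadowPagedDB:
    """Toy single-user model of shadow paging; names are hypothetical."""

    def __init__(self, pages):
        self.blocks = list(pages)               # physical blocks of storage
        self.shadow = list(range(len(pages)))   # shadow page table: page -> block
        self.current = None                     # current page table (in a transaction)

    def begin(self):
        # At the start, both tables point to the same physical blocks.
        self.current = list(self.shadow)

    def write(self, page_no, value):
        # Copy-on-write: the new version goes to a fresh block and only the
        # current page table is changed; the old block stays untouched.
        self.blocks.append(value)
        self.current[page_no] = len(self.blocks) - 1

    def read(self, page_no):
        table = self.current if self.current is not None else self.shadow
        return self.blocks[table[page_no]]

    def commit(self):
        # The current page table becomes the shadow page table.
        self.shadow = self.current
        self.current = None

    def abort(self):
        # Discard the current page table; the shadow table still points to
        # the original blocks, so no UNDO or REDO is needed.
        self.current = None
```

Blocks written by an aborted transaction, and old blocks replaced by a committed one, stay in `blocks` without any page table entry pointing to them. This is the inaccessible storage that the garbage collection mentioned above would have to reclaim.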

Checkpoints :-

• The point of synchronization between the database and the transaction log file is called a checkpoint.

• The general method of database recovery uses the information in the transaction log file. The main difficulty with this is knowing how far back in the transaction log to search in case of failure.

• In the absence of this exact information, we may end up redoing transactions that have already been safely written to the database. This is very time-consuming and wasteful.

• A better way is to find a point that is sufficiently far back to ensure that any item written before that point has been written correctly and stored safely.

• This method is called checkpointing.

• In checkpointing, all buffers are force-written to secondary storage.

• The checkpoint technique is used to limit :
    • The volume of log information
    • The amount of searching
    • The subsequent processing that needs to be carried out on the transaction log file.






• During the execution of transactions, the DBMS maintains the transaction log and periodically performs checkpoints.

• Checkpoints are scheduled at predetermined intervals and involve the following operations :

    • Writing a start-of-checkpoint record, along with the time and date, to the log on the stable storage device, identifying it as a checkpoint.

    • Writing all transaction log file records in main memory to secondary storage (SS).

    • Writing the modified blocks in the database buffers to SS.

    • Writing a checkpoint record to the transaction log file. This record contains the identifiers of all transactions that are active at the time of the checkpoint.

    • Writing an end-of-checkpoint record and saving the address of the checkpoint record in a file accessible to the recovery routine on start-up after a system crash.

• At the time of a checkpoint, all database modifications that until then are reflected only in the database buffers are propagated to the appropriate storage.

• A checkpoint can be taken at fixed intervals of time.

• In case of failure during the serial operation of transactions, the transaction log file is checked to find the last transaction that started before the last checkpoint.

• Any earlier transactions would have committed previously and would have been written to the database at the checkpoint.

• Therefore it is necessary to redo only :
    • The one that was active at the checkpoint, and
    • Any subsequent transactions for which both start and commit records appear in the transaction log.

• If a transaction is active at the time of failure, the transaction must be undone.

• If transactions are performed concurrently, redo all transactions that have committed since the checkpoint and undo all transactions that were active at the time of failure.






• Only transaction T1 needs no action.

• Transactions T2 and T4 will be redone, and T3 and T5 will be undone.
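The rule for concurrent transactions (redo what committed since the checkpoint, undo what was still active at the failure) can be written as a small classification routine. This is a sketch under assumptions: the record shapes `("BEGIN", txn)`, `("COMMIT", txn)` and `("CHECKPOINT", [active transactions])` and the function name `classify` are invented for illustration, and the sample timeline in the usage note is a hypothetical one, not taken from a figure.

```python
def classify(log):
    """Return (redo, undo) lists of transactions after a failure.

    log -- list of ("BEGIN", txn), ("COMMIT", txn) and
           ("CHECKPOINT", [txns active at the checkpoint]) records,
           ending at the point of failure
    """
    # Position of the last checkpoint record, or -1 if there is none.
    last_cp = max(
        (i for i, rec in enumerate(log) if rec[0] == "CHECKPOINT"),
        default=-1,
    )
    # Transactions known to be active at the last checkpoint.
    active = set(log[last_cp][1]) if last_cp >= 0 else set()

    redo = []
    for kind, txn in log[last_cp + 1:]:
        if kind == "BEGIN":
            active.add(txn)
        elif kind == "COMMIT":
            active.discard(txn)
            redo.append(txn)     # committed since the checkpoint

    # Whatever is still active at the failure must be undone.
    return redo, sorted(active)
```

For a hypothetical log in which T1 commits before the checkpoint, T2 and T3 are active at the checkpoint, T4 and T5 start after it, and only T2 and T4 commit before the failure, the routine returns T2 and T4 for redo and T3 and T5 for undo, while T1 needs no action.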

Buffer Management :-

• The buffers are reserved blocks of main memory.

• DBMS application programs require I/O operations, which are performed by a component of the OS.

• These I/O operations normally use buffers to match the speed of the processor and the relatively fast main memory with the slower secondary storage, and also to minimize the number of I/O operations between main and secondary memory.

• The assignment and management of memory blocks is called buffer management, and the component of the OS that performs this task is called the buffer manager.

• The buffer manager is responsible for the efficient management of the database buffers that are used to transfer pages between the buffers and secondary storage.

• It ensures that as many as possible of the data requests made by programs are satisfied from data already copied from secondary storage into the buffers.

• The buffer manager takes care of reading pages from the disk into the buffers until the buffers become full, and then uses a replacement strategy to decide which buffer to force-write to disk to make space for new pages that need to be read from disk.

• Some of the replacement strategies used by the buffer manager are :
    • First-In-First-Out (FIFO) and
    • Least Recently Used (LRU).

• A computer system uses buffers that are in fact virtual memory buffers. Thus, a mapping is required between a virtual memory buffer and physical memory.

• The physical memory is managed by the memory management component of the OS.

• Under virtual memory management, the buffers containing pages of the database undergoing modification by a transaction could be written out to secondary storage.

• The timing of this premature writing of a buffer is decided by the memory management component of the OS and is independent of the state of the transaction.






• To reduce buffer faults, the LRU algorithm is used for buffer replacement.

• Buffer management effectively provides a temporary copy of a database page.

• Therefore, it is used in database recovery systems in which the modifications are made to this temporary copy and the original page remains unchanged in secondary storage.

• Both the transaction log and the database pages are written to buffer pages in virtual memory.

• The COMMIT transaction operation takes place in two phases, and thus it is called a two-phase commit.

• In the first phase of the COMMIT operation, the transaction log buffers are written out, and in the second phase of the COMMIT operation, the data buffers are written out.

• This does not cause any problem, because the log is always forced during the first phase of COMMIT.
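The LRU replacement strategy mentioned above can be sketched as a tiny buffer pool. This is an illustration only: the class name `LRUBufferPool`, the dictionary standing in for the disk, and the `disk_reads` counter are assumptions and do not describe any real buffer manager.

```python
from collections import OrderedDict


class LRUBufferPool:
    """Toy buffer manager with Least Recently Used replacement."""

    def __init__(self, capacity, disk):
        self.capacity = capacity      # number of buffer frames available
        self.disk = disk              # dict standing in for secondary storage
        self.frames = OrderedDict()   # page_id -> [data, dirty], oldest first
        self.disk_reads = 0

    def _fetch(self, page_id):
        if page_id in self.frames:
            # Request satisfied from the buffer: mark as most recently used.
            self.frames.move_to_end(page_id)
        else:
            if len(self.frames) >= self.capacity:
                # Buffers are full: evict the least recently used page and
                # force-write it to disk if it was modified.
                victim, (data, dirty) = self.frames.popitem(last=False)
                if dirty:
                    self.disk[victim] = data
            self.frames[page_id] = [self.disk[page_id], False]
            self.disk_reads += 1
        return self.frames[page_id]

    def read(self, page_id):
        return self._fetch(page_id)[0]

    def write(self, page_id, data):
        frame = self._fetch(page_id)
        frame[0], frame[1] = data, True   # modify the buffered copy only
```

A modified page reaches the disk in this model only when it is evicted, which shows why the timing of such writes is independent of the state of the transaction and why the log has to be forced first at COMMIT.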
