Haisley - Backup and Recovery Optimization

Copyright Oracle Corporation, 2005. All rights reserved.

Backup and Recovery Optimization

Stephan Haisley, Center of Expertise, Oracle Corporation

Objectives

• Provide a short introduction to Recovery Manager (RMAN)

• Explain and demonstrate factors that influence the speed of:
  – Backups
  – Restorations
  – Recoveries

• Give you some ideas of what to look at to make backup and recovery faster

What is RMAN?

• Introduced in 8.0

• Allows the DBA to manage backup and recovery operations with ease

• Built into the RDBMS kernel, so it can take advantage of kernel features (e.g. block checking)

• Can back up datafiles, the controlfile, archivelogs and the SPFILE

• Offers image copies or backupsets
  – Image copy: byte-for-byte copy
  – Backupset: files multiplexed together into a proprietary format

Incremental Backups

[Figure: two panels, Differential and Cumulative, each showing the incremental backup level (0 or 1) taken on each day of the week, Sun to Sat]
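To make the difference between the two strategies concrete, here is a minimal Python sketch. It assumes a hypothetical weekly schedule with a level 0 backup on Sunday and a level 1 backup on every other day; the schedule and the function name are illustrative and not taken from the slides. It relies on the usual definitions: a differential level 1 contains the blocks changed since the most recent level 1 or level 0 backup, while a cumulative level 1 contains the blocks changed since the most recent level 0.

```python
# Hypothetical schedule (an assumption): level 0 on Sunday, level 1 on all other days.
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def backups_needed(last_backup_day, cumulative):
    """List the backups to restore, in order, to reach the backup taken on
    last_backup_day under the assumed weekly schedule."""
    idx = DAYS.index(last_backup_day)
    chain = ["Sun (level 0)"]
    if idx == 0:
        return chain
    if cumulative:
        # One cumulative level 1 holds every block changed since the level 0.
        chain.append(f"{DAYS[idx]} (level 1 cumulative)")
    else:
        # Each differential level 1 only holds changes since the previous
        # backup, so every level 1 taken since Sunday is required.
        chain.extend(f"{day} (level 1 differential)" for day in DAYS[1:idx + 1])
    return chain


print(backups_needed("Wed", cumulative=False))
print(backups_needed("Wed", cumulative=True))
```

Under this assumed schedule, restoring to Wednesday's backup needs four backups with differentials and two with cumulatives.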

Differential vs Cumulative

• Backup speeds:

Backup#  Type  Level  #Blocks  Time (secs)  CPU (secs)
1        Base  0      778112   626          227.20
2        Diff  1      42375    312          82.93
3        Diff  1      42370    312          82.65
4        Diff  1      42369    312          82.45
5        Base  0      778112   628          226.09
6        Cumu  1      42371    314          80.61
7        Cumu  1      49605    315          83.70
8        Cumu  1      60176    321          85.33

Differential vs Cumulative

• Restore speeds:

Type          #Backup sets restored  Time (secs)  CPU (secs)
Base level 0  1                      626.67       210.85
Differential  3                      98.67        23.00
Base level 0  1                      629.33       209.21
Cumulative    1                      43.00        11.05

• Extra time on backup can save significant time on recovery!
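The trade-off can be checked with the figures from the two tables. This is only a worked sum in Python. It assumes that backups 2 to 4 and 6 to 8 in the backup table are the same level 1 backups that were restored in the restore table; the variable names are mine.

```python
# Level 1 backup times in seconds, copied from the backup-speed table.
diff_backup_secs = [312, 312, 312]   # backups 2-4 (differential)
cumu_backup_secs = [314, 315, 321]   # backups 6-8 (cumulative)

# Level 1 restore times in seconds, copied from the restore-speed table.
diff_restore_secs = 98.67            # three differential backup sets
cumu_restore_secs = 43.00            # one cumulative backup set

# Extra time spent taking cumulative backups instead of differentials.
extra_backup_secs = sum(cumu_backup_secs) - sum(diff_backup_secs)

# Time saved when restoring the level 1 changes.
restore_saving_secs = round(diff_restore_secs - cumu_restore_secs, 2)

print(f"Extra backup time for cumulative: {extra_backup_secs} s")
print(f"Restore time saved by cumulative: {restore_saving_secs} s")
```

On these figures the cumulative strategy costs 14 seconds more across three backups and saves about 56 seconds on the level 1 restore.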

Backup and Restore Performance

• Backup and restore times can be influenced by:
  – Channel configuration
  – Size of memory buffers (read and write)
  – Speed of backup devices
  – Amount of data being backed up
  – Amount of block checking features enabled
  – Use of compression

Channel Configuration

• Match the number of channels to the number of backup devices
  – Manually allocate channels
  – Use automatic channel parallelism

• Avoid Media Management Layer (MML) multiplexing of backup sets
  – Increases restore times

• Leave some devices available for emergency restorations, so that they do not upset the other backup schedules

Channel Configuration

• Reducing filesperset can decrease the time taken by single-file restores:

Filesperset  Backup set size (blks)  Restored file (blks)  Time (secs)  CPU (secs)
8            702320                  97727                 132          39.42
4            658221                  97727                 110          36.92
2            132773                  97727                 82           29.92
1            97730                   97727                 74           25.62
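One way to read the table is to ask how much of each backup set actually belongs to the file being restored. The Python sketch below does that sum with the table's own figures; the helper names are illustrative.

```python
# (filesperset, backup set size in blocks, restored file size in blocks, restore secs)
ROWS = [
    (8, 702320, 97727, 132),
    (4, 658221, 97727, 110),
    (2, 132773, 97727, 82),
    (1, 97730, 97727, 74),
]


def wanted_fraction(backup_set_blocks, file_blocks):
    """Share of the backup set that belongs to the file being restored."""
    return file_blocks / backup_set_blocks


def time_saved_vs(baseline_secs, secs):
    """Fractional reduction in restore time relative to a baseline."""
    return (baseline_secs - secs) / baseline_secs


baseline = ROWS[0][3]  # restore time with filesperset = 8
for fps, bs_blocks, file_blocks, secs in ROWS:
    print(
        f"filesperset={fps}: "
        f"{wanted_fraction(bs_blocks, file_blocks):.1%} of the set is wanted, "
        f"{time_saved_vs(baseline, secs):.1%} less time than filesperset=8"
    )
```

With filesperset 8 only about 14% of the backup set belongs to the wanted file; with filesperset 1 it is nearly all of it, and the restore takes about 44% less time.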

Read and Write Memory Buffers

[Figure: data flows from the datafiles through the input buffers (4 per datafile) into the output buffers (4 per channel) and on to the backup device]

Size of Read Buffers

• Allocated according to the MAXOPENFILES channel parameter:

MAXOPENFILES           Buffer size
MAXOPENFILES ≤ 4       Each buffer = 1 MB; total buffer size for the channel is up to 16 MB
4 < MAXOPENFILES ≤ 8   Each buffer = 512 KB; total buffer size for the channel is up to 16 MB. The number of buffers per file depends on the number of files
MAXOPENFILES > 8       Each buffer = 128 KB, 4 buffers per file, so each file has a 512 KB buffer

• Let’s see how that looks in real life…

Size of Read Buffers

• Read buffer allocation for backups:

MAXOPENFILES  Buffer size (KB)  #Buffers per file  Total buffer size (MB)
2             1024              8                  16
4             512               8                  16
8             512               4                  16
10            128               4                  5

• Default values seem adequate, and also limit the amount of memory used for input buffers
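The totals in the table follow from multiplying the three other columns together. The short Python check below assumes that the channel has exactly MAXOPENFILES files open at once; the function name is illustrative.

```python
def total_read_buffer_mb(open_files, buffer_kb, buffers_per_file):
    """Total input buffer memory in MB for one channel, assuming
    open_files datafiles are being read at the same time."""
    return open_files * buffers_per_file * buffer_kb / 1024


# (MAXOPENFILES, buffer size in KB, buffers per file), copied from the table.
OBSERVED = [(2, 1024, 8), (4, 512, 8), (8, 512, 4), (10, 128, 4)]

for maxopenfiles, buffer_kb, buffers_per_file in OBSERVED:
    total = total_read_buffer_mb(maxopenfiles, buffer_kb, buffers_per_file)
    print(f"MAXOPENFILES={maxopenfiles}: {total:g} MB")
```

The first three rows all come to 16 MB, and the last row drops to 5 MB because the buffers shrink to 128 KB.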

Size of Write Buffers

• Allocates four buffers per channel
  – Disk = 1 MB per buffer
  – SBT = 256 KB per buffer

• SBT is smaller due to slower speed of tape devices

• Can see increased performance when increasing size of tape buffers…

Total buffer size (KB)  I/O count  I/O time (secs)
128                     60564      617.4
1024 (default)          7571       595.9
2048                    3786       505.3
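The I/O counts in the table are consistent with each I/O writing one of the four buffers. That is my inference from the numbers, not something the slide states. A small Python check, with illustrative names:

```python
BUFFERS_PER_CHANNEL = 4  # from the slide: four output buffers per channel

# (total buffer size in KB, I/O count, I/O time in secs), copied from the table.
RUNS = [(128, 60564, 617.4), (1024, 7571, 595.9), (2048, 3786, 505.3)]


def data_written_kb(total_buffer_kb, io_count):
    """KB written if every I/O writes exactly one buffer (an assumption)."""
    per_buffer_kb = total_buffer_kb // BUFFERS_PER_CHANNEL
    return per_buffer_kb * io_count


for total_kb, io_count, io_secs in RUNS:
    print(f"{total_kb} KB total: {data_written_kb(total_kb, io_count)} KB written in {io_secs} s")

# Relative I/O time saved by doubling the default tape buffer size.
default_secs, doubled_secs = RUNS[1][2], RUNS[2][2]
saving = round((default_secs - doubled_secs) / default_secs, 3)
print(f"I/O time saved by 2048 KB over the default: {saving:.1%}")
```

All three runs come out at roughly the same volume of data, which supports the one-buffer-per-I/O reading, and doubling the default buffer size cut I/O time by about 15% in this test.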

Where is buffer memory allocated from?

• PGA if not using I/O slaves (use async I/O)
  – tape_asynch_io
  – disk_asynch_io

• Shared Pool if using I/O slaves (use if OS does not support async I/O)
  – backup_tape_io_slaves
  – dbwr_io_slaves

• Large Pool if size > 0 and using I/O slaves
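The three bullets amount to a small decision rule. The Python sketch below simply restates it; the function and argument names are hypothetical and do not correspond to any Oracle interface.

```python
def rman_buffer_source(using_io_slaves, large_pool_size):
    """Return where RMAN buffer memory comes from, per the rules on this slide."""
    if not using_io_slaves:
        return "PGA"
    if large_pool_size > 0:
        return "Large Pool"
    return "Shared Pool"


print(rman_buffer_source(using_io_slaves=False, large_pool_size=0))
print(rman_buffer_source(using_io_slaves=True, large_pool_size=0))
print(rman_buffer_source(using_io_slaves=True, large_pool_size=64 * 1024 * 1024))
```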

Speed of Backup Devices

• Maximum speed of backup: min(disk read MB/s, tape write MB/s)

• Monitor v$backup_async_io and v$backup_sync_io for effective_bytes_per_second where the type is OUTPUT or INPUT
  – If the transfer rate is slower than the device is capable of, look at OS-level data, CPU statistics, MML settings (compression?) and device settings (block size)

• Can slow down the backup to reduce the load on the I/O system:
  RMAN> configure channel device type sbt rate=1M;
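As a rough illustration of the min() rule, the Python sketch below compares an observed effective_bytes_per_second value with the theoretical ceiling. The device speeds and the observed rate are made-up figures, and the function names are illustrative.

```python
MB = 1024 * 1024


def max_backup_rate_mb_s(disk_read_mb_s, tape_write_mb_s):
    """The backup cannot run faster than the slower of the two devices."""
    return min(disk_read_mb_s, tape_write_mb_s)


def utilisation(effective_bytes_per_second, disk_read_mb_s, tape_write_mb_s):
    """Fraction of the theoretical ceiling that the backup actually achieved."""
    ceiling_bytes = max_backup_rate_mb_s(disk_read_mb_s, tape_write_mb_s) * MB
    return effective_bytes_per_second / ceiling_bytes


# Hypothetical figures: disk reads at 120 MB/s, tape writes at 40 MB/s,
# and the view reports an effective rate of 20 MB/s.
print(max_backup_rate_mb_s(120, 40))
print(utilisation(20 * MB, 120, 40))
```

A utilisation well below 1.0, as in this made-up case, is the signal to start looking at the OS, CPU, MML and device settings listed above.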

Amount of data being backed up

• Put static data into read-only tablespaces and back it up one time only
  – Make sure the backup is not purged from the MML catalog

• Use differential incrementals and monitor v$backup_datafile to identify files that do not change frequently
  – Reduce their backup frequency

• Avoid using datafiles with large amounts of free space
  – The whole datafile is scanned for a backup

Block Change Tracking

• Fast incremental backups introduced in 10g

• Uses a change tracking file to store bitmaps representing ranges of blocks in datafiles

• Size of tracking file ~1/30,000 the size of the database

• Overhead on database performance ~3% (in my TPC-C tests)

• Performance gain for backups makes this bearable:

Fast incrementals?  #Blocks in DB  #Blocks read  #Blocks in backup  Time (secs)
No                  404160         404160        36567              156
Yes                 404160         72832         37215              35
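The figures on this slide can be turned into a few quick ratios. The Python sketch below uses the table values as given and a hypothetical 300 GB database for the tracking file estimate; the names are illustrative.

```python
def tracking_file_bytes(database_bytes):
    """Estimate the tracking file size using the ~1/30,000 rule of thumb."""
    return database_bytes / 30_000


# Figures copied from the table.
blocks_in_db = 404160
blocks_read_with_tracking = 72832
secs_without_tracking = 156
secs_with_tracking = 35

read_fraction = blocks_read_with_tracking / blocks_in_db
speedup = secs_without_tracking / secs_with_tracking

print(f"Blocks read with tracking: {read_fraction:.0%} of the database")
print(f"Incremental backup speed-up: {speedup:.2f}x")

# Hypothetical 300 GB database (an assumption, not from the slides).
print(f"Tracking file estimate: {tracking_file_bytes(300 * 10**9) / 10**6:.0f} MB")
```

In this test the incremental backup read about 18% of the blocks and ran roughly 4.5 times faster with change tracking enabled.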

Amount of Block Checking Features Enabled

• Each type of block checking will increase time and CPU usage for backup and restoration:
  – Head and tail sanity check
    – Makes sure key structures in the head match the tail
  – Block checksums
    – Calculated and compared with the existing checksum
  – Logical structure checks
    – Checks various block structures for consistency

• Tests showed the time for a database backup increased by ~1% and CPU usage by ~8%
  – BUT the extra checks confirm the database is good on backup and then again on restore

Backup Compression

• Backupset compression introduced in 10g

• Can reduce the size of a backupset by 80-90%
  – Saves space on backup media
  – Reduces the amount of network traffic if the backup device is not local

• Increases CPU and time (as expected) for backup and restore

• Do NOT use along with MML compression
  – Time both types of compression and use the most suitable

Recovery Performance

• Recovery times can be influenced by:
  – Number of archivelogs/incrementals being applied
  – Number of datafiles needing recovery
  – If archivelogs available on disk
  – If using parallel recovery
  – General database performance

Number of archivelogs/incrementals being applied

• RMAN will choose to use incrementals over archivelogs
  – My tests showed restoring the incremental was ~17 times quicker than applying 20 archivelogs
  – Mileage will vary depending on backup/restore speeds, as previously discussed

• A previous slide showed cumulative incrementals restoring faster than differentials

• The higher the number of logfiles/incrementals required, the slower the recovery

Number of datafiles needing recovery

• Each data block that needs recovery must first be read into the buffer cache, and is then written back to disk by DBWR after redo has been applied to it

• Reducing the number of files that are recovered reduces the overall work in the database and so speeds up recovery
  – Only restore and recover the files that NEED recovering

• If the recovery is due to corruption, consider Block Media Recovery…

Block Media Recovery (BMR)

• RMAN will restore and apply recovery to the specified blocks only, leaving the rest of the datafile intact for normal use

• Significant reduction in recovery time compared with recovering the whole datafile:

#Corrupt blocks  Datafile recovery time (secs)  BMR time (secs)
10               941                            145
99               925                            155
991              937                            219
5000             922                            616
10000            938                            1156

• Can be too much of a good thing!
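The table suggests that BMR stops paying off somewhere between 5,000 and 10,000 corrupt blocks. The Python sketch below estimates that break-even point by linear interpolation between the measured points. Linear interpolation is my assumption; the slides do not say how BMR time grows between measurements.

```python
# (corrupt blocks, BMR time in secs), copied from the table.
BMR_POINTS = [(10, 145), (99, 155), (991, 219), (5000, 616), (10000, 1156)]

# Whole-datafile recovery times in secs from the same table.
DATAFILE_SECS = [941, 925, 937, 922, 938]


def break_even_blocks(points, target_secs):
    """Linearly interpolate the block count at which BMR takes target_secs.
    Returns None if the target lies outside the measured range."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= target_secs <= y1:
            return x0 + (target_secs - y0) * (x1 - x0) / (y1 - y0)
    return None


average_datafile_secs = sum(DATAFILE_SECS) / len(DATAFILE_SECS)
estimate = break_even_blocks(BMR_POINTS, average_datafile_secs)
print(f"Average datafile recovery time: {average_datafile_secs:.1f} s")
print(f"Estimated break-even: about {estimate:.0f} corrupt blocks")
```

On these figures, and under the linear assumption, BMR and whole-datafile recovery take about the same time at roughly 7,900 corrupt blocks in this test datafile.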

Archivelogs available on disk?

• Avoid the RMAN restore times for archivelogs by keeping n days' worth on disk
  – Depends on incremental strategy
  – Depends on available disk space

• Back up the most recent archivelogs to disk and then to tape at a later time
  – Take a backup of a backup (from 9i onwards)

Parallel Recovery

• By default Oracle will use a single process to carry out recovery, unless parallel_automatic_tuning is in use
  – Oracle will then decide whether parallel recovery is best and how many slave processes to use

• A single coordinator process reads the archivelogs

• Reading of data blocks and applying of redo is split up amongst the slave processes, each working on a range of blocks

• Will increase CPU usage and the need for DBWR to perform well

• Watch for waits on ‘PX Deq’ events
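The division of labour described above can be pictured with a toy model in Python. It is only a sketch of the idea on this slide, not Oracle's implementation: it assumes equal-sized contiguous block ranges, and all names are hypothetical.

```python
def worker_for_block(block_no, total_blocks, workers):
    """Map a block number to the worker that owns its range
    (equal-sized contiguous ranges are an assumption of this sketch)."""
    range_size = -(-total_blocks // workers)  # ceiling division
    return block_no // range_size


def distribute(redo_records, total_blocks, workers):
    """Coordinator role: read redo records in order and queue each one
    for the worker responsible for that block."""
    queues = [[] for _ in range(workers)]
    for block_no, change in redo_records:
        queues[worker_for_block(block_no, total_blocks, workers)].append((block_no, change))
    return queues


# Toy redo stream: (block number, change label).
redo = [(0, "a"), (5, "b"), (9, "c"), (5, "d")]
for worker_id, queue in enumerate(distribute(redo, total_blocks=10, workers=2)):
    print(worker_id, queue)
```

Because every change for a given block lands in the same queue in its original order, each worker in this model can apply redo to its own blocks independently of the others.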

General Database Performance

• Recovery happens within the database, so a badly performing database will not help with recovery times

• Areas to look at for improvement:
  – I/O: read and write intensive
  – DBWR performance: look for ‘free buffer waits’; use async I/O or DBWR slaves
  – CPU: make sure it doesn’t become starved during recovery; parallelism won’t help you!

Helpful views

• v$session_longops shows currently running backup, restore and recovery operations with RMAN

• v$backup_async_io and v$backup_sync_io show RMAN performance information

• v$session_wait shows session wait information

• v$backup_set, v$backup_piece, v$backup_datafile etc. show sizing information for backups

Summary

• Explained factors that influence speed of:
  – Backups
  – Restorations
  – Recoveries

• Gave you something to think about when looking at backup, restore and recovery time windows

• Make sure you test any alterations with production volume FIRST!
