1
Have you ever tried to recover from a disaster involving SQL Server and just not known where to start or how to proceed? Which backups do you use? What if you don’t have any backups? Using all his experience (within and outside of Microsoft) helping customers recover using SQL Server, Paul S. Randal of SQLskills.com has produced this disaster recovery poster for SQL Server Magazine. The poster shows you the steps you need to follow and the decisions you need to make to ensure your disaster recovery proceeds smoothly and successfully. You’ll see how to work out which path to follow no matter what your SQL Server environment, using SQL Server 2005 onward. Apart from showing you what you should do, the poster also helps you avoid doing things you should not do, like trying to recover a SUSPECT database through detach/attach. The information provided on this poster should help you save downtime, save data, and potentially save your job! Get shockingly fast, reliable, disaster-proof SQL Server backup and recovery from Idera. Start point End point KEY Decision point Next step Application Problem Can connect to database? Can connect to SQL Server? Redundant server available? Can connect to Windows? No No No Bring Windows Server online No Failover to redundant server Redirect clients to new server Application Online Succeeds Yes Fails Yes Yes Check backup jobs. Check replication topologies. Perform disaster recovery on the original server. Perform root cause analysis and take preventative measures. Always best to have a DR plan to follow. Failover to redundant server or restore from backups may be considered at any point as best plan. May involve a bare-metal install of Windows on a new server. Could also be network connectivity issues, power failure, boot drive failure May choose to restore database to another server, which may involve copying over logins, SSIS packages, Agent jobs, SPs, reconfiguring replication/log shipping, etc There may be multiple databases in the application ecosystem Does SQL Server start? Is master damaged? Is resource db damaged? Look in SQL Server error log for error messages Did tempdb creation fail? No No No Resolve error based on error log messages No Resolve file system issue or change tempdb location (21) Copy in correct version of resource database (20) Rebuild master database (18) Try to restart SQL Server Yes Yes Yes The system databases may be damaged by a broken I/O subsystem, in which case SQL Server may need to be moved to a different I/O subsystem Can connect to database? Yes Yes Is database attached? Do all database files exist? No Yes No Choose Extract or repair Succeeds Missing log files only? Yes Succeeds Succeeds No Fails Succeeds Fails No Application is online Succeeds Fails Extract to new database Emergency mode repair Update resume . Leave town :-) With either extraction or repair in emergency mode, there is no guarantee of transactional consistency as the log could not be properly recovered. You may need to run DBCC CHECKCONSTRAINTS, check business logic, reinitialize replication. See note (30). Yes Is problem data deleted by user? Does database snapshot exist? Application online Extract data from snapshot, or revert to snapshot (22) Log shipping secondary with enough load delay to allow data recovery? Extract data from log shipping secondary (23) Is 3 rd -party backup software that supports single-table restore available? Yes Yes Yes Is part of database missing? Restoring entire database? Restore deleted table, or deleted rows from restored copy of table (25) Yes Point-in-time restore as close to data deletion as possible and extract as much data as possible (26) No No No No Did CHECKDB complete? i Nonclustered ndex corruption only? Non-repairable errors? Yes DBCC CHECKDB (dbname) WITH ALL_ERRORMSGS, NO_INFOMSGS (24) 8992 or 2570 errors only? Zero data-loss SLA? Choose repair or restore Back up repaired database. Potentially restore most recent backups and extract as much data as possible (30) No No Choose manual repair or restore Offline rebuild affected indexes (28) May need to drop/create the index to force a scan of the table on SQL Server 2008 onwards. Do not use drop/ create first, as the index may be enforcing a constraint. Update resume. Leave town :-) Yes No Are valid backups available? Yes Repair No Yes Update resume. Leave town :-) Yes Restore Yes Repair Application is online Yes Succeeds Restore Are valid backups available? Application is online Fails Succeeds No Performing partial restore? Restore primary filegroup from most recent full file, filegroup, or database backup using WITH PARTIAL (04) Restored desired secondary filegroups from most recent full file and/or filegroup and/or database backups (05) Restore all necessary transaction log backups to bring all in-restore portions of the database to the same desired point in time (09) Restore page(s) and/or file(s) and/or filegroup(s) from most recent full file, and/or filegroup and/or database backups (08) Application is online No Yes No Restore most recent full backups of all portions of the database, starting with primary filegroup (03) Restore most recent differential file and/or filegroup and/or database backups (06) Yes Perform regular tail-of- the-log backup if possible and required (01) Back up database prior to repair, just in case something goes wrong Back up database prior to repair, just in case something goes wrong No Back up repaired database to be starting point for any further disaster Back up repaired database to be starting point for any further disaster This is not a situation you want to be in as these corruptions will persist in the database for ever, causing DBCC CHECKDB to fail. Next steps could be recreating the database from scratch or extracting data into a new database. At any point a corrupt backup may be encountered. If no alternative exists, the restore can be forced using WITH CONTINUE_AFTER_ERROR from SQL Server 2005 onwards, but this ends the restore sequence. Think carefully before forcing a corrupt log backup to restore as database corruption will likely be the result. If doing a point-in-time restore, it is safest to use WITH STOPAT on each restore command to ensure you do not accidentally go beyond the desired point. This would mean restarting the restore operation from the beginning. If the restore sequence involves more than one restore operation, you must use WITH NORECOVERY on each restore command to allow the restore sequence to continue. It is safest to use WITH NORECOVERY on all restore commands, and then complete the restore sequence manually using RESTORE DATABASE MyDbName WITH RECOVERY. This is not a situation you want to be in. Next steps could be recreating the database from scratch or extracting data into a new database. You may need to run DBCC CHECKCONSTRAINTS, reinitialize replication subscribers and check integrity of data relationships after any DBCC repair operation. See note (30). Possibly a data recovery service could help, but this is usually complete data loss. If a partial restore was performed, additional filegroups can be restored as required with the already restored portions of the database remaining online. When restoring the entire database in Enterprise Edition, consider whether to restore filegroups in a prioritized manner using a partial restore and allowing the database to come online earlier with partial database availability, followed by online piecemeal restores of the remaining filegroups later. Try to perform a hack attach (14) Try to perform attach with log rebuild (13) Try to perform regular attach (12) No (i.e. missing data files) Try to perform data extraction (17) Try to perform EMERGENCY mode repair (16) Try to repair 8992/ 2570 errors (31) Is log explorer tool available? Use tool to reverse user error that deleted data (27) Yes Are regular backups available? No Yes Set damaged portion of database offline to allow online restore (07) Yes Try to repair database using DBCC CHECKDB (29) Switch to EMERGENCY mode if possible (15) Are backups available? Fails Yes If backups are available at this point, at any time jump to the restore sequence Fails Succeeds Yes Try to set database ONLINE (11) Perform a hack- attach tail-of-the- log backup if only log files exist (02) Whether it works or not Restore tail-of-the- log backup if required (10) Fails No No No Update resume. Leave town :-) This is not a situation you want to be in. At this point there is no way to recover the deleted data. Bring the database online (32) Perform regular tail-of-the-log backup (01) Restore master database (if backup exists) and/or reattach all necessary databases, recreate user/logins (19) Back up repaired database to be starting point for any further disaster Succeeds Fails Fails Not required for failover clustering, database mirroring with transparent client redirection, or an Availability Group with applications configured to use the Listener virtual name Many steps on this poster require more in-depth explanation. Look for a two-digit code of the form (XX), which means go to www.SQLskills.com/DRPoster.asp and find that code for more information and differences between SQL Server versions.

Get shockingly fast, reliable, disaster-proof SQL Server ...t-sql.ru/file.axd?file=2013/2/recovery_step_by_step.pdf · Updateresume. Leavetown:-) Witheitherextractionorrepairin emergencymode,thereisno

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Get shockingly fast, reliable, disaster-proof SQL Server ...t-sql.ru/file.axd?file=2013/2/recovery_step_by_step.pdf · Updateresume. Leavetown:-) Witheitherextractionorrepairin emergencymode,thereisno

Have you ever tried to recover from a disaster

involving SQL Server and just not known

where to start or how to proceed?

Which backups do you use? What if you don’t have

any backups? Using all his experience (within and outside of Microsoft) helping

customers recover using SQL Server, Paul S. Randal of SQLskills.com has produced

this disaster recovery poster for SQL Server Magazine. The poster shows you the

steps you need to follow and the decisions you need to make to ensure your disaster

recovery proceeds smoothly and successfully. You’ll see how to work out which path

to follow no matter what your SQL Server environment, using SQL Server 2005

onward. Apart from showing you what you should do, the poster also helps you avoid doing things

you should not do, like trying to recover a SUSPECT database through detach/attach. The information

provided on this poster should help you save downtime, save data, and potentially save your job!

Get shockingly fast, reliable, disaster-proof SQL Server backup and recovery from Idera.

Start point

End point

KEY

Decision point

Next step

ApplicationProblem

Canconnect todatabase?

Canconnect to SQL

Server?

Redundantserver available?

Canconnect toWindows?

No

No

No

Bring WindowsServer online

No

Failover toredundant server

Redirect clients tonew server

ApplicationOnline

Succeeds

Yes

Fails

Yes

Yes

Check backup jobs.Check replication topologies.Perform disaster recovery on

the original server.Perform root cause analysis

and take preventativemeasures.

Always best to have a DRplan to follow.

Failover to redundant serveror restore from backups maybe considered at any point

as best plan.

May involve a bare-metal install ofWindows on a new server.

Could also be network connectivity issues,power failure, boot drive failure

May choose to restore database to anotherserver, which may involve copying overlogins, SSIS packages, Agent jobs, SPs,reconfiguring replication/log shipping, etc

There may be multipledatabases in the application

ecosystem

Does SQLServer start?

Is masterdamaged?

Is resourcedb damaged?

Look in SQLServer error log for

error messages

Did tempdbcreation fail?

No

No

No

Resolve errorbased on error log

messages

No

Resolve filesystem issue orchange tempdb

location (21)

Copy in correctversion of

resource database(20)

Rebuild masterdatabase (18)

Try to restart SQLServer

Yes

Yes

Yes

The system databases maybe damaged by a broken I/O

subsystem, in which caseSQL Server may need to be

moved to a different I/Osubsystem

Canconnect todatabase?

Yes

Yes

Is databaseattached? Do all database

files exist?No

Yes

No

ChooseExtract or repair

Succeeds

Missing logfiles only?

Yes

Succeeds

Succeeds

No

Fails

Succeeds

Fails

No

Application isonlineSucceeds

Fails

Extract to new database

Emergency mode repair

Update resume.Leave town :-)

With either extraction or repair inemergency mode, there is no

guarantee of transactionalconsistency as the log could not be

properly recovered.You may need to run DBCC

CHECKCONSTRAINTS, checkbusiness logic, reinitializereplication. See note (30).

Yes

Is problem datadeleted by user?

Does databasesnapshot exist?

Application online

Extract data fromsnapshot, or revertto snapshot (22)

Log shippingsecondary with enoughload delay to allow data

recovery?

Extract data fromlog shipping

secondary (23)

Is3rd-party backup

software that supportssingle-table restore

available?

Yes

Yes Yes

Is part ofdatabasemissing?

Restoring entiredatabase?

Restore deletedtable, or deleted

rows from restoredcopy of table (25)

Yes

Point-in-time restore asclose to data deletion

as possible and extractas much data as

possible (26)

NoNo No

No

DidCHECKDBcomplete?

iNonclustered

ndex corruptiononly?

Non-repairableerrors?

Yes

DBCC CHECKDB(dbname) WITH

ALL_ERRORMSGS,NO_INFOMSGS

(24)

8992 or 2570errors only?

Zero data-lossSLA?

Chooserepair or restore

Back up repaireddatabase. Potentiallyrestore most recent

backups and extract asmuch data as possible

(30)

No

No

Choosemanual repair or

restore

Offline rebuildaffected indexes

(28)

May need to drop/create theindex to force a scan of thetable on SQL Server 2008onwards. Do not use drop/

create first, as the index maybe enforcing a constraint.

Update resume.Leave town :-)

Yes

No

Arevalid backups

available?

Yes

Repair

No

Yes

Update resume.Leave town :-)

Yes

Restore

Yes Repair

Application isonline

Yes

Succeeds

Restore

Arevalid backups

available?Application is

online

Fails

Succeeds

No

Performingpartial restore?

Restore primaryfilegroup from most

recent full file,filegroup, or databasebackup using WITH

PARTIAL (04)

Restored desiredsecondary filegroupsfrom most recent fullfile and/or filegroup

and/or databasebackups (05)

Restore all necessarytransaction log backups

to bring all in-restoreportions of the databaseto the same desired point

in time (09)

Restore page(s) and/orfile(s) and/or filegroup(s)from most recent full file,and/or filegroup and/ordatabase backups (08)

Application isonline

No

Yes

No

Restore most recent fullbackups of all portions of

the database, startingwith primary filegroup (03)

Restore most recentdifferential file and/or

filegroup and/ordatabase backups (06)

Yes

Perform regular tail-of-the-log backup if

possible and required(01)

Back up databaseprior to repair, justin case something

goes wrong

Back up databaseprior to repair, justin case something

goes wrong

No

Back up repaireddatabase to bestarting point for

any furtherdisaster

Back up repaireddatabase to bestarting point for

any furtherdisaster

This is not a situation youwant to be in as these

corruptions will persist in thedatabase for ever, causingDBCC CHECKDB to fail.

Next steps could berecreating the database from

scratch or extracting datainto a new database.

At any point a corrupt backup may beencountered. If no alternative exists, the

restore can be forced using WITHCONTINUE_AFTER_ERROR from SQLServer 2005 onwards, but this ends therestore sequence. Think carefully beforeforcing a corrupt log backup to restore as

database corruption will likely be the result.

If doing a point-in-time restore, it is safestto use WITH STOPAT on each restore

command to ensure you do not accidentallygo beyond the desired point. This would

mean restarting the restore operation fromthe beginning.

If the restore sequence involves more thanone restore operation, you must use WITHNORECOVERY on each restore commandto allow the restore sequence to continue. Itis safest to use WITH NORECOVERY onall restore commands, and then complete

the restore sequence manually usingRESTORE DATABASE MyDbName WITH

RECOVERY.

This is not a situation youwant to be in. Next stepscould be recreating the

database from scratch orextracting data into a new

database.

You may need to run DBCCCHECKCONSTRAINTS,

reinitialize replicationsubscribers and check

integrity of data relationshipsafter any DBCC repair

operation. See note (30).

Possibly a data recoveryservice could help, but this isusually complete data loss.

If a partial restore wasperformed, additional

filegroups can be restored asrequired with the alreadyrestored portions of the

database remaining online.

When restoring the entire database inEnterprise Edition, consider whether to

restore filegroups in a prioritized mannerusing a partial restore and allowing the

database to come online earlier with partialdatabase availability, followed by online

piecemeal restores of the remainingfilegroups later.

Try to perform ahack attach (14)

Try to performattach with log

rebuild (13)Try to perform

regular attach (12)

No (i.e. missingdata files)

Try to performdata extraction

(17)

Try to performEMERGENCY

mode repair (16)

Try to repair 8992/2570 errors (31)

Islog explorer tool

available?

Use tool to reverseuser error that

deleted data (27)

Yes

Areregular backups

available?No

Yes

Set damagedportion of

database offline toallow onlinerestore (07)

Yes

Try to repairdatabase using

DBCC CHECKDB(29)

Switch toEMERGENCY

mode if possible(15)

Are backupsavailable?

Fails

Yes

If backups are available at thispoint, at any time jump to the

restore sequence

Fails

Succeeds

Yes

Try to setdatabase ONLINE

(11)

Perform a hack-attach tail-of-the-log backup if onlylog files exist (02)

Whether itworks or not

Restore tail-of-the-log backup ifrequired (10)

Fails

No

No

No

Update resume.Leave town :-)

This is not a situation youwant to be in. At this pointthere is no way to recover

the deleted data.

Bring the databaseonline (32)

Perform regulartail-of-the-logbackup (01)

Restore masterdatabase (if backup

exists) and/or reattachall necessary

databases, recreateuser/logins (19)

Back up repaireddatabase to bestarting point for

any furtherdisaster

Succeeds

Fails Fails

Not required for failover clustering, database

mirroring with transparent client redirection, or an Availability Group with

applications configured to use the Listener virtual

name

Many steps on this poster require more in-depth explanation.

Look for a two-digit code of the form (XX), which means go to

www.SQLskills.com/DRPoster.asp and find that code for more information and differences

between SQL Server versions.