© 2008 Kroll Ontrack Inc. | www.krollontrack.com Recovering Your Virtual Data April 29, 2009 David Logue Sr. Data Recovery Engineer

Kroll Ontrack Recovering Your Virtual Data

DESCRIPTION

Discussion of Data Recovery for VMware systems and other virtual environments

Page 1: Kroll Ontrack Recovering Your Virtual Data

© 2008 Kroll Ontrack Inc. | www.krollontrack.com

Recovering Your Virtual Data

April 29, 2009

David Logue, Sr. Data Recovery Engineer

Page 2: Kroll Ontrack Recovering Your Virtual Data

Learning Objectives

Identify common data loss scenarios in virtual environments

Challenges with recovering virtual data

Recommendations when virtual data loss occurs

Design recommendations for a virtual environment with data loss prevention in mind

Page 3: Kroll Ontrack Recovering Your Virtual Data

BSOD or PSOD: Do They Give You Chills?

Page 4: Kroll Ontrack Recovering Your Virtual Data

Common Data Loss Scenarios

Hardware failures: RAID, disk

Software failures: file system data corruption, database corruption, VMware metadata corruption

Human error: deleted, overwritten, formatted (guest and host level)

Page 6: Kroll Ontrack Recovering Your Virtual Data

Common Data Loss Scenarios: Failure Types

Source: Over 100 Kroll Ontrack Virtual Data Recovery Jobs Over the Past 12 Months

Page 7: Kroll Ontrack Recovering Your Virtual Data

Learning Objectives

Common data loss scenarios in virtual environments

Challenges with recovering virtual data

Recommendations when virtual data loss occurs

Design recommendations for a virtual environment with data loss prevention in mind

Page 8: Kroll Ontrack Recovering Your Virtual Data

Challenges with Recovering Virtual Data

Recovery of multiple guests on a single volume

Snapshots, logs and swap files add complexity

Virtual file system fragmentation

Size of the recovery

Lack of a good backup that has been tested

Using traditional methods of recovery, such as restore, may make the problem worse

Page 9: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: Initial Facts

Hospital had a five-drive RAID 5 array attached to its VMware ESX server (1.2 TB volume)

The array hosted four MS Windows 2003 Server virtual machines running MS SQL 2005, which contained the hospital's patient medical records

The RAID controller failed

Hospital replaced the RAID controller and rebooted

All of the drives stayed offline after the reboot

Page 10: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: Customer Plan

Force the drives online and rebuild

If that failed, restore from backup

If that failed, recreate the missing patient data from other sources

Page 11: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: Additional Options

Customer contacted Kroll Ontrack for a free Data Recovery consultation.

Kroll Ontrack’s recommendations:

Image the drives before starting the restore/rebuild process

If the restore or rebuild fails, start a Remote Data Recovery or ship the drives to Kroll Ontrack for recovery
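
The imaging recommendation above can be sketched in a few lines. This is a minimal illustration, not Kroll Ontrack's tooling: it copies a source device or file to an image while opening the source read-only, and records a SHA-256 digest so the copy can be checked later. The names `image_drive` and `verify_image` and the 1 MiB chunk size are hypothetical choices for this example, and the sketch assumes the source reads without errors, which a failing drive may not do.

```python
import hashlib


def image_drive(source_path, image_path, chunk_size=1024 * 1024):
    """Copy source_path to image_path and return (bytes_copied, sha256_hex).

    The source is opened read-only, so nothing is written to it.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()


def verify_image(image_path, expected_sha256, chunk_size=1024 * 1024):
    """Re-read the image and confirm it still matches the recorded digest."""
    digest = hashlib.sha256()
    with open(image_path, "rb") as img:
        for chunk in iter(lambda: img.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256
```

A call such as `image_drive("/dev/sdb", "/mnt/scratch/disk0.img")` would be made once per member drive, with the image written to storage outside the failed array. Both paths are placeholders.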

Page 12: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: Additional Customer Challenges

The customer imaged the drives

The customer forced the drives online and determined:

The RAID configuration was damaged

One of the drives was out of date (degraded)

Forcing a rebuild with the degraded drive would cause additional damage

Backups did not include the SQL data

Time to recreate data – 3 months to 2 years
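
Why a rebuild with an out-of-date drive does damage follows from RAID 5 parity arithmetic: each stripe's parity block is the XOR of its data blocks, so any one block can be recomputed from the others only if they are all current. The sketch below uses a made-up four-byte stripe on a hypothetical five-drive set. The block contents and the helper name `xor_blocks` are illustrative assumptions, not data from the case.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together (RAID 5 parity arithmetic)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe of a hypothetical five-drive RAID 5 set: four data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Normal single-drive loss: block 2 is recomputed from the survivors and parity.
rebuilt = xor_blocks([data[0], data[1], data[3], parity])
print(rebuilt == data[2])      # True

# Stale member: the drive holding block 1 dropped out earlier and still holds
# old contents, while the parity reflects the newer data.
stale_block = b"bbbb"
bad_rebuild = xor_blocks([data[0], stale_block, data[3], parity])
print(bad_rebuild == data[2])  # False: the "rebuilt" block is wrong
```

A controller that rebuilds in this state writes the wrong blocks back to disk, which is why imaging first and assembling the array in software, read-only, is the safer order.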

Page 13: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: Kroll Ontrack to the Rescue

The customer contacted Kroll Ontrack

Kroll Ontrack connected to the customer remotely and started the evaluation and recovery

Page 18: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: VMware® Recovery Overview

Locally attached drives, SANs, iSCSI, NFS Storage.

Software RAID manager used to replace RAID controllers that are no longer presenting the LUNs correctly. Supports all the types of RAID configurations.

Virtual device presented by the RAID manager. The tools see it as if it were the original device.

Specialized recovery tools are used to recover from corruption inside almost any file system.

Page 19: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: Inside the RAID

Disk 0 Disk 1 Disk 2 Disk 3 Disk 4

Kroll Ontrack Raid Manager

KO Rollback Layer

The RAID failure made the VMware data inaccessible, so Ontrack replaced the RAID controller with software to get to the data.
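
The slide shows a rollback layer above the software RAID manager but does not say how it works. One plausible reading, offered here purely as an assumption, is a copy-on-write overlay: reads fall through to the untouched source, while any writes made during recovery are held separately and can be discarded. The class and method names below are hypothetical.

```python
class CopyOnWriteOverlay:
    """Block-level wrapper: reads come from the base image, writes stay in memory."""

    def __init__(self, base, block_size=512):
        self.base = base              # bytes-like source image, never modified
        self.block_size = block_size
        self.changed = {}             # block number -> replacement bytes

    def read_block(self, number):
        # Prefer a block written during this session, else read the original.
        if number in self.changed:
            return self.changed[number]
        start = number * self.block_size
        return bytes(self.base[start:start + self.block_size])

    def write_block(self, number, data):
        if len(data) != self.block_size:
            raise ValueError("write must be exactly one block")
        self.changed[number] = bytes(data)

    def rollback(self):
        """Discard every write made through the overlay."""
        self.changed.clear()
```

With a design like this, a repair attempt that goes wrong costs nothing, because the original drives or images are never altered.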

Page 20: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: Inside the RAID

Disk 0 Disk 1 Disk 2 Disk 3 Disk 4

Kroll Ontrack Raid Manager

KO Rollback Layer

VMFS Metadata

VM1-VMDK1 VM1-VMDK2 VM2-VMDK1 VM2-VMDK2 VM3-VMDK1 VM3-VMDK2

VM1-MetaData

VM2-MetaData

VM3-MetaData

Ontrack engineers mapped out the data to determine the original RAID configuration and presented the array to our recovery tools.

This virtual RAID is then accessed like the original array for the rest of the recovery process.
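
To make the idea of a virtual RAID concrete, here is a minimal read-only sketch that presents member images as a single logical device. It assumes a left-symmetric RAID 5 layout with a known chunk size. The slides do not state the hospital array's actual layout, and working that out is the "mapping" step described above. The class name `VirtualRaid5` and its methods are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class VirtualRaid5:
    """Read-only logical view over RAID 5 member images (left-symmetric assumed)."""

    def __init__(self, members, chunk_size):
        # members: one bytes object per drive, in array order; None marks a
        # member that is missing or too stale to trust.
        if sum(m is None for m in members) > 1:
            raise ValueError("RAID 5 cannot tolerate more than one missing member")
        self.members = members
        self.chunk_size = chunk_size
        self.n = len(members)

    def _member_chunk(self, disk, row):
        start = row * self.chunk_size
        member = self.members[disk]
        if member is not None:
            return member[start:start + self.chunk_size]
        # Missing member: XOR the same row from every surviving member.
        survivors = [m[start:start + self.chunk_size]
                     for m in self.members if m is not None]
        return xor_blocks(survivors)

    def read_chunk(self, logical_chunk):
        # Each row holds n - 1 data chunks plus one parity chunk.
        row, index = divmod(logical_chunk, self.n - 1)
        # Left-symmetric: parity starts on the last disk and moves left each
        # row; data chunks start on the disk after parity and wrap around.
        parity_disk = (self.n - 1) - (row % self.n)
        disk = (parity_disk + 1 + index) % self.n
        return self._member_chunk(disk, row)
```

Passing `None` for the out-of-date member makes every read of that drive's chunks come from parity instead, which is how a software assembler can sidestep a stale disk without writing to any of the originals.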

Page 22: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: Inside the RAID

VMFS Metadata

VM1-VMDK1 VM1-VMDK2 VM2-VMDK1 VM2-VMDK2 VM3-VMDK1 VM3-VMDK2

VM1-MetaData

VM2-MetaData

VM3-MetaData

VM1 VM2 VM3

Once the array was presented, individual virtual machines were recovered from the VMFS volume

Proprietary NTFS and SQL recovery tools were then used to recover critical databases
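
The step from a recovered virtual disk to the guest file system can be illustrated with a small sketch. It is not the proprietary tooling mentioned above. It assumes the recovered disk is a flat (raw) VMDK extent with an MBR partition table and 512-byte sectors, and it only locates partitions whose boot sector carries the NTFS identifier. The function name `find_ntfs_partitions` is hypothetical.

```python
import struct

SECTOR = 512  # assumed sector size


def find_ntfs_partitions(image_path):
    """Return [(start_lba, sector_count)] for MBR partitions that look like NTFS."""
    found = []
    with open(image_path, "rb") as img:
        mbr = img.read(SECTOR)
        # A valid MBR ends with the signature bytes 0x55 0xAA.
        if len(mbr) < SECTOR or mbr[510:512] != b"\x55\xaa":
            return found
        for i in range(4):
            # Four 16-byte partition entries start at offset 446.
            entry = mbr[446 + 16 * i:446 + 16 * (i + 1)]
            partition_type = entry[4]
            start_lba, count = struct.unpack_from("<II", entry, 8)
            if partition_type == 0 or count == 0:
                continue  # unused slot
            img.seek(start_lba * SECTOR)
            boot = img.read(SECTOR)
            # An NTFS boot sector carries the OEM ID "NTFS    " at offset 3.
            if boot[3:11] == b"NTFS    ":
                found.append((start_lba, count))
    return found
```

Each returned offset marks where an NTFS-aware tool would begin reading inside the virtual disk to reach the SQL data files. Sparse VMDKs, snapshot chains, and GPT disks need extra handling that this sketch omits.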

Page 23: Kroll Ontrack Recovering Your Virtual Data

Case Study – Hospital in Crisis: Conclusion

Ontrack used four levels of recovery to get to the customer data:

RAID recovery tools to reassemble the original RAID configuration

VMFS recovery tools to repair damage to the file system and copy out the VMDK files

NTFS recovery tools to repair the NT file system and copy out the SQL files

MS SQL recovery tools to extract the tables into a new database

Kroll Ontrack was able to get a full recovery of the critical SQL data

Page 24: Kroll Ontrack Recovering Your Virtual Data

Learning Objectives

Common data loss scenarios in virtual environments

Challenges with recovering virtual data

Recommendations when virtual data loss occurs

Design recommendations for a virtual environment with data loss prevention in mind

Page 25: Kroll Ontrack Recovering Your Virtual Data

Recommendations When Data Loss Occurs

Don’t panic and don’t update your resume

When troubleshooting, do not write any data to the storage array or change storage configurations.

Don’t format the volume that has missing data

Use the support system offered by the software provider

Restore data to an alternate location, and contact a data recovery company with extensive virtual data recovery experience, including the ability to perform remote recoveries

Page 26: Kroll Ontrack Recovering Your Virtual Data

Recommendations When Data Loss Occurs

Definition of Data Recovery (DR)

DR gets back files from corrupted or inaccessible storage (directly from the failed system, not from a backup)

DR gets back the most recent files vs. the most recent backup

In some cases, DR is faster than restoring from the last backup

DR fits well as part of an overall disaster recovery plan

Page 27: Kroll Ontrack Recovering Your Virtual Data

Learning Objectives

Common data loss scenarios in virtual environments

Challenges with recovering virtual data

Recommendations when virtual data loss occurs

Design recommendations for a virtual environment with data loss prevention in mind

Page 28: Kroll Ontrack Recovering Your Virtual Data

Design Recommendations

Implement naming conventions for hosts, guests, physical servers and virtual file system volumes

Control who has access to the environment

Document the backup and recovery plan and include the contact information of your preferred data recovery vendor in the plan

Test your backups on a regular basis

Use the tools to manage your virtual environment; don’t take shortcuts

Be careful how you use snapshots and do your housekeeping

Monitor the data stores, logs and swaps
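
The last two recommendations lend themselves to a simple periodic check. The sketch below assumes the datastore is reachable as an ordinary directory tree and that snapshot data files follow the `-delta.vmdk` naming used by ESX of this era. It reports free space and any snapshot delta files that have grown past a threshold. The function name `datastore_report`, the 10 GiB default, and the returned dictionary keys are hypothetical choices for illustration.

```python
import os
import shutil


def datastore_report(datastore_path, delta_warn_bytes=10 * 1024**3):
    """Summarise free space and oversized snapshot delta files on a datastore."""
    usage = shutil.disk_usage(datastore_path)
    large_deltas = []
    for root, _dirs, files in os.walk(datastore_path):
        for name in files:
            # Snapshot data files are assumed to end in "-delta.vmdk".
            if name.endswith("-delta.vmdk"):
                path = os.path.join(root, name)
                size = os.path.getsize(path)
                if size >= delta_warn_bytes:
                    large_deltas.append((path, size))
    return {
        "free_fraction": usage.free / usage.total,
        "large_deltas": sorted(large_deltas),
    }
```

Run on a schedule, a report like this flags forgotten snapshots and shrinking free space before a full datastore takes guests offline.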

Page 29: Kroll Ontrack Recovering Your Virtual Data

Learning Objectives: Summary

Common data loss scenarios in virtual environments

Challenges with recovering virtual data

Recommendations when virtual data loss occurs

Design recommendations for a virtual environment with data loss prevention in mind

Page 30: Kroll Ontrack Recovering Your Virtual Data

Conclusion

Thank you!

Dave Logue, Sr. Remote Data Recovery Engineer

Kroll Ontrack, a Marsh & McLennan Company
