Backup Compression and Storage Deduplication: A perfect match?

Hosted by David Gugick & David Swanson, Dell Software

June 27, 2013




Agenda

• Speaker Introductions

• Deduplication Explained

• Deduplication and Backup Compression Benefits

• Ingest Rates

• Backup Recommendations

• Real-World Performance

• Takeaways

• Q & A

• Resources


Your Hosts

David Swanson

• Database Systems Consultant, Dell Software

• [email protected]

David Gugick

• Product Management, Data Protection, Dell Software

• [email protected]

• @davidgugick


Deduplication Explained

• Eliminates the need to save duplicate data

• Connections
– CIFS, NFS, Proprietary (DD Boost, Dell RDA)

• Inline vs post-process
– Max ingest rate (single stream vs aggregate)

• Find matches
– Chunking: sliding windows / variable block size

• Compress

• Target vs source-side deduplication

• Software vs hardware solutions

• Read Speed (Rehydration)
– Overhead varies by vendor

• Replication
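As a rough illustration of the "find matches" step, here is a minimal Python sketch of variable-size chunking driven by a sliding window, followed by a hash-indexed chunk store. Everything in it is an assumption made for illustration: the window, mask, and size limits are arbitrary, and the window hash is recomputed at every byte for clarity, whereas real appliances typically use a rolling hash and vendor-specific algorithms.

```python
import hashlib
import random

WINDOW = 16      # bytes hashed at each position (the sliding window); illustrative value
MASK = 0x3F      # cut when the low 6 bits of the hash are zero (about 1 in 64 positions)
MIN_SIZE = 32    # never cut a chunk shorter than this
MAX_SIZE = 1024  # always cut once a chunk reaches this size


def chunks(data: bytes):
    """Yield variable-size chunks whose boundaries depend on content, not on offsets."""
    start = 0
    for end in range(1, len(data) + 1):
        size = end - start
        if size < MIN_SIZE:
            continue
        # Hash only the trailing window, so a boundary is decided by nearby bytes.
        digest = hashlib.blake2b(data[end - WINDOW:end], digest_size=4).digest()
        if (int.from_bytes(digest, "big") & MASK) == 0 or size >= MAX_SIZE:
            yield data[start:end]
            start = end
    if start < len(data):
        yield data[start:]


def dedupe(streams):
    """Store each distinct chunk once; return (logical_bytes, physical_bytes)."""
    store = {}
    logical = 0
    for stream in streams:
        for chunk in chunks(stream):
            logical += len(chunk)
            store.setdefault(hashlib.sha256(chunk).digest(), chunk)
    physical = sum(len(chunk) for chunk in store.values())
    return logical, physical
```

Because boundaries follow the content, a small insertion near the start of a second "backup" should disturb only the chunks around the change, and the chunks after it can still match the first copy. Fixed-offset blocks would shift and stop matching.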


Deduplication Effectiveness Varies

• Variables that influence the dedupe ratio for a given workload include:
– The type of data being backed up: Not all data sets have the same amount of duplicate data or compressibility
– The frequency of backups: More frequent backups will build the dedupe dictionary more quickly
– The retention period for backup jobs: Longer retention yields higher ratios
– The types of backups: Full backups will dedupe better than differential or transaction log backups

• Estimated deduplication ratio
– Ratio estimates range from 9-12:1 for databases, roughly equivalent to 89-92% compression
– Not a lot of duplicate data between databases
– Benefits with databases are largely due to chunk matches between backups of the same database

• Retention recommendations
– Keep only what you need: Don’t keep more backups simply to raise the ratios

• Full or differential backups?
– Most vendors will estimate logical storage (dedupe ratios) based on whether the customer performs full backups or leverages differential / incremental backups
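The link between a deduplication ratio and a "percent saved" figure is simple arithmetic: an N:1 ratio stores 1/N of the logical data, so the saving is 1 − 1/N. A small Python sketch follows. The function name is hypothetical and not part of any product.

```python
def ratio_to_percent_saved(ratio: float) -> float:
    """Convert an N:1 reduction ratio into the percentage of space saved."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return (1 - 1 / ratio) * 100


# The ratio range quoted for databases on this slide
for ratio in (9, 10, 12):
    print(f"{ratio}:1 saves {ratio_to_percent_saved(ratio):.1f}%")
```

Note how quickly the percentage flattens: going from 10:1 to 12:1 adds under two percentage points, which is one reason not to chase higher ratios by keeping more backups than you need.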


Deduplication Benefits

• Storage is reduced

• Replication speeds improve

• Processing is moved from servers to storage


Backup Compression Benefits

• Reduces or eliminates disparity between source and target disk speeds
– Backup speeds improve
– Restore speeds improve

• Storage is reduced

• Network utilization is reduced

• Replication speeds improve

• Dump to and restore from tape speeds improve

• Helps with initializing Log Shipping / Mirroring / AlwaysOn Availability Groups


Ingest Rates

• Max ingest rate determines how fast the device can consume data
– Published stats are often based on multiple backup streams
– Single stream performance may be lower
– Varies widely by how much you spend

• Network plays an important part
– Theoretical maximums are listed below; in practice, limits are lower
– 1 Gb Ethernet = 125 MB / sec
– 10 Gb Ethernet = 1.25 GB / sec
– Fibre Channel (8GFC) = 1.6 GB / sec full duplex (about 800 MB / sec in each direction)
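To see how link speed and ingest rate bound a backup window, here is a back-of-the-envelope Python sketch. It assumes decimal units (1 GB = 1000 MB), theoretical line rates with no protocol overhead, and parallel streams that share the bottleneck evenly. The function names and the example figures are illustrative, not measurements.

```python
def link_mb_per_sec(gigabits_per_sec: float) -> float:
    """Theoretical link throughput in MB/sec (8 bits per byte, decimal units)."""
    return gigabits_per_sec * 1000 / 8


def backup_minutes(backup_gb: float, link_gbps: float,
                   ingest_mb_per_sec: float, streams: int = 1) -> float:
    """Estimate minutes per backup when `streams` backups share the slower of
    the network link and the appliance's aggregate ingest rate."""
    bottleneck = min(link_mb_per_sec(link_gbps), ingest_mb_per_sec)
    per_stream = bottleneck / streams
    return backup_gb * 1000 / per_stream / 60


# Hypothetical case: 500 GB backup, 1 Gb link, appliance rated at 400 MB/sec
print(f"{backup_minutes(500, 1, 400):.0f} min alone")
print(f"{backup_minutes(500, 1, 400, streams=2):.0f} min with two parallel streams")
```

In this hypothetical case the network, not the appliance, is the limit, so a faster appliance would not shorten the backup.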


Backup Recommendations

• Offset backups to avoid network and ingest rate contention

• Test with and without compression
– Try a low-CPU compressor
– 85% compression gives you close to 7X the write bandwidth

• Consider using differential backups to reduce storage and backup time
– 70% reduction in data backed up means backups run on average 3.3X faster
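The 7X and 3.3X figures come from the same relationship: if only a fraction (1 − r) of the data has to be written, the same pipe carries 1 / (1 − r) times as much logical data. A short Python sketch, with a hypothetical function name, assuming the link or device is the only bottleneck:

```python
def bandwidth_multiplier(reduction_percent: float) -> float:
    """Effective throughput gain when `reduction_percent` of the data is
    removed before it is written (by compression or by backing up less)."""
    if not 0 <= reduction_percent < 100:
        raise ValueError("reduction_percent must be in [0, 100)")
    return 1 / (1 - reduction_percent / 100)


print(f"85% compression: {bandwidth_multiplier(85):.2f}X")          # close to 7X
print(f"70% less data backed up: {bandwidth_multiplier(70):.2f}X")  # about 3.3X
```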


Backup Speed – 1 Backup


Backup Speed – 2 Parallel Backups


Backup Speed – 3 Parallel Backups


Restore Speed


Storage Footprint


Takeaways

• Backup compression and deduplication are a good match

• Test your environment
– Your results will vary based on many factors, including: rated speed of appliance, network design, backup job coordination, compressibility of the database, database data change rate
– Don’t expect much deduplication between different databases: most of the benefits are gained from backups of the same database

• Deduplication storage appliances are almost always shared in an environment
– A single test on a single database in the lab is not representative of production
– Furthermore, running full backups on the same database 30 times in a row as a test is not representative of production either
– Even when the DBA team has exclusive access to the deduplication storage, there will usually be contention from parallel backup streams

• You won’t know the full effect of performing uncompressed backups until you test
– Maintenance windows and RTOs may be affected


Takeaways

• Test using lightweight backup compression
– Avoids CPU load on the database server
– Allows the deduplication storage the opportunity for some extra dedupe
– Avoid Adaptive Compression to maximize deduplication

• If backup and restore times are most important, don’t be concerned with actual storage consumed
– At worst, it’s a wash. At best, you’re saving space with compression
– Don’t be overly concerned with final deduplication ratios: don’t keep 30 days of backups for each database just to get better deduplication ratios if you only need 14 days

• Consider reducing data backed up using differential backups
– Reduces the data read from SQL Server, sent over the network, and processed by the storage
– Reduces backup windows
– Can be compressed just the same


Q & A


Resources - References

• Some Deduplication Resources
– Demystifying Deduplication White Paper: http://i.dell.com/sites/doccontent/shared-content/data-sheets/en/Documents/demystifying-deduplication.pdf
– Why Dedupe is a Bad Idea for SQL Server Backups: http://www.brentozar.com/archive/2009/11/why-dedupe-is-a-bad-idea-for-sql-server-backups/
– Backup Compression and Deduplication blog posts: http://communities.quest.com/community/data-protection/blog/2012/04/05/backup-compression-and-deduplication-good-or-bad

• LiteSpeed
– LiteSpeed Landing Page: http://www.quest.com/litespeed-for-sql-server/
– Tech Brief: Top 7 LiteSpeed Features DBAs Should Know About: http://www.quest.com/techbrief/top-6-litespeed-features-dbas-should-know-about815805.aspx
– Webcasts and Events: http://www.quest.com/events/list.aspx?contenttypeid=15&prod=192

• Dell DR 4100
– http://www.dell.com/us/business/p/dell-dr4100/pd


Thanks