Identifying Trends in Enterprise Data Protection Systems

George Amvrosiadis, University of Toronto
Medha Bhadkamkar, Symantec Research Labs



June 10, 2015

June 24, 2015

Copyright © 2015 Symantec Corporation

June 25, 2015 [1]

[2]

We need to fix Backup Systems

• Too many parameters to fine-tune
  – Top 3 commercial products come with 1000-page admin guides
  – Businesses experience problems recovering from backups 1 in 6 times [3]

• Too much data to back up
  – 94% of businesses back up more than just mission-critical data, and 40% back up everything [4]
  – Only 28% of businesses complete all backups on time [5]

• Complexity and missed goals lead to frustration
  – 55% of businesses plan to change backup tools within 24 months

Study goal: Use customer data to help researchers understand and improve data protection systems

Anatomy of Modern Data Protection Systems

• Data protection systems are multi-tiered domains
  – Master server: job scheduling, backup image metadata
  – Clients: transmit backup data
  – Storage servers (optional): backup storage management

• Backup policies specify clients’ backup schedules
  – E.g. “weekly full, daily incremental backups”
  – Policy types tailored to applications, e.g. Oracle, VMware, Microsoft Exchange

[Figure: a domain, showing the master server, storage servers, and clients]

Study Dataset

• Customer domains periodically transmit telemetry
  – Collected from consenting Symantec NetBackup customers
  – Weekly reports of runtime and configuration statistics

• Telemetry allows us to study how domains evolve
  – Reports can be grouped and analyzed as time series

• Dataset represents large, diverse domain population
  – 1M telemetry reports from 40,000 domains, collected over 3 years
  – 35% of domains 3-tiered, rest 2-tiered
  – 31% of domains use dedicated backup appliances
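To illustrate how weekly reports can be grouped and analyzed as time series, here is a minimal sketch. The report layout (a tuple of domain identifier, report date, and a statistics dictionary) and all names are assumptions made for illustration; the slides do not describe the actual telemetry format.

```python
from collections import defaultdict
from datetime import date


def group_reports_by_domain(reports):
    """Group (domain_id, report_date, stats) tuples into one
    date-ordered series per domain. The tuple layout is hypothetical."""
    series = defaultdict(list)
    for domain_id, report_date, stats in reports:
        series[domain_id].append((report_date, stats))
    for points in series.values():
        points.sort(key=lambda point: point[0])  # oldest report first
    return dict(series)


# Hypothetical weekly reports, deliberately out of order.
reports = [
    ("dom-b", date(2014, 1, 13), {"clients": 40}),
    ("dom-a", date(2014, 1, 13), {"clients": 12}),
    ("dom-a", date(2014, 1, 6), {"clients": 10}),
]
series = group_reports_by_domain(reports)
client_counts_a = [stats["clients"] for _, stats in series["dom-a"]]
```

Once grouped this way, any per-domain statistic (here the client count) can be read off as a weekly sequence.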

Outline

1 Domain configuration
2 Job scheduling
3 Backup data growth
4 Avenues for future research

(Items 1 to 3: analysis results)


Domain configuration: Clients

[Figure: number of clients (y-axis) by week (x-axis), annotated with change point, change size, and configuration period]

• Client populations rarely shrink
  – Client population reaches stable state after first 3 weeks
  – 93% of changes attributed to growth

• Clients are introduced every 3 months, in groups
  – Low variability across changes, growth bursts 5% of the time

Takeaway: For resource provisioning, keep in mind that clients are added in bursts
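The terms change point, change size, and configuration period can be made concrete with a small sketch. This is only one plausible reading of those terms, not the study's actual detection method: a change point is taken to be any week whose client count differs from the previous week by at least a hypothetical `min_change` threshold.

```python
def find_change_points(weekly_clients, min_change=1):
    """Return (week_index, change_size) pairs for weeks where the client
    count differs from the previous week by at least min_change.
    The threshold is an assumption for illustration."""
    changes = []
    for week in range(1, len(weekly_clients)):
        delta = weekly_clients[week] - weekly_clients[week - 1]
        if abs(delta) >= min_change:
            changes.append((week, delta))
    return changes


def configuration_periods(changes, total_weeks):
    """Lengths, in weeks, of the stable stretches between change points."""
    boundaries = [0] + [week for week, _ in changes] + [total_weeks]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


# Hypothetical weekly client counts for a single domain.
counts = [10, 10, 10, 15, 15, 15, 15, 22, 22, 22]
changes = find_change_points(counts)
periods = configuration_periods(changes, len(counts))
growth_share = sum(1 for _, delta in changes if delta > 0) / len(changes)
```

Here `growth_share` is the fraction of changes that add clients, the per-domain analogue of the 93% figure above.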

Domain configuration: Backup policies

• Domain components remain unprotected at times
  – 16% of clients spend time unprotected
  – Only 32% of domains use a policy to protect master server state

• Domains typically protect fewer than 3 application types
  – 36% of domains deploy policies targeted to a single application
  – Number of policies stays fixed after first 3 weeks

[Figure: CDF of the number of policy types per domain (x-axis) against percentage of domains (y-axis). Median: 1.9 policy types. Mean: 2.6 policy types.]

Takeaway: Domains are homogeneous with respect to client policies, making policy auto-configuration a feasible goal
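As a rough illustration of what spotting unprotected clients could involve, the sketch below lists clients that no backup policy covers. The data layout (a policy as a dictionary with a `clients` list) is hypothetical and is not NetBackup's configuration format.

```python
def unprotected_clients(clients, policies):
    """Return the sorted names of clients not covered by any policy.
    Each policy is a dict with a 'clients' list (hypothetical layout)."""
    covered = set()
    for policy in policies:
        covered.update(policy["clients"])
    return sorted(set(clients) - covered)


# Hypothetical domain: three clients, two policies.
clients = ["db01", "mail01", "web01"]
policies = [
    {"type": "Oracle", "clients": ["db01"]},
    {"type": "Microsoft Exchange", "clients": ["mail01"]},
]
missing = unprotected_clients(clients, policies)
```

An auto-configuration tool could run such a check whenever a burst of new clients appears and propose one of the domain's existing policy types for each uncovered client.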


Job scheduling: Frequency

• Recoveries are rare and sparse
  – Occur in 1275 domains (3.1%)
  – 337 domains (0.8%) recover frequently as part of testing

  Domains   Recovery events   Avg. event frequency
  938       < 5               2 months
  337       ≥ 5               2 weeks

• Frequent full backups are preferred to incremental ones
  – Full backups are rarely weekly events
  – Only 33% of frequent full backups are complemented by incremental ones

[Figure: bar chart of % of domains by full backup frequency (days). Bins: 1-3, 3-6, 6-8, 8+. Bar labels: 28%, 44%, 17%, 11%. Annotation: literature assumption.]

Takeaway: Recoverability of images is rarely tested, and frequent full backups are preferred to incremental ones
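One way to place a domain into the full backup frequency bins of the chart is sketched below. The half-open bin edges and the use of the mean gap between consecutive full backups are assumptions for illustration, since the slides do not say how the bins were computed.

```python
from datetime import date

# Assumed half-open bins [low, high) in days, mirroring the chart labels.
BINS = [(1, 3, "1-3"), (3, 6, "3-6"), (6, 8, "6-8")]


def mean_interval_days(full_backup_dates):
    """Mean gap in days between consecutive full backups."""
    if len(full_backup_dates) < 2:
        raise ValueError("need at least two full backups")
    ordered = sorted(full_backup_dates)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


def frequency_bin(interval_days):
    """Map a mean interval to a bin label; '<1' is an extra bucket that
    does not appear on the chart."""
    for low, high, label in BINS:
        if low <= interval_days < high:
            return label
    return "8+" if interval_days >= 8 else "<1"


# Hypothetical schedules: one every two days, one weekly.
every_two_days = [date(2015, 1, d) for d in (1, 3, 5, 7)]
weekly = [date(2015, 1, d) for d in (1, 8, 15)]
labels = (
    frequency_bin(mean_interval_days(every_two_days)),
    frequency_bin(mean_interval_days(weekly)),
)
```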

Job scheduling: Timing

• Default scheduling windows are popular
  – Activity spikes at beginning of scheduling windows (6pm, 12am)
  – Administrators schedule fewer jobs during the weekend

[Figure: scheduling probability (%) by hour of the week, from Mon 00:00 to the following Mon 00:00. Annotation: “…unless it’s the weekend!”]

Takeaway: Consistently using the same/default scheduling window creates job bursts
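The hour-of-week view in the figure can be approximated from job start times as in the sketch below. The timestamps are invented for illustration, and the percentage is taken over all jobs supplied, which may differ from the normalization used in the study.

```python
from collections import Counter
from datetime import datetime


def hour_of_week(ts):
    """Hour index within the week: Monday 00:00 is 0, Sunday 23:00 is 167."""
    return ts.weekday() * 24 + ts.hour


def scheduling_probability(start_times):
    """Percentage of jobs that start in each hour of the week."""
    counts = Counter(hour_of_week(ts) for ts in start_times)
    total = len(start_times)
    return {hour: 100.0 * n / total for hour, n in sorted(counts.items())}


# Hypothetical job start times (June 1, 2015 was a Monday).
starts = [
    datetime(2015, 6, 1, 18, 0),   # Monday 6pm
    datetime(2015, 6, 1, 18, 5),   # Monday 6pm
    datetime(2015, 6, 2, 0, 0),    # Tuesday 12am
    datetime(2015, 6, 6, 18, 0),   # Saturday 6pm
]
prob = scheduling_probability(starts)
```

Spikes at a few hour indices, as in this toy input, are the job bursts the takeaway refers to.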


Backup data growth: Job sizes

• Incremental backups resemble full backups in size
  – Deduplication reduces full backup sizes by 89% on average
  – Incremental backups can be larger due to misconfigurations, timestamp modifications by maintenance tasks

[Figure: CDF of average GB transferred per job (x-axis, 1 to 2048 GB) against percentage of domains (y-axis), for incremental backups, full backups, and recovery operations. Annotations: recovering a few files; recovering a volume.]

Takeaway: Deduplication may obsolete incremental backups. Recovering only a few files is not uncommon.
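A short worked calculation shows why a deduplicated full backup can end up close to an incremental one in size. Only the 89% average reduction comes from the slide. The volume size, the changed fraction, and the simplification that an incremental backup transfers exactly the changed data are assumptions for illustration.

```python
def deduplicated_full_gb(logical_gb, reduction=0.89):
    """Data transferred by a deduplicated full backup, using the
    89% average reduction reported on the slide."""
    return logical_gb * (1.0 - reduction)


def incremental_gb(logical_gb, changed_fraction):
    """Data transferred by an incremental backup, assuming it copies
    exactly the changed fraction of the volume (a simplification)."""
    return logical_gb * changed_fraction


# Hypothetical 1000 GB volume with 10% of its data changed.
full = deduplicated_full_gb(1000.0)
incremental = incremental_gb(1000.0, 0.10)
```

Under these assumptions the full backup transfers roughly 110 GB and the incremental 100 GB, so the two are of similar size once the changed fraction approaches 11%.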

Backup data growth: Retention periods

[Figure: percentage of jobs (y-axis) by retention period (x-axis, 1 day to 5 years), for incremental backups (mean: 25 days) and full backups (mean: 40 days). Annotations: incremental backups retained for weeks; full backups retained for months.]

• 94% of retention periods picked from preset values

• Retention time grows with the interval between backups
  – Less frequent full backups are retained longer (ρx,y = 0.53)

Takeaway: Retention periods are selected with backup storage capacity in mind
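The slide reports a correlation of ρx,y = 0.53 without naming the coefficient. Assuming it is a Pearson correlation between the interval separating full backups and their retention period, it could be computed as in this sketch. The sample values are invented and are not taken from the dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-policy values: days between full backups, and
# days each full backup image is retained.
interval_days = [1, 3, 7, 14, 30]
retention_days = [14, 30, 30, 90, 365]
rho = pearson(interval_days, retention_days)
```

A positive value means that policies with longer gaps between full backups also tend to keep each image longer.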


In summary: Avenues for future research

• Auto-configuration and self-healing backup systems
  – Clients are introduced in bursts, but may be left unprotected
  – Domains are homogeneous with respect to policies protecting clients
  – Default scheduling windows are preferred, causing job bursts

• Improve rehydration time of deduplicated backup images
  – Deduplicated full backups are preferred to incremental ones

• Revisit backup retention as a need-based feature
  – Dedicated backup appliances are widely used
  – Retention periods are picked with storage capacity in mind

• Re-examine techniques for instant recovery
  – Recovery events made up of few files are not uncommon


Thank you:

Symantec Research Labs, Symantec Backup and Recovery group, University of Toronto SysNet and CSL groups, Fred Douglis and anonymous reviewers


References

[1] M. Jordan, “U.S. Visa-Processing Glitch Is Partially Fixed”. Wall Street Journal. Updated June 24, 2015.

[2] Bureau of Consular Affairs, “Technological Systems Issue”. U.S. Department of State. Updated June 25, 2015.

[3] Veeam Software, “Virtualization Data Protection Report 2013”. Analyst report, 2013.

[4] Iron Mountain, “Data Backup and Recovery Benchmark Report”. Analyst report, 2013.

[5] Dimensional Research, “The state of IT recovery for SMBs”. Analyst report, 2014.