HOW ANALYTICS CAN HELP BACKUP ADMINISTRATORS

Balaji Panchanathan
EMC - Avamar - Engineer
[email protected]


2014 EMC Proven Professional Knowledge Sharing 2

Table of Contents

Introduction
Data collection
  Data Protection Advisor
  Backup and Recovery Manager
  Avamar
    Enterprise Manager
    MCGUI
Data Analysis
  Range
  Variation
  Coefficient of Variation
  Time series Analysis
  Regression Analysis
    Storage Usage trend
    Input data
    Regression output
    CPU usage Disk I/O
Visualization
Conclusion

Disclaimer: The views, processes, or methodologies published in this article are those of the

author. They do not necessarily reflect EMC Corporation’s views, processes, or methodologies.


Introduction

This article focuses on how analytics can solve problems and make a backup administrator’s job easier and more fruitful.

Backup administrators typically face two problems:

1. Backup failures
2. Ever-increasing datasets and the resulting need for a longer backup window

Their jobs will become more fruitful if they:

1. Improve backup efficiency and robustness
2. Improve reliability
3. Periodically report to management which types of systems/databases are backed up. (This will help management determine the percentage usage of each department and whether the appropriate things are backed up.)

A backup administrator’s life can be made easier by an analytics project, which usually has three stages:

1. Data Collection

2. Data Analysis

3. Data Reporting

Data collection

Customers using EMC Avamar® backup products can collect backup data from:

1. Data Protection Advisor (DPA)

2. Backup and Recovery Manager (BRM)

3. Enterprise Manager (EM) in Avamar

4. MCGUI – Avamar Administrator GUI

Data Protection Advisor

Data Protection Advisor monitors, analyzes, and reports on the backup environment. It can manage multiple backup products and lists the details in the data set.

DPA features that help improve the backup administrator’s life include:

1. Backup reports – how many clients are backed up, backup failures across clients, etc.
2. Capacity planning – capacity reports
3. Utilization – CPU/memory. This report helps find bottlenecks when problems occur: if there are backup failures or the backup speed is low during a particular time, the CPU/memory utilization at that time can be checked using Data Protection Advisor.

Backup and Recovery Manager

Backup and Recovery Manager (BRM) can be considered a miniature version of Data Protection Advisor. DPA can monitor backup environments, storage devices, and backups from different vendors, whereas BRM monitors EMC backup products: Avamar®, NetWorker®, and Data Domain®.

The BRM tool can be used to forecast capacity usage. The forecast is found in BRM under the Reports tab, in the System Summary report.

The Reports section also has an option to run the backup summary report. This report includes details such as the time zone, duration, dataset, and domain, which can be used for further analysis of maximum, range, variance, etc., as explained in the Data Analysis section of this article. Below is a snapshot of the backup summary report exported to Excel.

systemType | client | system | group | status | startTime | endTime | duration | dataChanged | plugin | dataset | totalSize | dedupRatio
Avamar | vcenteribis.brsvblr.com | HMSP1.BRSVBLR.COM | vcenteribis.brsvblr.com | completed | 2014-01-01T18:12:48.631-08:00 | 2014-01-01T18:15:58.833-08:00 | 3 | 2268893982 | 3001 | /Client On-Demand Data | 2869830379 | 0.20939788
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T04:28:36.909-08:00 | 2014-01-06T04:33:08.778-08:00 | 4 | 0 | 1001 | //?/MOD-1389011316855 | 0 | 0
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T03:55:33.714-08:00 | 2014-01-06T04:00:34.879-08:00 | 5 | 0 | 1001 | /lindata | 11409420117 | 1
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T04:35:23.012-08:00 | 2014-01-06T04:39:54.777-08:00 | 4 | 0 | 1001 | //?/MOD-1389011722947 | 0 | 0
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T03:26:52.193-08:00 | 2014-01-06T03:55:28.206-08:00 | 28 | 11409420117 | 1001 | /lindata | 11409420117 | 0
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-05T23:07:21.901-08:00 | 2014-01-05T23:11:54.952-08:00 | 4 | 0 | 1001 | /lindata | 0 | 0
Avamar | rhel64dtlt | GVSP1-117.BRSVBLR.COM | rhel64dtlt | failed | 2014-01-06T01:04:34.170-08:00 | 2014-01-06T01:09:06.036-08:00 | 4 | 0 | 1001 | /lindata | 0 | 0
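As a rough illustration of the kind of further analysis the exported report allows, the sketch below recomputes the duration column from the start and end timestamps and derives a failure rate. It is only a sketch: the dictionary layout and the function names are assumptions made for this example, not a BRM export format or API, and the two rows are copied from the report above with only the fields needed here.

```python
from datetime import datetime

# Two rows taken from the backup summary report, reduced to the fields used here.
# The dict layout is an assumption for this sketch, not a BRM export format.
rows = [
    {"client": "vcenteribis.brsvblr.com", "status": "completed",
     "startTime": "2014-01-01T18:12:48.631-08:00",
     "endTime": "2014-01-01T18:15:58.833-08:00"},
    {"client": "rhel64dtlt", "status": "failed",
     "startTime": "2014-01-06T03:26:52.193-08:00",
     "endTime": "2014-01-06T03:55:28.206-08:00"},
]

def duration_minutes(row):
    """Whole minutes between startTime and endTime."""
    start = datetime.fromisoformat(row["startTime"])
    end = datetime.fromisoformat(row["endTime"])
    return int((end - start).total_seconds() // 60)

def failure_rate(report_rows):
    """Fraction of rows whose status is not 'completed'."""
    failed = sum(1 for row in report_rows if row["status"] != "completed")
    return failed / len(report_rows)

durations = [duration_minutes(row) for row in rows]
print(durations, failure_rate(rows))
```

For these two rows the computed durations agree with the duration column of the report (3 and 28), which suggests that column is expressed in whole minutes.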


A snapshot of where the capacity forecast can be done is shown below.

Avamar

Enterprise Manager

Enterprise Manager can be used to manage multiple Avamar servers and provide capacity

forecast reports. Below is a snapshot of one of the UI windows where reports can be exported.


MCGUI

MCGUI is an administrative tool provided by Avamar for managing the backup environment. In MCGUI, capacity reports can be run to check when the storage capacity will be reached. Below is a snapshot of the window from which the capacity report can be run.

Similarly, in Data Domain, if the AutoSupport feature is enabled, a report can be sent to a central server where regression analysis is done to predict when capacity will be reached.

Data Analysis

In this section we will see how the data collected in the previous section can be put to use. We

will start with simple analytics functions and gradually move on to complex analytical tools and

how they can be used to solve problems faced by backup administrators.

Range

Suppose that we measure the backup throughput of different backups and check the maximum and minimum throughput. If the range is very narrow, e.g. the minimum is 100Mb/hr and the maximum is 105Mb/hr, the scope of analysis is limited. In other words, the benefit of doing the analysis will be small. If the range is very wide, e.g. the minimum is 10Mb/hr and the maximum is 1Gb/hr, it makes sense to analyze further. The next part of the analysis starts with measuring variation.
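The range check described above can be sketched in a few lines of Python. The throughput values and the function name are hypothetical and serve only to illustrate the calculation; real values would come from an exported backup report.

```python
# Hypothetical throughput samples in Mb/hr; replace with values from a report.
throughputs = [100, 105, 98, 10, 1000]

def throughput_range(values):
    """Return (minimum, maximum, maximum - minimum) for a list of throughputs."""
    low, high = min(values), max(values)
    return low, high, high - low

low, high, spread = throughput_range(throughputs)
print(f"min={low} max={high} range={spread}")
```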

Variation

The range could be very high if just one of the backups took a very long time or a very short time to complete. Hence, calculating the variance gives more information about the variability of the backup throughput at various times. If the variance is high, further analysis is needed to find which factors cause it. If we export the report to Excel, variance can be calculated easily. The Excel functions are displayed below.

Coefficient of Variation

Looking at variance or standard deviation alone might be misleading; a better approach is to find the coefficient of variation, which is standard deviation/mean. The variance or standard deviation might be misleading because, depending on the unit of measurement or the range of values, it might give a wrong picture. For example, if backup speed is measured in Kbps and the values are in the range 1000Kbps-1500Kbps, the standard deviation can be in the hundreds. If the same speeds are expressed in Mbps, in the range 1Mbps-1.5Mbps, the standard deviation will be a fraction of 1. Clearly, we cannot come to a conclusion directly from the value of the standard deviation. However, a conclusion can be easily reached from the value of the coefficient of variation, which does not depend on the unit. The snapshot below of an Excel sheet with both standard deviation and coefficient of variation makes it clear why the coefficient of variation is a better measure.

                           Column 1   Column 2
                           1000       4
                           1015       5
                           1020       7
                           1030       3
                           1000       6
                           1000       4
Variance                   164.17     2.17
Standard Deviation         12.81      1.47
Coefficient of variation   0.01       0.30

As shown, even though the values in column 1 vary little relative to their size, the variance and standard deviation of column 1 are high compared to column 2, due to its higher values. However, the coefficient of variation reflects the variation properly. Thus, with the coefficient of variation we can correctly conclude that the variation is greater in column 2 than in column 1.


To measure the variation of a range of values, the best approach is to measure the coefficient of variation. The Excel formulas that can be used to calculate variance, standard deviation, and coefficient of variation are shown below.

=VAR(D4:D9)

=STDEV(D4:D9)

=STDEV(D4:D9)/AVERAGE(D4:D9)
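For readers working outside Excel, the same three values can be computed with Python's standard `statistics` module. This is a minimal sketch using the two columns from the table above; `statistics.variance` and `statistics.stdev` are the sample versions, which is what Excel's VAR and STDEV compute. The helper name `coefficient_of_variation` is chosen for this example only.

```python
import statistics

# The two columns from the table above.
column_1 = [1000, 1015, 1020, 1030, 1000, 1000]
column_2 = [4, 5, 7, 3, 6, 4]

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (STDEV/AVERAGE in Excel)."""
    return statistics.stdev(values) / statistics.mean(values)

for name, column in (("column 1", column_1), ("column 2", column_2)):
    print(
        name,
        round(statistics.variance(column), 2),        # =VAR(...)
        round(statistics.stdev(column), 2),           # =STDEV(...)
        round(coefficient_of_variation(column), 2),   # =STDEV(...)/AVERAGE(...)
    )
```

The rounded results match the table: 164.17, 12.81, and 0.01 for column 1, and 2.17, 1.47, and 0.30 for column 2.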

First, filter the backup speed of the various backups by different factors, i.e. client, time,

geography, etc., then calculate the variance under each category to get more clues.

Step 1 Calculate the average of the backup speed by categories such as client, time period,

geography, etc.

Step 2 Compare the average backup speeds among the clients and find the coefficient of variation of those averages. If the coefficient of variation is high, look for the outliers, i.e. the clients for which the speed is low. In a similar fashion, take the average backup speed for each time period (such as 9AM – 10AM, 10AM – 11AM, etc.) and find the coefficient of variation among these averages. If the coefficient of variation is high, conduct further analysis. Repeat this calculation for each category: client, time period, geography, etc.

Step 3 For each category where the coefficient of variation is high, look at the groups where the average backup speed is very low and frame rules so that the average backup speed increases. For instance, first look at a time period (9AM – 10AM) to determine whether the backup speed is slow during that particular time.

Step 4 The next step is a repeat of Step 2 within that group. That is, take all the backups in that time period and find their coefficient of variation. If it is very low, all clients are similarly slow in this time period, so backups for all clients will be moved from this time period to another time period where the backup speeds are high. This will be one rule.

The second rule will be to perform the steps below if the coefficient of variation is high.

• For each client, check the backup speed
• For clients with lower backup speed, check whether the backup speed is better for the same client in another time period
• If it is better, frame a rule such that the backups for that client are triggered only during the second time period, where the backup speeds were found to be better

Step 5 After a set of rules has been framed in Step 4, the Avamar server backup scheduler schedules the backups using those rules, and the backup speeds are monitored for a (configurable) period of time.

Step 6 After monitoring, the process returns to Step 1 and continues. The ideal is to have a very low coefficient of variation across all categories.
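Steps 1 and 2 can be sketched as follows. The records, client names, and speeds are made up for illustration, and the function names are not part of any Avamar tool; the point is only to show the grouping, the averaging, and the coefficient of variation of the averages.

```python
import statistics
from collections import defaultdict

# Hypothetical records: (client, hour of day, backup speed). Values are made up.
backups = [
    ("client-a", 9, 95.0), ("client-a", 22, 100.0),
    ("client-b", 9, 20.0), ("client-b", 22, 90.0),
    ("client-c", 9, 98.0), ("client-c", 22, 102.0),
]

def average_by(records, key_index):
    """Step 1: average backup speed per category (client or time period)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {key: statistics.mean(speeds) for key, speeds in groups.items()}

def cv_of_averages(averages):
    """Step 2: coefficient of variation across the category averages."""
    values = list(averages.values())
    return statistics.stdev(values) / statistics.mean(values)

by_client = average_by(backups, 0)
by_hour = average_by(backups, 1)
# The outlier is the category with the lowest average speed.
slowest_client = min(by_client, key=by_client.get)
print(by_client, round(cv_of_averages(by_client), 2), slowest_client)
print(by_hour, round(cv_of_averages(by_hour), 2))
```

In this made-up data the averages vary more across clients than across time periods, so the per-client view would be examined first, starting with the slowest client.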

Next, we will look at some simple analytic methods that can be used to analyze backup errors.

• Sort the backup errors by error code. Then look at the error codes that contribute most to the errors and start analyzing those backup failures.
• After the first step, determine whether the majority of backup errors occur for a particular client, time zone, geography, etc. This type of analysis can be done easily if we export the data to an Excel spreadsheet.
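The two error-analysis steps above amount to counting and ranking. A minimal Python sketch, with placeholder error codes and client names that do not correspond to real Avamar codes:

```python
from collections import Counter

# Hypothetical failed-backup records: (error code, client). Codes are placeholders.
failures = [
    ("E100", "client-a"), ("E100", "client-b"), ("E100", "client-b"),
    ("E200", "client-b"), ("E300", "client-c"),
]

def rank(records, field_index):
    """Count failures per value of one field, most frequent first."""
    return Counter(record[field_index] for record in records).most_common()

# First step: which error code contributes most to the failures?
by_code = rank(failures, 0)
top_code = by_code[0][0]
# Second step: among failures with that code, which client fails most often?
by_client = rank([f for f in failures if f[0] == top_code], 1)
print(by_code, by_client)
```

The same ranking can be repeated with time zone or geography as the second field.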

Flow chart

1. Analyze the data using statistical functions, i.e. range/standard deviation
2. Pick the outliers (where standard deviation is greater)
3. Derive hypotheses from the outliers based on time/client/domain, etc.
4. Test the hypotheses

Time series Analysis

This type of analysis helps predict future backup speeds and dataset growth, which helps backup administrators with their planning. Time series analysis also helps predict when storage capacity will be exhausted.

Time series analysis can be done using Excel. First, we will focus on capacity management.

To predict backup speed over a period of time, find the trend of average backup speed over that period. The trend could be decreasing linearly or exponentially. If it is going down, further analysis can determine whether the backup speed has gone down for all clients/time periods or only for a particular set of time periods/clients. Based on that, appropriate action can be taken.
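A linear trend of average backup speed, as discussed in this section, can be estimated with an ordinary least-squares fit against the day index. The sketch below assumes a linear trend and uses made-up daily averages; an exponential trend could be handled the same way after taking the logarithm of the speeds.

```python
def linear_trend(values):
    """Least-squares slope and intercept of values against index 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical average backup speed per day (day 0 to day 6).
daily_speed = [100.0, 98.0, 97.0, 95.0, 92.0, 91.0, 88.0]
slope, intercept = linear_trend(daily_speed)
# Extrapolate the fitted line to day 10.
forecast_day_10 = intercept + slope * 10
print(f"slope={slope:.2f} per day, forecast for day 10={forecast_day_10:.1f}")
```

A negative slope indicates that the average speed is declining, which is the signal to break the data down by client and time period.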


Regression Analysis

Regression analysis is used to find the factors on which a result depends. There will be several independent variables and one dependent variable. In our case, the dependent variable is the result and the factors are the independent variables. The results of interest to the backup administrator could be:

• Backup speed for a client/domain, etc.
• Backup failures in a time period
• Storage usage trend
• CPU usage or average disk I/O

In this section, we will look at results for backup failures, storage usage trend, and CPU usage/disk I/O.

Storage Usage trend

Storage usage depends on:

• Number of clients
• Retention policy
• Time for which the system is up

An equation can be framed like the one below.

Y (storage usage trend) = a + b*no of clients + c*time for which system is up

Excel contains functions to perform this analysis. A sample analysis is shown below:

Input data

Time in days   No of clients   Capacity in GB

1 100 10

2 123 10.5

3 126 10.6

4 129 10.9

5 135 11

6 140 11.5

7 146 11.6

8 151 11.7

9 141 11.8

10 152 12


Regression output

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.991765894
R Square            0.983599588
Adjusted R Square   0.978913756
Standard Error      0.095638612
Observations        10

ANOVA
             df   SS         MS         F          Significance F
Regression   2    3.839973   1.919986   209.9093   5.65E-07
Residual     7    0.064027   0.009147
Total        9    3.904

               Coefficients   Standard Error   t Stat     P-value    Lower 95%   Upper 95%   Lower 95.0%   Upper 95.0%
Intercept      8.378404036    0.530661         15.78863   9.91E-07   7.123591    9.633217    7.123591      9.633217
X Variable 1   0.143690179    0.025118         5.72059    0.00072    0.084295    0.203085    0.084295      0.203085
X Variable 2   0.014827252    0.004855         3.053914   0.018481   0.003347    0.026308    0.003347      0.026308

In the above output, first look at the R Square value. As a rule of thumb, the regression model can be considered a good fit only if this value is greater than 0.8. In other words, the prediction error is low. Here R Square is 0.98.

With X Variable 1 as the number of days and X Variable 2 as the number of clients, the equation would be capacity required = 8.3 + 0.14 * no of days + 0.014 * no of clients.

Now you can predict the storage required if you expect the number of clients to be 200 by the end of 50 days.

The storage required according to the above equation would be 8.3 + 0.14 * 50 + 0.014 * 200 = 18.1GB. Thus, the backup administrator would be able to determine when the capacity might be exhausted and plan accordingly.
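The regression above can also be reproduced outside Excel. The sketch below fits capacity = a + b*days + c*clients to the input data from this article by solving the least-squares normal equations with plain Python; the function name `fit_two_predictors` is chosen for this example, and a numerical library would normally be used instead of hand-written elimination.

```python
# Input data from the "Input data" table in this article.
days = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
clients = [100, 123, 126, 129, 135, 140, 146, 151, 141, 152]
capacity_gb = [10, 10.5, 10.6, 10.9, 11, 11.5, 11.6, 11.7, 11.8, 12]

def fit_two_predictors(x1, x2, y):
    """Ordinary least squares for y = a + b*x1 + c*x2 via the normal equations."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    # Build the 3x3 system (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [p - factor * q for p, q in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]

a, b, c = fit_two_predictors(days, clients, capacity_gb)

# R Square = 1 - residual sum of squares / total sum of squares.
predicted = [a + b * d + c * k for d, k in zip(days, clients)]
mean_y = sum(capacity_gb) / len(capacity_gb)
ss_res = sum((t - p) ** 2 for t, p in zip(capacity_gb, predicted))
ss_tot = sum((t - mean_y) ** 2 for t in capacity_gb)
r_square = 1 - ss_res / ss_tot

# Forecast for 200 clients at the end of 50 days.
forecast = a + b * 50 + c * 200
print(f"a={a:.4f} b={b:.4f} c={c:.4f} R^2={r_square:.4f} forecast={forecast:.2f}GB")
```

The fitted coefficients agree with the Excel output (intercept about 8.378, 0.1437 per day, 0.0148 per client). With these unrounded coefficients the forecast is about 18.5GB; the 18.1GB figure in the text results from the rounded coefficients 8.3, 0.14, and 0.014.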


CPU usage Disk I/O

CPU usage and disk I/O might depend on the following factors:

1. Backed up data per day
2. Number of clients

The equation would be CPU usage = a + b*backed up data per day (in GB) + c*number of clients.

If we follow the steps in the section above under storage usage trend, the administrator will be able to predict the CPU usage or disk I/O usage over a period of time.

This data will be helpful in the scenarios below:

1. Backup speed is decreasing over a period of time
2. Backup failures are increasing

If we see the above symptoms and disk or CPU usage is very high or has increased dramatically, that is a likely cause. Corrective steps can then be taken, e.g. adding capacity (adding more disks will lower the disk I/O load and most likely increase the backup speed).

Visualization

There are a number of tools which can be used to visualize the data we have. One such popular tool is Tableau. Using Tableau software, one can connect to different databases, import data from Excel spreadsheets, and then visualize it. The screenshots below show some of what can be achieved using Tableau.

Tableau enables graphs to be plotted seamlessly from an Excel spreadsheet or any database. Other features of the Tableau software include options to forecast and to calculate variance and standard deviation.


Conclusion

With the set of data analyses shown above, backup administrators can perform the following activities in a better way.

1. Discover why backups are failing and take corrective actions
2. Forecast capacity and budget
3. Report the backup data used by department or by domain

Findings and benefits accrued from these activities can be presented to management in a visual format with tools such as Tableau.




EMC believes the information in this publication is accurate as of its publication date. The

information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION

MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO

THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED

WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an

applicable software license.