Backup Methods For a Hot Site
Dieter W. Storr Los Angeles Times
23 August 2005
23 August 2005 Dieter W. Storr -- www.storrconsulting.com
•B/R Methods
•Existing Backup Method
•Experiences
•Mirroring or Replicating
•Fast Copy of Data
•Proposals and Costs
•Future Technology
•Lessons learned
Existing Backup Method
•From disk (databases)
•Copy to
•3490 / 3590-1 / VTS
•Then, copy to
•3590-1 (cartridge)
ADABAS 6.2.2 Back-up at LA Times
Weekly 21:00 - 21:30
ADAPnBKF Online SAVE
ADAPnPLC FEOFPL
ADAPnPLC PLOG Switch
DFDSS Full Volume Back-up
ADAPnBKO Copy Online SAVEs
BRM / ABARS Several Jobs
PDS, GDGs, etc. Disk Pool
2:00 3:00 8:00 - 11:00
Pick-up by Recall
21:30 - 1:15

Job                            3490 tapes (3590-1)
ADAP1BKO                       2 (1)
ADAP2BKO                       35 (1)
ADAP3BKO                       16 (1)
ADAP4BKO                       8 (1)
ADAP5BKO                       4 (1)
                               65 (5)
DFDSS / one tape per volume    59 (?)
BRM / ABARS                    22 (?)
TOTAL                          211 (?) (Only for ADABAS)
Status: 27 Jan 2005
B/R Methods
Source: http://www.drj.com/articles/spr02/1502-07.html
“Companies that relied on
tape or on third-party provider
found in many cases they had
difficulty meeting their recovery
time objectives.”
B/R Methods
Source: 15 Apr 2004 | SearchSecurity.com
“Flaws in tape-based data backup may be
leaving enterprises without key information
and could lead to legal exposure under
emerging laws such as Sarbanes-Oxley, say
data backup and recovery experts.”
B/R Methods
A survey of 500 IT departments completed … found that:
•As many as 20% of routine, nightly backups fail to capture all data.
•40% of IT managers had been unable to recover data from a tape when they needed it.
•More than 23% sought to use data stored on tape backups more than 20 times in a year.
Source: 15 Apr 2004 | SearchSecurity.com
B/R Methods
Are tapes really so bad?
LA Times experiences?
Tape Problems
1 November 2002:
Six tape drive errors
Delay
Tape Problems
24 March 2003:
Only two channel paths per
tape controller were provided
Slow restore time
Tape Problems
5 October 2003:
3590 tape drives were not
defined to DFSMS (SMS)
ADABAS restore and
application test cancelled
Tape Problems
6 December 2003:
VTS problems with GDG
datasets
End-user functions
couldn’t be tested
Tape Problems
5 August 2004:
Restore jobs had to wait for an input
tape that was being used by another
restore job
Delay
Tape Problems
30 October 2004:
Packages didn’t arrive in time,
due to a thunderstorm that
affected FedEx delivery
Major delay
Tape Problems
30 October 2004:
Automated tape library experienced
unit address problems during the
restore process
Delay
Tape Problems
30 October 2004:
VTS logical tapes were not shipped
to Wood Dale (HSM level 2, SAR
level 2)
Delay
Tape Problems
30 October 2004:
Confusion about
when to load DRP1
and DRP2 tapes,
before or after IPL
Delay
Tape Problems
30 October 2004:
ICIS libraries were not
backed up to tape
Application tests were not
possible
Tape Problems
8 December 2004:
Load problems
Tapes were loaded before IPL and
not after IPL
Major delay
Tape Problems
8 December 2004:
Experienced problems when
trying to restore MIG1 data,
e.g. DRADABC0 job
Major delay
Tape Problems
8 December 2004:
Recall sent the tapes to SunGard by FedEx
One damaged package arrived without tapes
Restored DATA one generation back (-1)
System was generation (0)
Tape Problems
21 March 2005:
Level 2 tapes for VTS were not
sent off-site (although they
were on the list)
Application team couldn’t
test all data
Tape Problems
5 August 2005:
3590-1 cartridges ejected, not found
DSS8370W - TMS SHOWS TAPE N00318 OUT OF AREA “DRP1”,SLOT 00031
Delay
Time Warner employee data missing
May 2, 2005: 5:51 PM EDT
NEW YORK (CNN) - Time Warner Inc. said Monday that data on 600,000 current and former employees stored on computer backup tapes was lost by an outside storage company and that the Secret Service is now investigating.
Lost Backup Tape Held Ameritrade Client Data
Wednesday, April 20, 2005 - LA Times
… package was damaged during shipping between vendors ….. fourth tape is still missing…… The tapes may have included customers’ Social Security numbers …..
Info On 3.9M Citigroup Customers Lost
Monday, June 6, 2005 – CNN.COM
Citigroup, the nation's biggest financial services company, said that UPS lost the tapes while shipping them to a credit bureau in Texas.
Costs for Tape Backups
•SunGard recovery services
•Offsite tape storage
•Tape handling
•Shipping per test
•Special extra pick-ups
Yearly $150,000
Costs
Not able to restore one day: $$ ???
Last December: 2 weeks to manually (?) rebuild customer tables
Does it make sense to restore more than 2 days back ??
Costs
Example:
20 employees x $140 per day x 10 days
= $28,000
And they couldn’t work on other projects
$140 is based on $51,100 yearly income
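The arithmetic in this example can be checked with a short sketch. The function names are hypothetical, and the 365-day divisor is an assumption inferred from the slide's figures ($51,100 / 365 = $140).

```python
def daily_rate_from_salary(yearly_income: float, days_per_year: int = 365) -> float:
    """Daily rate derived from a yearly income (365-day divisor is an assumption)."""
    return yearly_income / days_per_year


def manual_rebuild_cost(employees: int, daily_rate: float, days: int) -> float:
    """Labor cost of a manual rebuild: employees x daily rate x days."""
    return employees * daily_rate * days


rate = daily_rate_from_salary(51_100)      # 140.0
cost = manual_rebuild_cost(20, rate, 10)   # 28000.0
print(f"${rate:,.0f} per day, ${cost:,.0f} total")
```

The figure covers wages only; as the slide notes, it does not count the work those employees could not do on other projects.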
Quantitative Risk Analysis: Single Loss Expectancy
SLE = Single Loss Expectancy
EF = Exposure Factor, for example 50% or .50
AV = Asset Value, for example $1,000,000
SLE = EF * AV
SLE = .5 x $1,000,000 = $500,000
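As a minimal sketch, the SLE formula can be written as a small function. The function name and the range check on the exposure factor are illustrative additions, not part of the original slide.

```python
def single_loss_expectancy(exposure_factor: float, asset_value: float) -> float:
    """SLE = EF * AV, with EF given as a fraction between 0 and 1."""
    if not 0.0 <= exposure_factor <= 1.0:
        raise ValueError("exposure factor must be between 0 and 1")
    return exposure_factor * asset_value


sle = single_loss_expectancy(0.5, 1_000_000)
print(f"SLE = ${sle:,.0f}")  # SLE = $500,000
```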
B/R Methods
Reducing tapes
B/R Methods
Reducing tapes
•Stacking datasets to 3590-1 cartridges
•Using Delta Save Facility from ADABAS
B/R Methods
Reducing tapes
•Using Forward Index Compression (FIC) from ADABAS
•Using larger block size for 3590 tapes = 256K, supported by ADABAS
Delta Save Facility (DSF)
[Figure: Delta Save. Diagram labels: NUCLEUS (DSF=YES), Buffer Pool, ASSO, DATA, Delta Log (DLOG) of changed RABNs, Dual Protection Log (DDPLOGR1, DDPLOGR2), ADARES PLCOPY (DSF=YES), PLOG copy (DDSIAUS1), extracted blocks, DSIM (DDDSIM), ADASAV SAVE DELTA (DSF=YES), Delta Save of changed blocks (DDSAVE1).]
Delta Save Facility
[Figure: restore with Delta Save. Diagram labels: ADASAV RESTORE (DSF=YES), Full Image Save Online/Offline (DDREST1), Delta Save (DDDELT1-8), DSIM (DDDSIM), online images, RABN extracted from PLOG, ASSO, DATA.]
B/R Methods
Forward Index Compression
Rochester Gas & Electric
Space savings:
•Normal Index: 37% - 55%
•Upper Index: 21% - 69%
Within an index block the part of the index value that is identical to the forward part of the previous index value is suppressed.
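The idea of suppressing the shared leading part of consecutive index values can be illustrated with a generic front-coding sketch. This is not the ADABAS index block format; the function names and the sample values are hypothetical and serve only to show the principle.

```python
from typing import List, Tuple


def front_compress(sorted_values: List[str]) -> List[Tuple[int, str]]:
    """Encode each value as (length of prefix shared with previous value, remaining suffix)."""
    encoded = []
    previous = ""
    for value in sorted_values:
        shared = 0
        limit = min(len(previous), len(value))
        while shared < limit and previous[shared] == value[shared]:
            shared += 1
        encoded.append((shared, value[shared:]))
        previous = value
    return encoded


def front_decompress(encoded: List[Tuple[int, str]]) -> List[str]:
    """Rebuild the original values from (shared length, suffix) pairs."""
    values = []
    previous = ""
    for shared, suffix in encoded:
        value = previous[:shared] + suffix
        values.append(value)
        previous = value
    return values


index_values = ["MILLER", "MILLS", "MILNE", "MILTON"]  # hypothetical sorted index values
encoded = front_compress(index_values)
print(encoded)  # [(0, 'MILLER'), (4, 'S'), (3, 'NE'), (3, 'TON')]
```

The more similar neighbouring values are, the more characters are suppressed, which is consistent with savings that vary widely from index to index.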
B/R Methods
IBM Magstar 3494 / Virtual Tape Server (VTS)
LA Times
SunGard
B/R Methods
VTS problems
LA Times:
•Completion code A78 RC 18
•We switched from VTS to 3590-1 cartridges
B/R Methods
VTS problems
Virginia Information Technologies Agency:
•Ran into the same problem in 2003/2004: system completion code A78 RC 18
•We … converted … to 3490/3590 physical tapes
•Problem solved
B/R Methods
Disk to Disk
•Mirroring: Hardware, Software
•Replicating: Software
B/R Methods – Enterprise Server
Enterprise Server
UNIX
NT / 2000 / XP
Hot Site
B/R Methods – Open System
Hot Site
B/R Methods
Marty Stewart
Disaster Recovery Manager
AnMed Health:
“…we’d rather have a server that’s running slower than having no server at all.”
Disk Mirroring
Benefits
•Asynchronous disk mirroring can provide better physical protection by supporting extended physical distances.
•No loss of committed transactions in synchronous storage (mirroring/RAID) on a CPU failure.
[Figure: ASSO and DATA mirrored to a second ASSO and DATA]
Disk Mirroring
Limitations
•No protection from data corruption.
•Secondary site is not guaranteed to be transactionally consistent in the case of asynchronous mirroring.
•Client applications must be restarted after a failure and need to be aware of the failure.
Disk Mirroring
Limitations
•Synchronous mirroring and RAID devices can add overhead to application performance.
•Redundant/specialized high availability hardware/software can be expensive and restricted to use for backup purposes only.
Disk Mirroring
Limitations
•Secondary copy of data is not available for use – low hardware utilization.
•Need to replicate everything on disk; no selectivity of data replication.
Example For Disk Mirroring
[Figure: Main Platform (S/390, UNIX, EMC 5700) and Back Up / Hot Site (S/390, UNIX, EMC 5700), 12-15 miles apart, connected by an OC-3 link; both sides labeled "SRDF remote mirrored, synchronized".]
B/R Methods
Can we buy used Enterprise Servers?
•Yes … and inexpensive
•Operating system is free for D/R
Search for “selling used mainframes,” for example:
http://www.used-line.com/fdc3236-find-dealer.htm
http://www.azure.co.uk/
etc.
Dedicated line broadband speeds and prices
T-1 - 1.544 megabits per second (24 DS0 lines) Ave. cost $400.-$650./mo.
T-3 - 44.736 megabits per second (28 T1s) Ave. cost $6,000.-$16,000./mo.
OC-3 - 155 megabits per second (100 T1s) Ave. cost $20,000.-$45,000./mo.
OC-12 - 622 megabits per second (4 OC3s) no price
OC-48 - 2.5 gigabits per second (4 OC12s) no price
OC-192 - 9.6 gigabits per second (4 OC48s) no price
Source: http://www.infobahn.com/research-information.htm
prices updated: 12 May 2005
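To relate these line speeds to backup volumes, a rough sketch can estimate how long a given amount of data takes to cross each link. The 100 GB volume is a hypothetical figure, not one from the presentation, and the calculation assumes the full nominal line rate is available (no protocol overhead, compression, or contention).

```python
# Nominal line rates in megabits per second, taken from the list above.
LINE_SPEEDS_MBPS = {"T-1": 1.544, "OC-3": 155.0, "OC-12": 622.0}


def transfer_hours(data_gigabytes: float, line_mbps: float) -> float:
    """Hours needed to move `data_gigabytes` (decimal GB) at `line_mbps` megabits per second."""
    bits = data_gigabytes * 8 * 1000**3
    seconds = bits / (line_mbps * 1_000_000)
    return seconds / 3600


DATA_GB = 100  # hypothetical daily change volume
for name, mbps in LINE_SPEEDS_MBPS.items():
    print(f"{name}: {transfer_hours(DATA_GB, mbps):.1f} hours for {DATA_GB} GB")
```

Real throughput is lower than the nominal rate, so the results are best read as lower bounds on transfer time.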
Peer-to-Peer Remote Copy Extended Distance (PPRC-XD)
PPRC = 60 miles - PPRC-XD = continent
[Figure: two ESS Shark units, FlashCopy]
- IBM ESS DASD
- HDS also supports PPRC
Also see TimeFinder from EMC
External Back-up Systems
Fast Copy of Data
•Snapshot
 - No data movement
 - A virtual copy by copying pointers
•Copy Process
 - Physical copy, asynchronous from the logical copy
 - No impact on applications working on the original data
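A toy model can show why a snapshot made by copying pointers involves no data movement. This is an illustrative sketch only, not how FlashCopy or TimeFinder are implemented; the `Volume` class and its methods are hypothetical.

```python
from typing import Dict, List, Optional


class Volume:
    """Toy volume: a pointer table that maps logical blocks to stored block contents."""

    def __init__(self, blocks: List[bytes]) -> None:
        self._store: Dict[int, bytes] = {}   # physical block id -> content
        self._next_id = 0
        self.pointers = [self._allocate(block) for block in blocks]

    def _allocate(self, data: bytes) -> int:
        block_id = self._next_id
        self._next_id += 1
        self._store[block_id] = data
        return block_id

    def snapshot(self) -> List[int]:
        # Only the pointer table is copied; no block content is moved.
        return list(self.pointers)

    def write(self, index: int, data: bytes) -> None:
        # New content goes to a new physical block, so a snapshot keeps seeing the old one.
        self.pointers[index] = self._allocate(data)

    def read(self, index: int, pointers: Optional[List[int]] = None) -> bytes:
        table = self.pointers if pointers is None else pointers
        return self._store[table[index]]


volume = Volume([b"ASSO-v0", b"DATA-v0"])
snap = volume.snapshot()
volume.write(1, b"DATA-v1")
print(volume.read(1))        # b'DATA-v1' (live volume)
print(volume.read(1, snap))  # b'DATA-v0' (snapshot view)
```

The physical copy can then be taken from the snapshot view at leisure while applications continue to update the live volume.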
External Back-up Systems
Fast Copy of Data
•Specific hardware required
 - Software works only with the hardware
•Works on volume level
 - Some snapshot-only tools also work on dataset level
Snapshot & Physical Copy
IBM
•Hardware: Enterprise Storage Server
•Software: FlashCopy
http://www.share.org/proceedings/sh98/data/S3087.PDF
EMC2
•Hardware: Symmetrix Remote Data Facility
•Software: EMC TimeFinder
http://www.emc.com/interactive_center/media/timefinder/tf_noRC.html
Flash Copy
How It Works
[Figure: diagram labels: Source Data (read / update), Suspend, snap, Resume, pre-defined time window, Snapshot (read only), Physical Backup.]
Read only: update requests are queued
Source: SAG
ADADBS TRANSACTIONS SUSPEND,TTSYN=60,TRESUME=120
Replication
Benefits
•Warm standby systems can be configured over a Wide Area Network, providing protection from site failures.
•Ability to more quickly swap to the standby system in the event of failure, as backup database is already on-line.
•Data corruption is typically not replicated as transactions are logically reproduced rather than I/O blocks mirrored.
[Figure: ASSO, DATA and WORK replicated to a second ASSO, DATA and WORK]
Replication
Benefits
•Automatic switch over for clients using a switching mechanism, no client restart needed.
•Originating applications are minimally impacted as replication takes place asynchronously after commit of the originating transaction.
•The warm standby database is available for read-only operations, allowing better utilization of backup systems.
Replication
Benefits
•Ability to resynchronize and easily switch back to primary system when it becomes available without loss of data.
Replication
Limitations
•Warm standby system will be out-of-date by transactions committed at the active database that have not been applied to the standby.
•Protection is limited to components supporting Warm Standby (e.g. DBMS data sources may be protected but file systems may not be supported).
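The first limitation, a standby that lags by the transactions not yet applied, can be made concrete with a small sketch of asynchronous replication. The `Primary` and `Standby` classes are hypothetical and do not represent any Software AG product interface.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class Primary:
    """Active database: commits locally first, then queues the change for the standby."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.outbound: Deque[Tuple[str, str]] = deque()  # committed but not yet applied remotely

    def commit(self, key: str, value: str) -> None:
        self.data[key] = value
        self.outbound.append((key, value))


class Standby:
    """Warm standby: applies queued transactions some time after they were committed."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def apply(self, primary: Primary, max_transactions: int) -> int:
        applied = 0
        while primary.outbound and applied < max_transactions:
            key, value = primary.outbound.popleft()
            self.data[key] = value
            applied += 1
        return applied


primary, standby = Primary(), Standby()
for number in range(3):
    primary.commit(f"order-{number}", "booked")

standby.apply(primary, max_transactions=2)   # the link keeps up with only two of three
lag = len(primary.outbound)
print(f"standby is {lag} transaction(s) behind")  # standby is 1 transaction(s) behind
```

If the primary failed at this point, the queued transaction would be missing on the standby, which is the exposure the slide describes.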
Entire Transaction Propagator
•The Entire Transaction Propagator allows for asynchronous data replication.
•Replicated data can be updated and synchronized with master data at user specified intervals.
ADABAS Data Replication
•Logical dissemination of ADABAS Data to homogeneous or heterogeneous targets
•Near real time propagation
•Event driven at the Transaction level
•Implemented at the Database/file level for Store, Delete and Update commands
•Define Replication rules through subscriptions
•Minimal Impact on normal nucleus activity
•Strategic for Enterprise Data Sharing
•Replace Entire Transaction Propagator
ADABAS Data Replication
[Figure: diagram labels: Origin DBMS, File, Target File, Target Field, Target DBMS, Target Table, z/OS Image A, z/OS Image B, z/OS Image C, Unix Server D.]
Possible Hot Site Solutions
[Figure: Enterprise Server Los Angeles linked to Own Enterprise Server Hot Site. Storage pairs shown: Shark to Shark, Shark to EMC, EMC to EMC, each over OC3. Other labels: Converter, ESCON, FICON, Fiber Optic.]
Costs for Tape Backups
•SunGard recovery services
•Offsite tape storage
•Tape handling
•Shipping per test
•Special extra pick-ups
Yearly $150,000
Costs for Real Disaster
•SunGard Declaration Fee
•D/R Site Daily Usage Fee
•Office Space Daily Usage Fee
•Work Group Declaration Fee
•Work Group Daily Usage Fee
•LAN Bridge Declaration Fee
•LAN Bridge Daily Usage Fee
30 Days $475,000
Costs for Own Hot Site
•Used IBM Z800-0X2 Mainframe
•Used IBM 2105-F20 Shark Storage
•Used IBM 3494 Library, VTS and Tape Drives
•3rd Party Next Day HW Maintenance
•Printer and Terminal Controller Re-location Costs
•3490 Tape Drive and Controller Re-location Costs
•Other Costs
1st Year $520,000
After 5 Years Total $735,000
Costs for Own Hot Site
5 Years SunGard = $750,000
30 Days Real Disaster = $475,000
5 Years Own Facility = $735,000
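The comparison can be reproduced with a few lines of arithmetic. The constant and function names are hypothetical, and treating the 30-day disaster fees as a charge on top of the yearly SunGard cost is an assumption made for illustration.

```python
SUNGARD_YEARLY = 150_000          # yearly cost for tape backups with SunGard
REAL_DISASTER_30_DAYS = 475_000   # SunGard fees for a 30-day real disaster
OWN_SITE_FIVE_YEARS = 735_000     # own hot site, total after 5 years


def sungard_total(years: int, declared_disasters: int = 0) -> int:
    """Yearly cost over `years`, plus 30-day disaster fees (assumed additive)."""
    return SUNGARD_YEARLY * years + REAL_DISASTER_30_DAYS * declared_disasters


no_disaster = sungard_total(5)       # 750,000
one_disaster = sungard_total(5, 1)   # 1,225,000
print(f"SunGard, 5 years, no disaster:  ${no_disaster:,}")
print(f"SunGard, 5 years, one disaster: ${one_disaster:,}")
print(f"Own facility, 5 years:          ${OWN_SITE_FIVE_YEARS:,}")
print(f"Difference without a disaster:  ${no_disaster - OWN_SITE_FIVE_YEARS:,}")
```

Under these assumptions the two options are close over five years when no disaster is declared, and a single declared disaster tips the balance clearly toward the own facility.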
Restore Times (Min)
Date     IPL done   DB restore
Feb-02   0          229
Jun-02   330        129
Nov-02   425        165
Mar-03   540        221
Oct-03   365        0
Dec-03   420        104
Oct-04   420        63
Dec-04   330        90
Mar-05   270        78
Aug-05   360        136
Benefits of Own Hot Site
•Financial savings > $150,000 annually providing an almost 5% ROI
•Reduced recovery time
•Reduced impact due to road and airport closures
•Elimination of reliance on external vendors
•Mainframe and open system can use the same facility
Grid Computing
virtual machine
virtual memory
virtual storage
virtual I/O
Grid Computing
[Figure: diagram labels: User Interface, Passkey, Resource Broker, Replica Catalog, Information Service, Computer Element(s), Storage Element(s), Hardware, Software, Locations.]
Grid Computing
Grid Computing
BBC builds distributed grid for content sharing (Gridcast).
62.73.167.57/publicdocs/ppt/prompeg307.ppt
Grid Computing For Backup?
Intra or Extra Grid?
Pull or Push?
Grid Software
http://www.gridcomputing.com/
http://www.gridforum.org/ggf_grid_understand.htm
http://gridcafe.web.cern.ch/gridcafe/animations.html
Backup Methods
Mostly used by other companies
Source: DRJ Magazine
•VTS
•Disk to disk
Is more and more common for enterprise storage servers and AIX server technology, for example.
Source: @server Magazine
B/R Methods
Problems for other companies
•High third-party hot site costs, approx. $10,000 - $70,000 per month
•Restore time 24-30 hours
How Far is ‘Far Enough?’ (http://www.drj.com/articles/spr03/1602-02.html)
[Figure: Alternate Facility and Offsite Storage Facility]
Answer = 105 miles
…so the survey
Lessons Learned (http://www.drj.com/articles/spr02/1502-07.html)
•Distance is key
 - Streets, bridges, tunnels, airports are closed
•Tape recovery is not effective
•All applications are critical
•Inconsistent back-up is no back-up at all
Lessons Learned (http://www.drj.com/articles/spr02/1502-07.html)
•People-dependent processes do not suffice
•Two sites are not enough
•People are hard to replace but information is irreplaceable
…..we should have an excellent HOT SITE!