7/29/2019 Combining IBM Real-time Compression and IBM ProtecTIER Deduplication
http://slidepdf.com/reader/full/combining-ibm-real-time-compression-and-ibm-protectier-deduplication 1/12
IBM Systems and Technology
Thought Leadership White Paper
July 2011
Combining IBM Real-time Compression and IBM ProtecTIER Deduplication
Benchmark tests show that combining storage optimization technologies achieves compelling results
Contents
Introduction
Landmarks in the data optimization landscape
The need for data optimization in database backup and recovery
The test environment: An overview
Test 1: ProtecTIER deduplication only
Test 2: IBM Real-time Compression and ProtecTIER deduplication
Summary
For more information
Introduction
As the capacity and overhead of powering, cooling and managing
larger amounts of storage continue to outpace the growth
of storage budgets, IT decision makers are increasingly looking
to optimization technologies to meet capacity demands while
minimizing capital expenditures. Recently, two storage optimiza-
tion approaches in particular have been receiving significant
attention in the industry: real-time compression for primary and
secondary data, and data deduplication for highly redundant backup data sets. Although sometimes viewed as mutually
exclusive, the two technologies are, in fact, very complementary.
This paper discusses the compelling financial and operational
advantages of deploying real-time compression and data dedupli-
cation in conjunction, as demonstrated by the results of tests in
which IBM Real-time Compression and IBM® ProtecTIER®
Deduplication solutions were combined to optimize Oracle
database physical backups in a Network File System (NFS)
environment.
The compelling results of combining IBM solutions for real-
time compression and data deduplication in the Oracle database
environment include:
● Greater than 82 percent immediate savings on initial write to disk.
● Greater than 96 percent overall data reduction when combined with deduplication.
● Up to 71 percent reduction in backup time.
● Less CPU utilization on the deduplication engine.
● Less disk activity in the deduplication subsystem.
● Less network traffic on the deduplication backup network.
Figure 1: The data optimization landscape
Detailed test results are included later in this document. The
bottom line is that combining real-time compression and data
deduplication optimizes your overall storage footprint by reduc-
ing data on your primary NAS devices as well as throughout
your data life cycle. By combining these two technologies, you
can achieve maximum data reduction, which maximizes your
return on investment and dramatically improves your data pro-
tection performance and capabilities.
Landmarks in the data optimization landscape
To fully appreciate the benefits of combining real-time compression
and data deduplication, it’s important first to understand
how each technology works, the differences between them, and
where they fit in the overall storage architecture, as shown
in Figure 1.
Real-time compression
Data compression reduces the size of data files so that less space
is required to store them. Real-time compression, as the name
implies, is the ability to compress data in real time—before it is
written to the hard disk rather than after—without any notice-
able performance degradation.
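As a minimal sketch of what compressing data before it is written to disk means, the following Python example compresses a buffer in memory and writes only the compressed bytes. The function names (write_compressed, read_compressed, demo), the sample payload and the use of the standard-library zlib module are assumptions chosen for illustration; they are not the algorithm or the interfaces used by IBM Real-time Compression.

```python
import os
import tempfile
import zlib


def write_compressed(path, data):
    """Compress data in memory, then write only the compressed bytes to disk.

    Returns the number of bytes actually stored.
    """
    compressed = zlib.compress(data, 6)
    with open(path, "wb") as handle:
        handle.write(compressed)
    return len(compressed)


def read_compressed(path):
    """Read the stored bytes back and return the original, uncompressed data."""
    with open(path, "rb") as handle:
        return zlib.decompress(handle.read())


def demo():
    """Store a repetitive sample payload; return (original size, stored size, round trip ok)."""
    payload = b"customer_id,order_id,status\n" * 10000  # hypothetical, highly repetitive data
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "datafile.z")
        stored = write_compressed(path, payload)
        restored = read_compressed(path)
    return len(payload), stored, restored == payload


if __name__ == "__main__":
    original, stored, ok = demo()
    print(f"original={original} bytes, stored={stored} bytes, round trip ok={ok}")
```

The point of the sketch is only the ordering: the size reduction happens on the write path, so the uncompressed form never occupies disk space.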
Designed to sit transparently in front of primary Network
Attached Storage (NAS), IBM Real-time Compression offers the unique advantage of making it possible to shrink primary,
online data in real time with no loss in speed. With over
30 patents, it reduces the size of every file you create by up to
five times, depending upon the file type. It significantly reduces
the physical capacity required to store a file (or copies and per-
mutations of a file) through the entire data life cycle, including
backup. IBM Real-time Compression also has a feature called
the Compression Accelerator that enables the non-disruptive
compression of data that has already been saved to disk—while
applications continue to have random, read-write access to the
data. IBM Real-time Compression can also significantly enhance
overall network and storage performance, since less data is written to disk and more data can be stored in the storage cache.
As more server workloads become virtualized, real-time com-
pression becomes increasingly valuable as a tool for storage
optimization in virtualized environments. The technology works
particularly well given the compression rates associated with
virtualized files (see Table 1). As a result, many companies that
have adopted file virtualization technologies are also exploring
deployment of IBM Real-time Compression in conjunction
with file virtualization. IBM Real-time Compression solutions
transparently integrate with file virtualization solutions and can dramatically extend the cost reductions that file virtualization enables.
File type                          Compression rate
Database                           Up to 85 percent
Microsoft Office                   Up to 20 - 60 percent
VMware VMDK (virtualized files)    Up to 72 percent
CAD/CAM                            Up to 70 percent
Oil and Gas                        Up to 50 percent
Table 1: Compression rates
Data deduplication
Data deduplication is designed to reduce the physical storage
required to store redundant data. The deduplication process
removes duplicate data and replaces it with a pointer to the main
copy, leaving only one copy of the data that actually has to be
stored. This is why it is well suited for backup data where there
are typically multiple data sets (daily/weekly, for example) of
mostly redundant data. The more copies of redundant data you
have, the higher your effective deduplication rate.
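The pointer-based idea described above can be sketched with a small, generic example. This sketch assumes fixed-size chunks indexed by a SHA-256 digest, which is a common textbook approach and is used here only because it is easy to follow; it is not a description of how any IBM ProtecTIER product identifies duplicate data. The class name ChunkStore, its methods and the sample data are hypothetical.

```python
import hashlib


class ChunkStore:
    """Toy deduplicating store: each unique chunk is kept once, and every
    backup is recorded as an ordered list of pointers (chunk digests)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # digest -> chunk bytes, stored once
        self.backups = {}  # backup name -> ordered list of digests

    def add_backup(self, name, data):
        pointers = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if it has not been seen before.
            self.chunks.setdefault(digest, chunk)
            pointers.append(digest)
        self.backups[name] = pointers

    def restore(self, name):
        """Rebuild a backup by following its pointers."""
        return b"".join(self.chunks[d] for d in self.backups[name])

    def nominal_bytes(self):
        """Total size of all backups as the backup application sees them."""
        return sum(len(self.chunks[d]) for p in self.backups.values() for d in p)

    def physical_bytes(self):
        """Bytes that actually have to be stored."""
        return sum(len(c) for c in self.chunks.values())


def demo():
    store = ChunkStore(chunk_size=4)
    monday = b"AAAABBBBCCCCDDDD"
    tuesday = b"AAAABBBBXXXXDDDD"  # one chunk changed since Monday
    store.add_backup("monday", monday)
    store.add_backup("tuesday", tuesday)
    return store, monday, tuesday


if __name__ == "__main__":
    store, _, _ = demo()
    print(store.nominal_bytes(), "nominal bytes,", store.physical_bytes(), "physical bytes")
```

In this toy case the second backup adds only the one changed chunk, which is why repeated backups of mostly unchanged data deduplicate so well.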
IBM ProtecTIER Deduplication solutions feature revolutionary
and patented HyperFactor® data deduplication technology.
They provide enterprise-class performance, scalability and
proven enterprise-level data integrity to meet disk-based data
protection needs while enabling significant infrastructure cost
reductions. They specifically provide improved backup
performance, up to 2000 MB/sec (7.2 TB/hour) sustained
inline deduplication, and even faster restores at up to
2800 MB/sec (10 TB/hour). They also provide:
● The ability to scale to 1 PB of physical storage.
● A reduction in storage capacity consumption of up to 25 times or more.
● A non-hash-based approach that protects data integrity by reducing the risk of data loss due to hash collision.
Technology differences
As described above and illustrated in Figure 1, real-time com-
pression and data deduplication technologies address different
problems and sit at different points in the data life cycle. But
more importantly, the two technologies are complementary;
in particular, deploying real-time compression significantly
enhances the value and performance of data deduplication. This
conclusion has been demonstrated in a series of performance
tests, which are described in detail in this paper.
The need for data optimization in database backup and recovery
In general, backup and recovery refers to the various strategies
and procedures involved in protecting a database against data
loss and allowing for the reconstruction of the database after any
kind of disaster. The performance and reliability of backup and
recovery operations are critical to effective database operation.
Physical backups are backups of the physical files used in storing
and recovering your database, such as data files, control files and
archived redo logs.
Ultimately, every physical backup is a copy of files storing data-
base information to some other location, whether on disk or
some offline storage media such as tape. Backup performed after
a database is properly shut down is called cold database backup.
Conversely, backup performed when a database is online and
fully functional is called hot database backup. During a cold
backup, the database is shut down and unavailable; obviously, any
technology that reduces the period of time that the database is
offline is advantageous. In either case, due to the tremendous
growth in data, it is becoming increasingly difficult for backups
to complete within designated backup windows.
Figure 2: The test environment (components shown: IBM TS7610 ProtecTIER Deduplication Appliance Express, Tivoli Storage Manager, NDMP backup, Gigabit Ethernet switch, Oracle Database 10.2.0.4, three database clients running Quest Benchmark Factory, IBM Real-time Compression, IBM N5600-A10, Brocade 4100, LAN)
Although the benefits of utilizing database backups (either cold
or hot) are clear, their use can result in the creation of large
amounts of data that must reside in the storage environment,
taking up precious disk space and increasing the complexity and
cost of backup procedures. This is why primary storage opti-
mization is so important.
The test environment: An overview
To simulate realistic data storage scenarios, IBM used Quest
Software’s Benchmark Factory and Data Factory to create and
populate an Oracle database running over NFS on an IBM System
Storage® N5600-A10 storage controller. A 37 GB baseline database
was then used to test the effects of data deduplication and
compression. Seven database copies were taken to simulate a
week’s worth of Oracle data sets in an enterprise environment.
Between each copy, a seven percent daily change rate was
simulated through a combination of updates to existing data,
additions of new data, and other database activities such as
delete, drop, create and remove.
Backup using Network Data Management Protocol (NDMP)
The test environment consisted of an IBM TS7610
ProtecTIER Deduplication Appliance Express acting as a
Virtual Tape Library attached to an IBM Tivoli® Storage
Manager server. In such a configuration, the Tivoli Storage
Manager server controls the virtual tape library through a direct
physical connection to the library robotics control port.
(The library robotics, the IBM Tivoli Storage Manager server
and the NAS file server are all connected over Fibre Channel.) For NDMP operations, the drives in the library were connected
directly to the NAS file server, with a path defined from the
NAS head to the virtual drives. The NAS file server transfers
data to the virtual tape drives at the request of the IBM Tivoli
Storage Manager server.
As shown in Figure 3, to allow Tivoli Storage Manager to use
the virtual tape drives for non-NDMP operations, the virtual
tape drives were also connected to the Tivoli Storage Manager
server, with paths defined from the server to the drives.
This configuration also supports an IBM Tivoli Storage
Manager storage agent having access to the virtual tape drives for its LAN-free operations.
Figure 3: NDMP architecture (components shown: Tivoli Storage Manager server, NAS file server, NAS file server file system disks, optional Web client, virtual tape library; legend: SCSI or Fibre Channel connection, TCP/IP connection, data flow, robotics control, drive access)
SAN configuration
All components with Fibre Channel connectivity (Tivoli Storage
Manager server, TS7610 ProtecTIER Deduplication Appliance
Express and IBM System Storage N5600 storage controller)
were connected to a Brocade 4100 SAN switch in the test envi-
ronment. According to Fibre Channel SAN zoning best prac-
tices, five zones were defined, each including one initiator and
one target. Four zones were defined to connect ProtecTIER’s
virtual tape library robotics and its virtual tape drives to the Tivoli Storage Manager server in a redundant manner, in order
to enable Control Path Failover (CPF) and Data Path Failover
(DPF) between the Tivoli Storage Manager server and the
ProtecTIER virtual tape library. Control Path Failover and Data
Path Failover were handled transparently by the IBM Tape
Device Driver for Windows (IBM tape) installed on the Tivoli
Storage Manager server. In addition, a single virtual drive was
zoned to the N series to enable the NAS file server to transfer
data directly to a virtual drive.
N Series configuration
The IBM N5600-A10 storage controller configuration included
28 Fibre Channel drives (144 GB, 15k rpm each) in a RAID-DP
environment. The N5600 operating environment was ONTAP
7.3.3 for the tests.
IBM ProtecTIER Deduplication configuration
A ProtecTIER virtual tape library was defined with two virtual
tape drives, each assigned to one Fibre Channel front-end port.
To enable Control Path Failover, the virtual robot was assigned
to both Fibre Channel front-end ports. ProtecTIER’s LUN
masking feature was used to assign specific virtual devices to a
specific host running backup application modules. This feature
enables multiple initiators to share the same target Fibre
Channel port on the ProtecTIER system without having
conflicts on the devices that are being emulated.
Ten cartridges were defined to store backup data. To limit the
nominal cartridge size, maximum cartridge growth was set to
200 GB. Under these conditions, as soon as a cartridge stores
200 GB of nominal data, it is marked “full” and another cartridge
is used to back up data. Ten virtual tape slots were defined
in the virtual tape library to house the ten virtual tape cartridges.
By default, eight import/export slots were defined.
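The cartridge growth limit described above can be illustrated with a short calculation. The sketch below assumes that backups are written to cartridges in order and that a backup set may span from a full cartridge onto the next one; the paper does not state how data was actually laid out on the virtual cartridges, so this is an assumption, and the function name fill_cartridges is hypothetical.

```python
def fill_cartridges(backup_sizes_gb, cartridge_limit_gb=200.0, cartridge_count=10):
    """Return the nominal GB written to each cartridge, filling them in order.

    Assumption: a backup that does not fit in the space left on the current
    cartridge continues on the next one.
    """
    used = [0.0] * cartridge_count
    current = 0
    for size in backup_sizes_gb:
        remaining = size
        while remaining > 0:
            if current >= cartridge_count:
                raise RuntimeError("out of cartridges")
            free = cartridge_limit_gb - used[current]
            written = min(free, remaining)
            used[current] += written
            remaining -= written
            if used[current] >= cartridge_limit_gb:
                current += 1  # cartridge is marked full; move to the next one
    return used


if __name__ == "__main__":
    # Seven 37 GB backup sets, as in the tests described in this paper.
    usage = fill_cartridges([37.0] * 7)
    print(usage[:3])
```

Under these assumptions, seven 37 GB backup sets (259 GB nominal) fill the first cartridge to its 200 GB limit and place the remaining 59 GB on the second, leaving eight of the ten cartridges unused.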
IBM Real-time Compression configuration
The IBM Real-time Compression Appliance STN6800 (Version
3.7.0) with Gigabit Ethernet ports was used for the tests.
1-Gigabit Ethernet connections were established to the Gigabit
Ethernet switch and the N5600 for connectivity between the
Oracle server and the N5600 storage controller.
Tivoli Storage Manager configuration
A server running IBM Tivoli Storage Manager 5.5.4.3 Extended
Edition was installed using Tivoli Storage Manager’s built-in
configuration wizards. (An Extended Edition license is required
to allow NDMP backups of NAS devices.) The Tivoli Storage
Manager database size was configured to 2048 MB and the log
size was configured to 1024 MB.
To initiate an NDMP backup from the Tivoli Storage Manager server, the backup node command was used to perform a full
backup of the Oracle database files on the N5600 storage con-
troller. A table of contents (TOC) was not created as it is needed
only for single file restore. The backup NAS process in Tivoli
Storage Manager was monitored to measure backup time.
Test 1: ProtecTIER deduplication only
To illustrate the benefits of data deduplication, tests were performed
to deduplicate the seven 37 GB cold backup sets using
the ProtecTIER deduplication appliance only. Deduplication
was performed during the time the data was copied using
NDMP from the N5600 storage controller to the ProtecTIER
deduplication appliance.
Test 2: IBM Real-time Compression and ProtecTIER deduplication
To illustrate the added benefits of using real-time compression
with deduplication, an IBM Real-time Compression STN6800
appliance was installed in front of the IBM N5600 storage
controller. IBM Real-time Compression provided an immediate
footprint reduction of the database file size from 37 GB to
6.6 GB, a reduction of over 82 percent. The introduction of
IBM Real-time Compression provided immediate space savings,
since the compression was performed in real time, when the data
was written to storage. No post processing or configuration
changes were required to realize these savings.
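The 82 percent figure follows directly from the two sizes reported above. As a quick check, the following sketch computes the percentage reduction and the equivalent reduction ratio; the function names percent_reduction and reduction_ratio are hypothetical helpers, and only the 37 GB and 6.6 GB values come from the test.

```python
def percent_reduction(original_gb, reduced_gb):
    """Percentage of the original footprint that was saved."""
    return (1.0 - reduced_gb / original_gb) * 100.0


def reduction_ratio(original_gb, reduced_gb):
    """How many times smaller the reduced footprint is."""
    return original_gb / reduced_gb


if __name__ == "__main__":
    original_gb = 37.0  # database file size before compression
    reduced_gb = 6.6    # database file size after real-time compression
    print(f"{percent_reduction(original_gb, reduced_gb):.1f} percent saved")
    print(f"{reduction_ratio(original_gb, reduced_gb):.1f}:1 reduction ratio")
```

This works out to a saving of roughly 82.2 percent, or a reduction ratio of about 5.6 to 1.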
Clearly, both data deduplication and compression standing alone
offer significant space savings over traditional, non-optimized
storage. However, the benefits of combining these technologies are even more compelling.
Figure 4: Backup times for IBM Real-time Compression combined with ProtecTIER Deduplication, compared to deduplication alone (y-axis: backup time in seconds, 0 to 900; x-axis: day of backup, 1 to 7; series: IBM Real-time Compression with IBM ProtecTIER, and IBM ProtecTIER)
Figure 5: Space used with IBM Real-time Compression combined with ProtecTIER Deduplication, compared to deduplication alone (y-axis: used space in GB, 0 to 25; x-axis: day of backup, 1 to 7; series: IBM Real-time Compression with IBM ProtecTIER, and IBM ProtecTIER)
When the ProtecTIER deduplication solution was used to
back up the data compressed by IBM Real-time Compression, the
seven compressed backup sets were further reduced by an average
of 39 percent. In addition, backup of compressed data took
an average of 68 percent less time than backup in the absence of
IBM Real-time Compression.
Summary
While IBM Real-time Compression and IBM ProtecTIER
Deduplication solutions both offer compelling storage and data
protection benefits when used individually, the combination of
the two technologies has been shown to produce far greater
storage efficiency, significantly reduce backup times, and
improve utilization of resources. Tests involving Oracle
database physical backups have shown that together, these data
compression and deduplication solutions are capable of produc-
ing benefits far exceeding those found with either technology
alone—including 96 percent overall data reduction and up to
71 percent reduction in backup time, as well as lower CPU
utilization on the deduplication engine, less deduplication disk
activity and less deduplication network traffic. These results
demonstrate strong synergies between real-time compression and deduplication and
present a powerful argument for using both in order to achieve
storage optimization.
For more information
To learn more about how IBM Real-time Compression and
IBM ProtecTIER Deduplication solutions can optimize storage
efficiency in your environment, contact your IBM representative
or visit ibm.com/storage/solutions/rtc and
ibm.com/systems/storage/tape/protectier/
Additionally, financing solutions from IBM Global Financing
can enable effective cash management, protection from technol-
ogy obsolescence, improved total cost of ownership and return
on investment. Also, our Global Asset Recovery Services help
address environmental concerns with new, more energy-efficient
solutions. For more information on IBM Global Financing,
visit: ibm.com/financing
© Copyright IBM Corporation 2011
IBM Systems and Technology Group
Route 100
Somers, NY 10589
U.S.A.
Produced in the United States of America July 2011 All Rights Reserved
IBM, the IBM logo, ibm.com and ProtecTIER are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarks are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at ibm.com/legal/copytrade.shtml
Other company, product and service names may be trademarks or service marks of others.
This paper is intended to provide information regarding IBM Real-time Compression Appliance (RTC) in combination with ProtecTIER Deduplication solutions. It discusses findings based on configurations that were created and tested under laboratory conditions. These findings may not be realized in all customer environments, and implementation in such environments may require additional steps, configurations and performance, compression and deduplication analysis. This information does not constitute a specification or form part of the warranty for any IBM or non-IBM products.
Information in this document was developed in conjunction with the use of the equipment specified and is limited in application to those specific hardware and software products and levels.
The information contained in this document has not been submitted to any formal IBM test and is distributed as-is. The use of this information or the implementation of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will be obtained elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.
IBM may not officially support techniques mentioned in this document. For questions regarding officially supported techniques, please refer to the product documentation or announcement letters, or contact IBM Support.
This document could include technical inaccuracies or typographical errors. IBM may not offer the products, services or features discussed in this document in other countries, and the product information may be subject to change without notice. Consult your local IBM business contact for information on the product or services available in your area. Any statements regarding IBM’s future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only. The information contained in this document is current as of the initial date of publication only and is subject to change without notice. All performance information was determined in a controlled environment. Actual results may vary. Performance information is provided “AS IS” and no warranties or guarantees are expressed or implied by IBM. Information concerning non-IBM products was obtained from the suppliers of their products, their published announcements or other publicly available sources. Questions on the capabilities of the non-IBM products should be addressed with the suppliers. IBM does not warrant that the information offered herein will meet your requirements or those of your distributors or customers. IBM disclaims all warranties, express or implied, including the implied warranties of noninfringement, merchantability and fitness for a particular purpose or noninfringement. IBM products are warranted according to the terms and conditions of the agreements under which they are provided.
TSW03093-USEN-00