
Lab Validation Report NEC HYDRAstor

Self Evolving, Extremely Scalable, Capacity Optimized, Grid Storage

By Tony Palmer and Ginny Roth

June 2012 © 2012, Enterprise Strategy Group, Inc. All Rights Reserved.


Contents

Introduction
    Background
    Introducing NEC HYDRAstor Grid Storage Platform

ESG Lab Validation
    Scalable Performance
    Global Deduplication and Compression
    High Availability and Non-disruptive Upgradeability

ESG Lab Validation Highlights

Issues to Consider

The Bigger Truth

Appendix

All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of the Enterprise Strategy Group, Inc., is in violation of U.S. Copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.

ESG Lab Reports

The goal of ESG Lab reports is to educate IT professionals about data center technology products for companies of all types and sizes. ESG Lab reports are not meant to replace the evaluation process that should be conducted before making purchasing decisions, but rather to provide insight into these emerging technologies. Our objective is to go over some of the more valuable feature/functions of products, show how they can be used to solve real customer problems and identify any areas needing improvement. ESG Lab's expert third-party perspective is based on our own hands-on testing as well as on interviews with customers who use these products in production environments. This ESG Lab report was sponsored by NEC.


Introduction

This report documents the results of ESG Lab hands-on testing of the NEC HYDRAstor scale-out grid storage platform with a focus on performance and capacity scalability, ease of management, capacity optimization, and data resiliency.

Background

Unstructured (file) data has been growing faster than e-mail- and database-driven data for some time. Richer file formats, ubiquitous photo and video, online communities, collaboration tools, 3-D modeling, and 4-D imaging are just a few of the reasons for this. Organizations dealing with extensive and ever-growing volumes of disk-based backup and long-term archive data struggle with similar problems; they need to accommodate growth efficiently, manage it simply, and access it quickly.

Responding to economic conditions, businesses are emphasizing CAPEX and OPEX reductions more than ever. File growth has resulted in higher costs in terms of storage infrastructure, complex management, data center floor space and power consumption. As a result, the ability to scale out—that is, independently scale and tune bandwidth, processing, and storage capacity on the fly while managing a single, global namespace—is extremely popular for increasing efficiency and saving money.

Adoption of scale-out grid storage solutions is driven by their ability to address multiple challenges. As seen in Figure 1, ESG survey respondents report selecting scale-out storage to achieve faster provisioning, better scalability with easier management, improved performance of both IO and throughput, and higher data availability.1

Figure 1. Scale-out Storage Drivers

[Bar chart. Question: "Which of the following considerations drove the adoption of, or is driving the interest in, scale-out storage for your organization?" (Percent of respondents, multiple responses accepted). Series: Currently using scale-out storage (N=56); Plan to use scale-out storage (N=122). Categories: Need to support specific applications; Improved data management; Improved storage hardware utilization; Reduced operating expenditures; Improved data availability; Improved performance (I/Os); Easier to manage; Faster storage provisioning times; Lower cost of infrastructure; Improved performance (throughput); Improved scalability. Horizontal axis: 0% to 60%.]

Source: Enterprise Strategy Group, 2010.

1 Source: ESG Research Report, Scale-Out Storage Market Trends, December 2010.


Introducing NEC HYDRAstor Grid Storage Platform

HYDRAstor is NEC’s distributed grid storage platform designed to minimize cost and obsolescence risk by modernizing archive storage infrastructure to support data for the long-term. NEC calls HYDRAstor “storage for the next 100 years”, built to address global backup and archive requirements while avoiding the cost, complexity, risks and operational limitations of expensive primary storage, limited scale-up NAS, tape, and specialized single purpose backup and archive appliances.

Figure 2. NEC HYDRAstor Scale-Out Grid

The NEC HYDRAstor distributed grid architecture is designed to store data for the very long term. DynamicStor automated storage management software is combined with extremely robust industry standard servers to enable HYDRAstor to deliver simple management, unrestricted performance and capacity scalability, capacity optimization, and enhanced data resiliency. DynamicStor provides automated load balancing, dynamic capacity allocation and “zero provisioning” to automatically manage the numerous sub-tasks associated with provisioning storage.

HYDRAstor is designed to enable extreme scaling: performance up to and beyond 300 TB/hour and effective capacity greater than 20PB in a single grid. As a result, very large archives can be accessed, managed, and protected from a single interface. Features include:

Non-disruptive multi generational scalability–HYDRAstor's modular structure accommodates growth without downtime. Support for any generation of HYDRAstor node in a cluster enables IT to upgrade in place with no disruption to the business.

Non-disruptive workload balancing–Workloads are automatically spread across storage resources to avoid overworking individual drives, reduce hotspots, and enhance performance.

Automated administration–Managing a single pool of capacity using an intuitive graphical interface and rich set of command line interfaces reduces administrative cost and complexity.

Enterprise-class software features–HYDRAstor supports valuable enterprise-class storage software capabilities including remote replication, snapshots, and automatic online migration.

Predictable performance at scale–Node level granular scalability enables predictable, near-linear performance and capacity scalability across the entire grid, from one node to 165 nodes.


ESG Lab Validation

ESG Lab performed hands-on evaluation and testing of NEC HYDRAstor Grid Storage at NEC’s facilities in Santa Clara, California. The objective was to examine the performance scalability, effective deduplication and high availability of the HYDRAstor solution. The test bed consisted of one NEC Express5800/A1080a-D (GX) server with two partitions connected via 10Gb Ethernet to a HYDRAstor HS8 system consisting of six Accelerator Nodes (ANs) and eight Storage Nodes (SNs) with global inline deduplication and compression enabled. Symantec NetBackup 7.1 was used to test backup performance. An NEC general purpose (GP) server was used for additional testing.

Figure 3. ESG Lab Test Bed

Scalable Performance

Achieving high throughput performance is a key component of effective data protection for large data sets as the window to complete backups is generally limited to the number of off-peak hours in a business day and does not grow as data sets grow. HYDRAstor’s OpenStorage Express I/O uses an API from Symantec NetBackup to reduce protocol overhead and maximize data throughput. To reach maximum throughput with minimal server footprint, NEC combines its powerful GX servers with HYDRAstor to scale media server performance.

When the HYDRAstor is combined with NEC’s GX server, users routinely back up dozens of terabytes per hour. The GX servers perform at a level achieved by only a few other enterprise-class devices. The GX/HYDRAstor combination can scale up to 80 cores in a single server and 165 nodes of storage and acceleration, offering petabytes/hr of throughput.

ESG Lab Testing

ESG Lab began performance testing by comparing HYDRAstor’s OpenStorage Express I/O with standard NFS. One general purpose (GP) server was connected to an AN. ESG Lab used four 80GB datasets, creating four separate data streams, to drive throughput to HYDRAstor. A backup was initiated and results were observed using the HYDRAstor GUI and confirmed against the throughput reported by the operating system for each Ethernet port and by NetBackup for each stream. ESG Lab recorded a maximum throughput of 182 MB/sec for the NFS backups and 491 MB/sec using Express I/O. This is roughly 2.7 times the NFS throughput (an improvement of about 170%) using the same source disks, media servers, and 10GbE connectivity. Figure 4 shows the results of the two throughput tests. It's important to note that performance was not limited by HYDRAstor but by the capability of the GP server and the overhead of the NFS protocol.
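The relationship between the two measured rates can be checked with a few lines of Python. This is only an arithmetic sketch using the 182 MB/sec and 491 MB/sec figures reported above; the function name is illustrative and not part of any NEC or Symantec tool.

```python
def throughput_gain(baseline_mb_s, improved_mb_s):
    """Return (speedup factor, percent improvement) of improved over baseline."""
    factor = improved_mb_s / baseline_mb_s
    percent_improvement = (factor - 1.0) * 100.0
    return factor, percent_improvement


# Maximum throughput values reported for the two tests.
nfs_mb_s = 182.0
express_io_mb_s = 491.0

factor, percent_improvement = throughput_gain(nfs_mb_s, express_io_mb_s)
print(f"Express I/O ran at {factor:.2f}x the NFS rate "
      f"({percent_improvement:.0f}% faster)")
```

Expressing the result both as a multiple and as a percentage increase avoids confusing "2.7 times as fast" with "270% faster".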

Figure 4. Express I/O Compared to NFS

The Express I/O test was followed by a look at throughput using the Express5800 GX server. The GX server was configured as two partitions, that is to say, two separate servers inside the same chassis, each running one NetBackup 7.1 media server. Twenty-four data sets of 160GB each were mounted by each media server, for 48 total data sets. Eight streams were directed to each AN on the HYDRAstor system.

ESG Lab executed a test with six ANs, using both partitions of the single NEC GX server and observed a maximum throughput of 5,127 MB/sec or 18.5TB/hr of backup performance, as shown in Figure 5. It's important to note that this result was obtained with just one GX server and HYDRAstor sustained consistent performance above 5,000 MB/sec for the duration of the test.
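The conversion from MB/sec to TB/hr, and what it implies for a backup window, can be sketched as follows. Decimal units (1 TB = 1,000,000 MB) are an assumption, and the 100 TB data set is a hypothetical example that does not come from the test.

```python
SECONDS_PER_HOUR = 3600
MB_PER_TB = 1_000_000  # assumption: decimal units, 1 TB = 1,000,000 MB


def mb_per_sec_to_tb_per_hour(mb_per_sec):
    """Convert a sustained rate in MB/sec to TB/hr."""
    return mb_per_sec * SECONDS_PER_HOUR / MB_PER_TB


def hours_to_back_up(dataset_tb, mb_per_sec):
    """Hours needed to move dataset_tb at a sustained rate of mb_per_sec."""
    return dataset_tb / mb_per_sec_to_tb_per_hour(mb_per_sec)


# Peak rate observed with six ANs and one GX server.
peak_tb_per_hour = mb_per_sec_to_tb_per_hour(5127)

# Hypothetical 100 TB full backup at that rate (illustrative only).
window_hours = hours_to_back_up(100, 5127)

print(f"{peak_tb_per_hour:.1f} TB/hr; 100 TB would take {window_hours:.1f} hours")
```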

Figure 5. Massive Performance Results with HYDRAstor and One GX Server

Next, ESG Lab tested for scalability of throughput. Six backups were executed in total, the first five using the NEC GX server and the sixth adding a single GP server to drive additional throughput beyond the maximum of the single GX server configuration in the lab. Results were measured using the HYDRAstor GUI, again confirmed against throughput reported by the OS and NetBackup for each backup stream. As shown in Figure 6, throughput was tracked as the number of ANs was scaled from one to six. Average throughput increased from 881 MB/sec on one AN to 5,370 MB/sec with six ANs, an average of 895 MB/sec per AN.
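One way to quantify how close this is to linear scaling is to compare the six-AN result with six times the single-AN result. The sketch below uses only the two endpoint values reported here; the intermediate data points appear only in Figure 6 and are not reproduced.

```python
# Average throughput (MB/sec) reported for one AN and for six ANs.
one_an_mb_s = 881.0
six_an_mb_s = 5370.0
an_count = 6

# Average contribution of each AN in the six-AN run.
per_an_mb_s = six_an_mb_s / an_count

# Ratio of measured throughput to ideal linear scaling from one AN.
scaling_efficiency = six_an_mb_s / (an_count * one_an_mb_s)

print(f"Per-AN average: {per_an_mb_s:.0f} MB/sec")
print(f"Scaling efficiency vs. one AN: {scaling_efficiency:.1%}")
```

An efficiency at or slightly above 100% is consistent with the linear scaling described, given normal run-to-run variation.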

Figure 6. Linearly Scalable Throughput

Testing stopped at six ANs only because that was all the hardware available for these tests. HYDRAstor can grow to a combined 165 nodes and, based on these observed results, ESG Lab is confident throughput can continue to scale as resources are added. The total of 165 nodes can be a mixture of SNs and ANs, tuned for particular environments.

Why This Matters

Predictable performance scalability is a critical concern when data protection windows are finite but data is growing at alarming rates. Maximizing throughput to allow backups to complete is imperative to stay ahead of the growing data problem. Consolidating multiple backup servers into an NEC Express5800 GX server maximizes both capital and operational efficiencies by cutting backup software license costs, reducing power consumption and maximizing I/O throughput to a scale-out HYDRAstor system that can grow incrementally as needed.

ESG Lab found HYDRAstor capable of sustaining massive throughput with a very large dataset with deduplication and compression enabled, and found impressive performance for backup services, allowing more data to be protected in a smaller amount of time as ANs were added to the grid. The linear scalability observed with the HYDRAstor grid in combination with Express I/O to reduce protocol overhead proved to be an effective solution that can grow with a customer’s data protection and archiving needs.


Global Deduplication and Compression

HYDRAstor globally deduplicates all incoming data during ingest by combining its DataRedux technology with its grid storage system. DataRedux eliminates redundant data across all incoming data streams, significantly reducing storage consumption. It also provides application-aware deduplication, which uses knowledge of application formats to deduplicate user data without interference from the corresponding application metadata, allowing greater optimization of storage across the HYDRAstor system. HYDRAstor then uses compression to further reduce the footprint of data on disk.

The term “global deduplication” implies the capability of deduplicating data across clusters that span multiple appliances. Standalone deduplication appliances keep a metadata cache and an index. They examine a block of data, calculate a hash, check the hash against a table, and store only a pointer to the existing data if the block is a duplicate. Most such solutions deduplicate within a single appliance. When HYDRAstor deduplicates, each AN has access to hash tables for the entire cluster. This allows deduplication to operate across multiple storage pools, file systems and data sets. Data sets are written and stored across all SNs on the back end. Figure 7 illustrates a simplified view of NEC’s global deduplication and compression.
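The hash-and-pointer approach described above can be illustrated with a minimal sketch. It is a generic illustration, not NEC's DataRedux implementation: the fixed block size, the use of SHA-256, the in-memory dictionaries, and all class and method names are assumptions made for the example. A single index shared by every file system stands in for the cluster-wide hash tables.

```python
import hashlib


class GlobalDedupStore:
    """Toy block store with one hash index shared by all file systems."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes (the shared index)
        self.files = {}   # file name -> list of digests (pointers)

    def write(self, name, data):
        pointers = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if this hash has never been seen.
            if digest not in self.blocks:
                self.blocks[digest] = block
            pointers.append(digest)
        self.files[name] = pointers

    def read(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def logical_bytes(self):
        return sum(len(self.blocks[d]) for p in self.files.values() for d in p)

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


# Two "file systems" that share two of their three blocks.
store = GlobalDedupStore(block_size=4)
store.write("fs1/a", b"AAAABBBBCCCC")
store.write("fs2/b", b"AAAABBBBDDDD")
print(store.logical_bytes(), store.stored_bytes())
```

Because both files consult the same index, the blocks they have in common are stored once even though they were written to different file systems.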

Figure 7. Global Inline Deduplication and Compression

ESG Lab Testing

ESG Lab examined global deduplication on the HYDRAstor system to measure the effective data reduction a user might experience in a typical backup scenario. Using NetBackup, we archived six 85.9GB NEC data sets to six different file systems. The six data sets were generated by NEC, and were 33% compressible. After the backups had completed, ESG Lab examined the results of both compression and deduplication using the NEC GUI. Figure 8 shows a deduplication ratio of 6:1 and a compression ratio of 1.5:1. Combined, the total reduction of data on the SNs was 9:1, storing 515GB of total data in 57.4GB.
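The reported ratios can be cross-checked against each other. The sketch below assumes that the deduplication and compression ratios multiply and that "33% compressible" means compressed data occupies 67% of its original size; both are interpretations rather than statements from the test.

```python
# Values reported in the test.
dataset_gb = 85.9
dataset_count = 6
dedup_ratio = 6.0
compression_ratio = 1.5
stored_gb = 57.4

# Total data sent to the system.
logical_gb = dataset_gb * dataset_count

# Assumption: the two reduction stages multiply.
combined_ratio = dedup_ratio * compression_ratio

# Ratio implied by the measured capacity figures.
measured_ratio = logical_gb / stored_gb

# Assumption: 33% compressible means 67% of the original size remains.
ratio_from_compressibility = 1.0 / (1.0 - 0.33)

print(f"Logical data: {logical_gb:.1f} GB")
print(f"Combined ratio: {combined_ratio:.1f}:1, measured {measured_ratio:.2f}:1")
print(f"Compression ratio implied by 33%: {ratio_from_compressibility:.2f}:1")
```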


Figure 8. Data Reduction After One Backup

Finally, ESG Lab projected these results out over 30 days, with the assumptions that users would be performing daily full backups, and the daily rate of unique data generated would be 5%. Figure 9 shows ESG Lab projections, and shows that users could expect to achieve 95% data reduction, or a 20:1 deduplication ratio in as little as 30 days.
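Reduction ratios and percent reductions are two views of the same quantity, and the conversion can be sketched as below. The helper names are illustrative; the 20:1 and 9:1 inputs are the ratios quoted in this report.

```python
def ratio_to_percent_reduction(ratio):
    """A ratio of N:1 means 1/N of the data remains on disk."""
    return (1.0 - 1.0 / ratio) * 100.0


def percent_reduction_to_ratio(percent):
    """Inverse conversion: percent of data eliminated to an N:1 ratio."""
    return 1.0 / (1.0 - percent / 100.0)


projected_pct = ratio_to_percent_reduction(20.0)  # projected 20:1 ratio
observed_pct = ratio_to_percent_reduction(9.0)    # observed 9:1 ratio
ratio_for_95 = percent_reduction_to_ratio(95.0)

print(f"20:1 -> {projected_pct:.1f}% reduction")
print(f" 9:1 -> {observed_pct:.1f}% reduction")
print(f"95% reduction -> {ratio_for_95:.1f}:1")
```

The relationship is non-linear: moving from 9:1 to 20:1 more than doubles the ratio but raises the percentage saved by only about six points.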

Figure 9. Data Reduction Over Time

[Line chart. Vertical axis: Capacity (GB), 0 to 16,000. Horizontal axis: Backup Iteration, 0 to 30. Series: Capacity Backed Up; Capacity Consumed. Annotation: 20:1.]

ESG Lab would expect the deduplication ratios to increase over time as more data is written to the cluster.


Why This Matters

As data grows, the cost to the business of storing and protecting that data grows proportionally. These costs can spiral out of control, since data has grown at a rapid pace for most enterprise IT customers. Deduplication and compression represent real savings to the business as storage capacity needs are reduced. NEC provides inline global deduplication across all nodes, ensuring greater data protection than traditional RAID with less overhead. With a reduction of nearly 90% in data storage needs (the 9:1 ratio observed here), companies can quickly see cost savings, as fewer storage resources are needed to protect ever-increasing volumes of data.

ESG Lab confirmed a reduction ratio of 9:1 after one set of backups with NEC’s global inline deduplication and compression operating across all file systems and ANs in the cluster, significantly reducing storage requirements for backup and data archive. As more data is written to the cluster over time, ESG Lab projected increased data reduction ratios, to 20:1 and beyond.

High Availability and Non-disruptive Upgradeability

HYDRAstor provides multiple levels of redundancy so that data remains accessible when hardware fails. Any failure along the data path, from network interfaces to hard drives, has a more serious impact as the volume of data stored in the grid increases, and it also threatens the sensitive and shrinking window for data protection. Devices need to recover quickly and with no significant impact, often while users are accessing data and backups or restores are in process.

ESG Lab Testing

ESG Lab performed several tests to examine the resiliency and reliability of the HYDRAstor system. Figure 10 shows the various components that were “failed” in consecutive tests.

Figure 10. Failover Tests for Accelerator and Storage Nodes


ESG Lab began testing by simulating drive failures. First, a backup job was started to drive IO to HYDRAstor. Three random drives were removed from three different SNs to simulate drive failures. ESG Lab pulled one drive each from nodes SN0103, SN0106, and SN0107. The event log immediately showed alerts of drive failures for all three nodes and the cache mode was changed from write-back to write-through to protect data during the recovery. ESG Lab observed that recovery of data across the remaining drives completed in less than five minutes. Rebalancing of the data after the drives were restored in the SNs took less than three minutes. The total volume of written data in this test was approximately 64GB.

ESG Lab next tested the impact of an SN failure by pulling both cluster network interfaces on node SN0107. Figure 11 shows the status of “Unreachable” on the SN list, in addition to two downed links to the SN from the cluster switch’s perspective. ESG Lab observed from the event log that the HYDRAstor system again changed the cache mode from write-back to write-through to protect data during recovery. Recovery on the remaining SNs completed in less than five minutes. When ESG Lab reconnected the interfaces to bring SN0107 back online, rebalancing of the drives in the HYDRAstor system completed within three minutes.

Figure 11. Interface Links Down on Storage Node

Figure 12 shows the tests for both the SN interface and drive failures. The recovery activity was recorded as the remaining SNs and drives took over for the failed hardware. The balancing activity occurred as the hardware failures were corrected on the SNs.


Figure 12. Rebalance and Recovery after Storage Node and Drive Tests

Next, ESG Lab simulated an AN failure by pulling a link interface cable on the AN named AN0106. First a backup job was started, writing one 80GB dataset directly to AN0106 to examine the impact of a link failure to data throughput. It was noted that AN0106 was paired with AN0104 in the cluster. ESG Lab pulled the cables from the interface and examined the event log. The failed node was recognized immediately and failover began to the second node. Data from the media server paused for approximately five minutes and resumed again without a restart required. The backup job completed without error.

Figure 13 shows the state of the ANs after the failover, with AN0104 serving the file systems formerly on AN0106 in the cluster. ESG Lab re-attached the cables on AN0106 to test failback. Examining the event log, ESG Lab again observed that the cluster began and successfully completed a failback of all file systems to AN0106 in less than five minutes.

Figure 13. Accelerator Node Failover

Finally, ESG Lab examined and audited a running NEC HYDRAstor with multiple generations of SNs and ANs. Figure 14 shows the configuration tested.


Figure 14. Multi-generation HYDRAstor Grid

The file systems in this HYDRAstor grid were striped across all generations of SNs and were presented through all three ANs.

Why This Matters

As storage environments grow in size and complexity, so too does the impact of data outages. More than half of IT managers surveyed by ESG indicated data availability as a major driver in choosing to deploy scale-out networked storage.2 Regardless of the number and types of hardware failures that may occur during the life of data on disk, managers, employees, and customers expect their data to be always available.

The HYDRAstor architecture eliminates single points of failure and provides a cluster environment that can integrate multiple generations of hardware non-disruptively and without requiring data migration. DynamicStor distributes and protects data across all SNs in a cluster and provides continuous access to data through disk, network, and storage or AN failures.

ESG Lab has confirmed that NEC HYDRAstor can provide an always-on storage environment able to operate through planned maintenance and unplanned faults thanks to a robust, integrated highly available architecture combined with rock solid cluster services. In addition, the ability to upgrade to new generations of hardware non-disruptively provides investment protection through a storage infrastructure that can evolve with technology and keep data online.

2 Source: ESG Market Report, Scale-out 2.0: Simple, Scalable, Services-Oriented Storage, June 2010.


ESG Lab Validation Highlights

ESG Lab found HYDRAstor capable of sustaining massive throughput with a very large dataset, with both deduplication and compression enabled.

HYDRAstor performance scaled impressively, allowing more data to be protected in a smaller amount of time as ANs were added to the grid. The near-linear scalability observed with the HYDRAstor grid in combination with Express I/O to reduce protocol overhead proved to be an effective solution that can grow with a customer’s data protection and archiving needs.

ESG Lab confirmed a total reduction ratio of 9:1 with NEC’s global inline deduplication and compression.

ESG Lab projected a potential 20:1 total reduction ratio with daily backups after just 30 days.

ESG Lab has confirmed that NEC HYDRAstor can provide an always-on storage environment able to operate through planned maintenance and unplanned faults thanks to a robust, integrated highly available architecture combined with rock solid cluster services.

HYDRAstor demonstrated the ability to run an active cluster with multiple generations of hardware proving both investment protection and a storage infrastructure that can evolve with technology and keep data online.

Issues to Consider

The amount of disk capacity that can be saved using NEC HYDRAstor global deduplication technology depends on a number of factors, including the backup policies in effect and the number of backup generations retained on disk. For example, a series of daily full backups contains more duplicate data than the same number of weekly full and daily incremental backups. While deduplication rates of 50:1 or more are possible when retaining months of daily full backups, ESG research data indicates that ratios of 20:1 are more common.


The Bigger Truth

The growing volume of unstructured data is expensive to store, a challenge to protect, and hard to manage. Because of this, it drives continuing customer challenges and storage industry innovation. IT is constantly searching for a solution to the staffing, energy, data access, retention, and protection challenges that come with an ever expanding pool of data.

NEC is a leading manufacturer of diverse and enterprise class computing, storage, and network infrastructure components. This experience has led the company to develop HYDRAstor, NEC’s distributed grid storage platform designed to minimize cost and the risk of obsolescence by engineering archive storage infrastructure to support data for the very long-term. NEC calls HYDRAstor “one hundred year storage”, built to address global backup and archive requirements while avoiding the cost, complexity, risks and operational limitations of expensive primary storage, limited scale-up NAS, tape, or specialized single purpose backup and archive appliances.

ESG has been evaluating NEC IT solutions for quite a while, and we have been impressed with the company’s ability to develop solutions to address each IT challenge and evolve as conditions change. They offer solutions for every infrastructure component - servers, storage, network, software, and communications - and their offerings continue to evolve.

In this most recent Lab Validation, ESG Lab confirmed that NEC HYDRAstor running on industry-standard servers and storage from NEC delivers outstanding levels of performance and scalability. An excellent backup throughput rate of 18.5 TB/hr can be achieved by a relatively small HYDRAstor HS8 cluster. HYDRAstor was able to scale performance, capacity, or both in a near-linear fashion as ANs or SNs were added to the cluster.

ESG research indicates that scalability, performance, cost, ease of provisioning and management are the top-rated considerations in users’ minds, when thinking about scale-out storage. NEC’s HYDRAstor delivers those values by combining massively scalable, robust grid architecture with global inline deduplication and compression to minimize the footprint of data on disk.

Selecting IT solutions for archives and data protection is becoming more of a business decision than it has been in the past. Instead of choosing components with specific features, decision makers are selecting infrastructure designs that offer particular types of support. Senior management buying decisions are focusing on what the business needs: non-disruptive performance and capacity scalability, long-term investment protection, and continuous operations. NEC HYDRAstor provides an integrated scale-out infrastructure to fulfill those business needs.


Appendix

Table 1. ESG Lab Test Bed

NEC Infrastructure                             Configuration
NEC Express5800/A1080a-D Server Partition 1    2 Intel E7-8700 Xeon Processors, 20 cores; 128GB RAM; 3x 10GbE NICs
NEC Express5800/A1080a-D Server Partition 2    2 Intel E7-8700 Xeon Processors, 20 cores; 128GB RAM; 3x 10GbE NICs
NEC HYDRAstor HS8-3000                         Six Accelerator Nodes, eight Storage Nodes; 96 TB

Backup Software                                Version
Symantec NetBackup                             7.1


20 Asylum Street | Milford, MA 01757 | Tel: 508.482.0188 Fax: 508.482.0218 | www.enterprisestrategygroup.com