50 SHADES OF GREY IN SOFTWARE-DEFINED STORAGE: CHOOSING THE STORAGE INFRASTRUCTURE THAT BEST FITS YOUR BUSINESS NEEDS

BY JON TOIGO, CHAIRMAN, THE DATA MANAGEMENT INSTITUTE

SUMMARY

Software-Defined Storage (SDS) has become a meme in industry and trade press discussions of storage technology lately, though the term itself lacks rigorous technical definition. Essentially, SDS is touted as a model for building storage that will work better with virtualized workloads running under server hypervisor technology than do “legacy” NAS and SAN infrastructure. Regardless of the veracity of these claims, the business-savvy IT planner should base his or her choice of storage infrastructure not on trendy memes, but on traditional selection criteria: cost, availability, and simplicity.

© COPYRIGHT 2015 BY THE DATA MANAGEMENT INSTITUTE, LLC. ALL RIGHTS RESERVED.

INTRODUCTION

IT planners have been hearing a lot about software-defined storage (SDS) lately, and about how it is supposed to replace all of the “legacy storage” that companies have deployed for the past decade or so, including Storage Area Networks (SANs) and Network Attached Storage (NAS) appliances. Unfortunately, many of the arguments advanced by evangelists for adopting SDS are only partially grounded in fact or have nothing whatsoever to do with improvements in storage performance, allocation or utilization efficiency.

The purpose of this paper is to review the arguments for SDS quickly, to provide the proper context for analysis of the technologies, and to help guide the business-savvy IT manager or planner to basic common-sense criteria for evaluating the various products available to implement the SDS model if it is deemed suitable. In the final analysis, the kind of storage infrastructure that is appropriate for a data center, branch office or small business equipment room should be determined first and foremost by what the application workload requires.

WHY SDS?

Software-defined storage is touted by evangelists as the fix for what ails traditional or legacy storage, as the next step in storage evolution. It is unclear what that means beyond the obvious marketing appeal of flowery rhetoric. If there is any sort of evolution in storage, it has been the movement away from isolated “islands” of storage that created obstacles to accessing and sharing stored data, and toward storage infrastructure models that enabled greater data sharing.

In the 1960s and 70s, the predominant forms of storage were internal storage (storage devices mounted inside the server frame itself) and direct-attached storage (a crate of disk drives connected to the outside of the server cabinet via an external extension of the server mainboard bus). From those implementation models, storage technology expanded in two directions: one supporting the storage and sharing of file-based data across a network; the other supporting the storage and sharing of block data from database and transaction-oriented systems.

Block data storage “evolved” from island architectures (internal and directly attached storage arrays that were accessed only through the server itself, making stored data difficult to access, share or scale efficiently) to shareable arrays, that is, arrays with multiple ports for connecting many server hosts. Scaling these systems, however, became problematic, so engineers figured out ways to attach storage arrays into a switched fabric (a sort of rudimentary network) using a serial SCSI protocol such as Fibre Channel or SCSI over IP (iSCSI). The ultimate goal was to enable the storage infrastructure to be shared by all applications by creating a true “storage area network” or SAN.

However, the holy grail of an open and heterogeneous SAN never appeared in the market. Vendors preferred to keep their storage equipment sales profitable by adding functionality to the proprietary controllers installed in their external arrays, and never fully participated in any sort of scheme to enable different equipment from different vendors to be managed in concert. Absent common management, the fabric SAN was incredibly pricey to own because of all the proprietary value-add software on the proprietary controllers, and pricey to operate because of the lack of a universal management scheme and the need to embed expensive storage experts in IT staff.

This situation already existed when server virtualization took hold in the early 2000s. That made legacy storage an easy target for server hypervisor vendors to blame for application performance issues that arose following the consolidation of application workloads on fewer server platforms.

Truth be told, whatever inefficiencies already existed in SANs and NAS, server virtualization workloads made them worse. Consolidating applications onto fewer servers dramatically changed the amount of I/O emanating from a single server, requiring on average the addition of 7 to 16 I/O connections per server. Hypervisor computing also changed the traffic patterns on networks, fabric interconnects and switching systems.

Hypervisor vendors assured customers that many good things would accrue from this model. For one, virtual machines could be replicated from host to host and joined into highly available clusters. Virtual machines could also move from server to server to distribute load more efficiently and to provide high availability.

However, consolidating a lot of virtual machines into a single server host, then migrating workload at will from one physical server to another, created challenges in terms of maintaining connections between applications and their stored data. When an app moved locations, an administrator often had to intervene to provide the application with a new route to access its storage shares from a new server perch.

Moreover, we were introduced to a new problem that resulted from combining the I/O from multiple workloads in the same server, something that the industry has started to refer to as the “I/O blender effect.” In operation, every hosted application will simply write its data out at random into a shared I/O path. Without special technology to intervene, this randomized I/O will eventually clog memory and disk storage devices and slow applications to a crawl.
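The blender effect can be illustrated with a toy model. The sketch below is a simplification, not a measurement of any real hypervisor: it assumes four hypothetical virtual machines that each write perfectly sequential block addresses in their own region, and it assumes the shared I/O path interleaves their requests round-robin. All function names and figures are invented for illustration.

```python
from itertools import chain, zip_longest


def sequential_stream(start_block, length):
    """Block addresses one workload would write, in order."""
    return list(range(start_block, start_block + length))


def blend(streams):
    """Interleave per-VM streams round-robin, as a shared I/O path might."""
    merged = chain.from_iterable(zip_longest(*streams))
    return [block for block in merged if block is not None]


def sequential_fraction(blocks):
    """Fraction of requests that hit the block right after the previous one."""
    if len(blocks) < 2:
        return 1.0
    hits = sum(1 for prev, cur in zip(blocks, blocks[1:]) if cur == prev + 1)
    return hits / (len(blocks) - 1)


# Four hypothetical VMs, each writing 1,000 consecutive blocks in its own region.
vm_streams = [sequential_stream(vm * 1_000_000, 1_000) for vm in range(4)]

alone = sequential_fraction(vm_streams[0])
blended = sequential_fraction(blend(vm_streams))
print(f"one VM alone: {alone:.0%} sequential")
print(f"four VMs blended: {blended:.0%} sequential")
```

Each stream is fully sequential on its own, yet the merged stream that reaches the storage device contains no sequential requests at all, which is the pattern the “blender” label describes.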

With application performance issues becoming the number one complaint of hypervisor computing customers, hypervisor vendors pushed several explanations that blamed “legacy storage” inefficiencies. They began arguing for “ripping and replacing” older storage infrastructure with something new, which they came to call software-defined storage.

HYPERVISOR VENDOR ARGUMENTS OFTEN FLAWED, BUT SDS IS VIABLE NONETHELESS

A survey of the literature shows that hypervisor vendors have advanced five reasons to de-provision legacy storage. Some are valid arguments while others are based on half-truths or downright falsehoods.

1. “Slow application performance due to ‘legacy SAN (and NAS) infrastructure’” – This is often a falsehood. You can determine fairly simply whether the storage I/O path is causing application delays by checking storage I/O queue depths. In many cases, queue depths are not significant (meaning that there is no data waiting in line to be written to disk). When this queue depth reading is combined with a processor activity measurement showing high rates of processor cycling, a more logical conclusion is that a logjam exists in “raw I/O” – the communications path between CPU and memory where application processing occurs. If that is the case, the application slowdown is a function of either application code or hypervisor code, not storage at all.

2. “High cost of storage (OPEX & CAPEX) due to proprietary storage” – Legacy storage was and is without a doubt expensive. Hardware vendors argue that the value-add software functionality on their array controllers differentiates their products and represents R&D that needs to be rewarded. Unfortunately, value-add often brings configuration complexity and a need for expensive administrative skills, and sometimes interferes with unified management of the kit as part of infrastructure.

3. “Poor utilization efficiency and management difficulty due to lack of common management” – Per #2, proprietary value-add storage is difficult to manage in common, and this has long been a reason to argue for infrastructure that might deliver better management.

4. “Lack of agility due to legacy infrastructure” – Allocating and de-allocating storage resources and services is much more difficult with proprietary and complicated value-add storage. However, this has nothing whatsoever to do with application performance.

5. “DAS storage outperforms SAN” – This is not true and makes no sense, since both NAS and SAN are also direct-attached storage configurations. NAS is storage directly attached to a thin server that provides network-based access to data. A SAN is direct-attached storage with a physical or fabric-layer switch that makes and breaks server/storage connections at high speed, giving the illusion of network attachment.
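The check described under item 1 can be written down as a simple decision rule. The sketch below is illustrative only: the function name and both thresholds are assumptions, not figures from this paper, and the inputs are whatever average queue depth and processor busy percentage your monitoring tools report.

```python
def likely_bottleneck(avg_queue_depth, cpu_busy_pct,
                      queue_threshold=2.0, cpu_threshold=85.0):
    """Apply the rule of thumb from item 1 to two measurements.

    Both thresholds are illustrative assumptions, not values from the paper;
    tune them for the platform being measured.
    """
    storage_backed_up = avg_queue_depth >= queue_threshold
    cpu_saturated = cpu_busy_pct >= cpu_threshold

    if storage_backed_up:
        # Requests are waiting in line for the device: storage is a suspect.
        return "storage I/O path"
    if cpu_saturated:
        # Empty queues plus a busy processor point away from storage.
        return "application or hypervisor code (raw I/O)"
    return "inconclusive"


print(likely_bottleneck(avg_queue_depth=0.3, cpu_busy_pct=95.0))
print(likely_bottleneck(avg_queue_depth=12.0, cpu_busy_pct=40.0))
```

A reading with shallow queues and a saturated processor is the case the text singles out: the slowdown is more plausibly in application or hypervisor code than in the SAN or NAS.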

Valid or not, the arguments of hypervisor vendors regarding the need to abandon legacy storage have resonated. In its place, vendors are recommending something called software-defined storage, which is, once again, storage that is physically, directly attached to virtual server hosts. The only difference between this and direct-attached storage in the past is that value-add storage services (software) are not located on the controller of the external array, but are instead instantiated in a software layer in the server, usually as part of the hypervisor vendor’s software stack.

Vendors can’t agree, of course, on a single definition for software-defined storage. Instead, it is presented simply as a panacea, a cure-all for everything that is wrong with “legacy storage” for use in a software-defined data center. The key difference is where the intelligence is located: is it on the array controller or is it somewhere in the server hypervisor software stack?

Actually, there are two sets of discriminators to consider when looking at the many SDS products from hypervisor vendors and independent software developers that have appeared in the market of late.

Some SDS solutions are hypervisor-dedicated or hypervisor-dependent. Certainly, this is the case with VMware’s Virtual SAN, which only works with the company’s proprietary hypervisor. To a lesser extent, Microsoft’s Clustered Storage Spaces is proprietary to Microsoft, though they do claim to be able to share their storage with VMware (you just convert your VMware workload into Microsoft VHD format and import it into Hyper-V so you can share the Microsoft SDS infrastructure!).

At the other end of the spectrum are the many third party SDS offerings, including StorMagic’s SvSAN technology, that are hypervisor-agnostic. Vendors of these solutions are earnestly trying to make their software-defined storage infrastructure work with multiple hypervisors and, in some cases, with non-virtualized workload as well. Some implement a common storage environment into which the data from different hypervisor workloads can be written, while others enable common management of storage infrastructures created with their software but dedicated to specific hypervisor workloads and their data. Either way, they are more robust than hypervisor-dedicated solutions and offer a better fit for companies that are deploying more than one hypervisor or that may do so in the future.

The other differentiator to consider is hardware dependency versus hardware agnosticism. Despite the ideology of software-defined (“use any hardware you want”), the truth is that some software-defined solutions have fairly rigid requirements when it comes to hardware components and topology. Some of the latest “hyper-converged infrastructure appliances,” for example, are just another kind of proprietary storage array. Logically, these products would be placed on the hardware-dependent side of the SDS solution spectrum. Hypervisor-dependent SDS products like VMware’s EVO:RAIL are also hardware dependent, and even VMware’s Virtual SAN carries with it some fairly exacting and expensive hardware and topology requirements, including a minimum of three storage nodes.

In the middle of this spectrum are a number of third party virtual SAN SDS solutions, which, unlike their hypervisor software cousins, do not require special configurations or limit what kind of disk or flash devices you can use. StorMagic, for example, provides a solution that can embrace a wide range of hardware components.

On the hardware-agnostic end of the spectrum are storage virtualization software products. To SDS purists, storage virtualization is not software-defined storage because SDS seeks to decouple storage capacity from storage value-add services functionality and performance; capacity virtualization is not part of their SDS model.

In general, if you are considering an SDS solution, your storage infrastructure requirements are usually best served by pursuing hardware- and hypervisor-agnostic SDS solutions. That decision is more likely to enable you to customize your storage gear selections to your needs and budget and to consolidate the management of different storage infrastructures supporting different workloads.

If you are operating a large shop, such as a managed hosting environment or a large enterprise data center, moving to a Virtual SAN may actually be a step backward on the evolutionary chart. A real SAN, assuming that one is ever delivered to market, will deliver all of the capabilities and cost metrics that you need for such environments.

On the other hand, if yours is a small to medium sized business or if you are tasked to provide storage infrastructure for remote or branch office operations, big centralized storage may not be your best or most strategic infrastructure choice. You may be attracted to VMware’s Virtual SAN or Microsoft’s all-in-one Clustered Storage Spaces SDS solution as a template for building your storage. That is, until you find out what it will cost.

The fact is that VMware’s Virtual SAN only works with VMware vSphere workloads and requires you to limit your hardware to a specific list of supported equipment. The same is true of Microsoft’s Clustered Storage Spaces model, which is heavily dependent on SAS storage. Such equipment requirements may not be a deal breaker for firms that want a simple single vendor solution, but they do limit options. Planners may not be able to take advantage of newer storage technologies or leverage existing investments in storage equipment that they have today.
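The two discriminators discussed in this section, hypervisor dependency and hardware dependency, can be treated as a simple two-axis classification for shortlisting products. The sketch below is one reading of how this paper characterizes the products it names; the three-level scale, the class names and the `shortlist` helper are illustrative assumptions, and any real evaluation should check current vendor documentation.

```python
from dataclasses import dataclass
from enum import Enum


class Dependency(Enum):
    DEPENDENT = "dependent"
    MIDDLE = "middle"
    AGNOSTIC = "agnostic"


@dataclass(frozen=True)
class SdsOption:
    name: str
    hypervisor: Dependency
    hardware: Dependency


# Placements follow this paper's descriptions, not independent testing.
OPTIONS = [
    SdsOption("VMware Virtual SAN", Dependency.DEPENDENT, Dependency.DEPENDENT),
    SdsOption("VMware EVO:RAIL", Dependency.DEPENDENT, Dependency.DEPENDENT),
    SdsOption("Microsoft Clustered Storage Spaces",
              Dependency.DEPENDENT, Dependency.DEPENDENT),
    SdsOption("StorMagic SvSAN", Dependency.AGNOSTIC, Dependency.MIDDLE),
]


def shortlist(options, multi_hypervisor, reuse_existing_hardware):
    """Drop options that conflict with the planner's stated constraints."""
    keep = []
    for opt in options:
        if multi_hypervisor and opt.hypervisor is Dependency.DEPENDENT:
            continue
        if reuse_existing_hardware and opt.hardware is Dependency.DEPENDENT:
            continue
        keep.append(opt)
    return keep


for opt in shortlist(OPTIONS, multi_hypervisor=True, reuse_existing_hardware=True):
    print(opt.name)
```

With neither constraint set, every option stays on the list, which matches the point above that equipment requirements are not necessarily a deal breaker for firms that want a single vendor solution.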

More to the point, the cost of implementing a hypervisor-dependent storage stack can be prohibitive. A recent lab evaluation published in a popular computing publication placed the cost of a basic Virtual SAN implementation, which requires a minimum of three nodes, at $11K to $14K in software licenses per node and about that much again for the hardware that is approved by VMware for use in their infrastructure. The prices for Microsoft are significantly lower, but there is still a minimum three-node requirement and a preference for PCIe flash and SAS disk drives rather than SATA.
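The arithmetic behind those figures is worth making explicit. The sketch below uses only the numbers quoted above; the one assumption is that “about that much again” is read as hardware costing exactly as much as the licenses, so the result is a rough range rather than a quote.

```python
NODES = 3                             # minimum cluster size cited in the text
LICENSE_PER_NODE = (11_000, 14_000)   # low and high license figures, USD
HARDWARE_MULTIPLIER = 1.0             # assumption: hardware equals license cost


def cluster_cost(license_per_node, nodes=NODES,
                 hardware_multiplier=HARDWARE_MULTIPLIER):
    """Total entry cost: licenses for every node plus matching hardware."""
    licenses = license_per_node * nodes
    hardware = licenses * hardware_multiplier
    return licenses + hardware


low, high = (cluster_cost(price) for price in LICENSE_PER_NODE)
print(f"${low:,.0f} to ${high:,.0f}")   # prints "$66,000 to $84,000"
```

On that reading, the minimum three-node configuration lands somewhere between roughly $66,000 and $84,000 before any data is stored.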

One significant benefit of using a third party software-defined storage solution is that it can insulate the planner from hardware lock-ins and usually does not require a three-node minimum hardware cluster. SDS can be implemented initially as a much less expensive two-node cluster. For the smaller shop, or the remote office/branch office (ROBO) environment, the independent software vendor’s SDS solution might be just the thing to test the value of software-defined storage to application performance.

CONCLUSION

The fit for software-defined storage, whether from your hypervisor vendor or from an independent developer like StorMagic, should be a function of its fit with deployed (and likely to be deployed) applications and their requirements, with strategic goals for infrastructure and data management, and with budget, staff skills and other practical boundary conditions. It is a good idea to determine what your needs are before you seek out an SDS solution.

It is worth noting that, in the absence of uniform standards or guarantees of interoperability between different software stacks, IT planners may be looking at a need to deploy and manage different SDS solutions to meet different requirements. A product aimed at large enterprise shops and hosting environments may not be well suited to deployment in a ROBO environment, for example.

It is a good idea to try the SDS solution before you buy it. StorMagic offers a 60 day trial version of its product, StorMagic SvSAN, that you can access from the company website (http://www.stormagic.com/60-day-free-trial/).