Business white paper Hybrid HPC/HPC Cloud Building a flexible, automated IaaS for Hybrid HPC


Table of contents

Introduction

HPC Cloud strategic landscape
  Long-term view
  Inhibitors to HPC public clouds
  Use of “classic” public cloud for HPC
  A fast-changing landscape
  The flexibility of public cloud comes with compromises
  However why wait?
  Private cloud offering with HPE and partner: where to start?

How to implement HPC Cloud solutions
  Scenario 1: Bursting to Tier 1 public cloud provider (with or without HPC-specialized hardware)
  Scenario 2: Bursting to regional HPC Cloud service provider
  Scenario 3: Hosting and bursting with a regional HPC Cloud service provider

The critical role of service providers in delivering HPC Cloud
  Technical attributes of a service provider offering
  How to solve cloud challenges
  Building blocks
  Products and design principles
  Next steps: Open APIs and the Hybrid HPC platform
  Reference and additional resources


Introduction

The significance of cloud computing in all its variations (on-premises, off-premises, and public) is widely understood. Businesses are increasingly applying cloud principles to their compute and storage requirements in order to leverage their benefits and accelerate time to value for IT projects. Those benefits typically include rapid deployment thanks to high levels of automation; a move to a fully transparent operational expenditure (OPEX) model, which reduces the cost burden of experimentation; and rapid scale-up and scale-down of environments, which helps drive successful experiments forward while quickly repurposing the resources of less successful trials.

Cloud computing has typically been targeted by new applications or architectures that are born in the cloud, meaning that they have been designed to take advantage of the architectural specifics of a cloud environment. As the platforms matured, the cloud could cope with more traditional workloads that were not necessarily built for a cloud environment. Now comes the turn of High Performance Computing (HPC), a paradigm within IT that has typically been bound to specific hardware and has struggled to be dynamic and mobile. This white paper outlines the new possibilities and capabilities to support Hybrid HPC Clouds, looking at architectural choices, implementation options, and potential challenges.

Note: Cloud-related definitions can be found in the National Institute of Standards and Technology (NIST) papers. In this paper Cloud (or cloud) refers to all cloud computing variations (on-premises, off-premises, and public), therefore with the same meaning as Hybrid Cloud. HPC Cloud is likewise used with the same meaning as Hybrid HPC.


HPC Cloud strategic landscape

Long-term view

Even though HPC has been around for multiple decades with a proven return on investment, companies have only scratched the surface of the opportunities given by HPC. HPC covers both the industry and the service economy (including research institutes, manufacturing enterprises, semiconductor companies, financial institutions, and more recently Web 2.0, as well as business analytics players) and is continually expanding its coverage.

With the emergence of new use cases such as digital twins or connected cars, which combine HPC with maturing technologies such as Big Data, HPC is now massively impacting, if not transforming, the entire economy. The unprecedented pace of data growth and the emergence of commodity-based Artificial Intelligence (AI) technologies mean that demand for these HPC solutions is growing rapidly too. However, the skills and specializations required to stand up and operate a traditional HPC environment are still rare, and their scarcity can act as a brake on projects that want to take advantage of the technology. The current HPC operating model and underlying technologies need to be improved to introduce cloud-like Infrastructure as a Service (IaaS) interfaces and infrastructure lifecycle management. Finally, companies find it difficult to upgrade their legacy data center facilities to cope with HPC growth; hence the need to seamlessly tap into existing on-premises HPC resources as well as expand these resources on demand, off-premises, and in the public cloud.

It is expected that by 2020, many enterprises will transfer their pre-exascale and exascale HPC compute power to the public cloud or community cloud service providers.

[Figure: a progression ("From" ... "To") of increasing societal and economic impact, with stages labeled "The best product to beat my legacy competition", "New industry operating standards", "Value co-operation (digital platforms)", "New economical paradigms", and "New applications". Examples named in the figure include multiphysics and design implementation, digital twins, Predix (a GE Digital initiative), the circular economy, and connected cars.]

Figure 1. Increasing impact of HPC


Inhibitors to HPC public clouds

The pervasive availability of HPC capabilities in public clouds depends on how quickly the following inhibitors are overcome:

• Hardware heterogeneity

• Operating environment heterogeneity (bare metal/virtualization/containers)

• HPC independent software vendor (ISV) flexible licensing models (The cost of HPC applications is often more than the cost of the hardware, sometimes by a considerable margin; this is why the economic benefit of the public cloud is reduced without cloud-friendly licensing.)

• Lack of low-latency interconnects, HPC-specialized accelerators, and noise-free operating systems to enable tightly coupled HPC applications to scale

• Data mobility in and out of the cloud in a less costly manner preventing data lock-in and decreasing the standup time for new environments

• Mapping HPC operations from a given company to a given public cloud (even with HPC specialization) without compromising on security, supervision, administration, networking topology, and more

• Security, regulatory compliance

Use of “classic” public cloud for HPC

In some cases, using the standard public cloud for HPC or compute-intensive applications can be good enough. Provided that they are not security sensitive, three types of workloads can take advantage of public clouds in their current form or with a few adaptations:

• Massively parallel applications.

• Small-scale parallel applications (usually less than one socket)—These applications would run inside one VM per host and will be limited by the amount of memory or local I/Os allocated to this VM.

• Parallel applications with small data sets.

Altogether, these three classes of applications represent a large portion of the HPC applications—if not the main part of it.

It is also possible to leverage public cloud resources for the nonperformance aspects of an HPC environment such as visualization terminals (not rendering but image serving), scheduling nodes and supporting infrastructure such as time, DNS, authentication and logging, and so on.

A fast-changing landscape

HPC public cloud with HPC specialization is already on offer. HPC capabilities are increasingly available in public clouds, even for non-HPC applications. The HPC public clouds’ offerings can be segmented into Tier 1 cloud service providers and regional, vertically integrated, industry-vertical-focused service providers.

Standard public cloud and Web 2.0 service providers have already started adopting HPC acceleration technologies, such as RDMA (for zero-copy remote data processing), GPU (for Deep Learning and massively parallel processing), and FPGA (for signal transcoding or data compression/decompression).

Hewlett Packard Enterprise is currently working with regional partners leveraging the Partner Ready for Service Providers (PRSP) program.


The flexibility of public cloud comes with compromises

Mapping a company’s HPC operations and service-level agreements (SLAs) to a given public cloud (with or without HPC specialization) comes with compromises in performance, costs, security, supervision, administration, networking topology, and reuse of application licenses.

The UberCloud compendium series can help to further understand the reality of rolling out HPC in public cloud with or without HPC specialization from a technical standpoint.

However why wait?

As with standard IT, companies should not wait to start their journey to an HPC Cloud; many have already started experimenting with the public cloud for HPC. However, the public cloud scenario has its limitations, as stated in the previous section. This is why it is essential that enterprises consider, as a first step on their cloud journey, either transforming their on-premises HPC infrastructure into a private cloud or expanding their operations through an off-premises private cloud. Either option gives them as many public cloud attributes as they need without the drawbacks of the public cloud.

Private cloud offering with HPE and partner: where to start?

Hewlett Packard Enterprise has a number of offerings to assist its customers on their journey to the cloud, including the ones listed as follows:

HPE offering: HPE Datacenter Care for Hyperscale
Related public cloud attribute: Single point of operations
HPE Datacenter Care offering provides proactive support and an assigned team to simplify the operations lifecycle of the on-premises, hosted, or managed HPC cluster, tailored to fit the way one specific company operates IT.

HPE offering: Hosting with one of the HPE Partner Ready Program partners
Related public cloud attribute: No data center to manage
Data center upgrades (power and cooling density, floor space) on schedule for HPC clusters are probably the main pain point for most IT departments. HPE can work with IT departments to define the best strategy, either your own data center upgrade or using a third-party data center provider registered within the HPE Partner Ready Program.

HPE offering: HPE Flexible Capacity for Hyperscale
Related public cloud attribute: Pay as you grow
Consume HPC in the data center, saving on the cost of overprovisioning, with reserved capacity ready for growth or unpredictable spikes in use. HPE Flexible Capacity has also already been adopted by many service providers.

HPE offering: Cloud Management Platform
Related public cloud attribute: Flexibility
On-premises HPC clusters have always been designed in a monolithic way so that nominal performance can be ensured. Nevertheless, customers are looking for ways to match new and fast-growing business needs, including a growing number of HPC users with no HPC skills and new types of simulations that require other operation environments than bare-metal compute. These new and fast-growing businesses require the capability of dialing up and down or repurposing cluster partitions on-premises.
Related public cloud attribute: Bursting to handle peak loads
All companies are expecting public cloud bursting to handle peak loads. In order to meet this goal, the public cloud partitions must be integrated into the existing end-user environment so that the legacy application workflows remain unchanged when bursting. Middleware such as Micro Focus CSA, Cycle Computing CycleCloud, or Rescale ScaleX can provide flows and connectors for bursting from on-premises HPC to a public cloud.

Table 1. Private cloud: where to start?


How to implement HPC Cloud solutions

When it comes to cloud implementations, there are several deployment options ranging from 100% on-premises to 100% public cloud (the Hybrid IT concept). A similar set of options is available when implementing HPC Cloud, each scenario having its benefits and challenges.

HPE and partners are well positioned to create the Right Mix of Hybrid HPC Clouds that is tailored to each company’s innovation needs. The different scenarios are listed as follows along with their salient characteristics.

Scenario 1: Bursting to Tier 1 public cloud provider (with or without HPC-specialized hardware)

[Figure: an on-premises environment (visualization node, login node, scheduler, deploy, monitor) bursting to a Tier 1 public cloud through cloud orchestration and a cloud portal.]

Figure 2. Bursting to public cloud

Bursting to a Tier 1 public cloud provider offers flexibility and the capability to meet new and unexpected demands, but it is only suitable for a few HPC workloads (even when the provider offers HPC-specialized hardware). While the public cloud offers a great amount of resources on demand, some factors can be limiting for those who need repeatability, predictability, and reliability through SLAs. Cost predictability can also be a challenge when using public clouds for bursting because hidden costs around bandwidth, data transfer, and storage often appear afterward.

Data security is often a key topic due to the following reasons: it has to be handled by the end user, it can breach local regulation, and it can be very time-consuming and expensive. Another drawback to bursting to the public cloud is that the customer still has HPC assets on-premises.

Common issues that arise could be the latency when using remote visualization, as well as the time to transfer data to and from the cloud.
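To give a sense of scale for the data transfer issue, the following minimal sketch estimates how long a data set takes to move over a network link. The function name, the 80% link efficiency, and the example figures are illustrative assumptions, not measurements from this paper.

```python
def transfer_time_hours(dataset_gb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to move `dataset_gb` gigabytes over a link.

    `link_gbps` is the nominal link speed in gigabits per second and
    `efficiency` is the assumed fraction of it that is actually usable.
    """
    if dataset_gb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid data set size, link speed, or efficiency")
    bits_to_move = dataset_gb * 8e9                  # gigabytes -> bits
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits_to_move / usable_bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical case: a 10 TB result set over 1 Gb/s and 10 Gb/s links.
    for speed in (1.0, 10.0):
        hours = transfer_time_hours(10_000, speed)
        print(f"10 TB over {speed:g} Gb/s: {hours:.1f} hours")
```

Under these assumptions, 10 TB over a 1 Gb/s link takes about 27.8 hours, which is why presharing data or a direct network connection can matter more than raw compute capacity.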

This scenario is recommended only for certain workloads and where a part of the HPC pipeline lends itself to this sort of environment due to the factors listed previously.


Scenario 2: Bursting to regional HPC Cloud service provider

Alongside public cloud Tier 1 providers, there are regional service providers whose core business relies on providing built-for-purpose bare-metal HPC solutions.

[Figure: an on-premises environment (visualization node, login node, scheduler, deploy, monitor) bursting to an HPC Cloud service provider through cloud orchestration and a cloud portal.]

Figure 3. Bursting to regional HPC Cloud service provider

Bursting into a regional HPC Cloud service provider offers the same advantages as bursting to a Tier 1 public cloud provider but it:

• Allows for a more secure environment

• Enables the presharing of data for speed

• Provides a bare-metal option for HPC hardware offering predictable performance for HPC jobs that can be repeated in a reliable manner

Such providers offer SLAs under which the time to execute a compute job and its price are predictable and controlled by the user, backed by a support team staffed by HPC experts.

Regional HPC Cloud providers offer best-in-class security ensuring job and data isolation, usually at the hardware level. This allows for network segregation and dedicated resources for the lifespan of the jobs, which is critical when working with sensitive data whose handling is controlled by local legislation.

Moreover, HPC tenants are accessed through their own login node that runs on physical hardware, which is the single entry point to the tenant’s environment.

HPE recommends service providers that are ISO 9001 and ISO 27001 certified and have implemented ITIL® processes in their IT and hosting operation workflows. This ensures that the provider has an auditable and proven environment and has practices in place to detect and mitigate breaches or incidents that impact the security of the environment.

Service providers also have the advantage of having diverse and resilient backbone connections either onto the internet or with the possibility of connecting directly onto your corporate or research network of choice. This reduces the latency and cost of data movement of potentially large sets of data.


Scenario 3: Hosting and bursting with a regional HPC Cloud service provider

[Figure: a hosted environment at the HPC Cloud service provider (visualization node, login node, scheduler, deploy, monitor) with bursting capacity, alongside an on-premises environment with the same components, connected through cloud orchestration and a cloud portal.]

Figure 4. Hosting and bursting with a regional HPC Cloud service provider

In most cases, enterprises choose an HPC Cloud hosting provider in order to reduce the overhead of hosting, managing, and supporting the HPC environment. Such a provider can offer a flexible solution suited to each customer’s needs: some only need floor space and power, others want management further up the HPC stack, and some want it all the way up to the HPC application layer and licensing support.

Some regional HPC Cloud service providers can offer the best of both worlds, namely the cost-effectiveness and control of an on-premises HPC cluster and the scalability of an HPC Cloud available for bursting.

In that case, the cost, performance, and availability become predictable.

Another benefit is full consistency of the HPC operating environment, whether hosted or burst: the same software images can be used in both environments.

This is the preferred scenario.


The critical role of service providers in delivering HPC Cloud

Hewlett Packard Enterprise understands the critical role of its ecosystem of service providers to deliver HPC Cloud. The following sections detail the distinct value of service providers and the benefits they bring.

Technical attributes of a service provider offering

A service provider should deliver on-demand HPC resources that can be provisioned through a self-service portal or via a request to an HPC support team. Applications are then delivered on configurations appropriate to the service level: either bare-metal servers with a high-speed interconnect, which enable predictable application performance, or a virtualized setup that improves density and optimizes costs for work that does not require the performance of bare metal.

Most service providers will provide SLAs for delivering 99.999% uptime, fast delivery of resources, and 24x7x365 support from their HPC engineering team that can also assist with configuration and setup of applications. However, these requirements are also flexible and can be managed on a per-project, application, or tenant basis ensuring the appropriate service levels are applied to specific projects. For example, a university with term-time demands might not require 24x7x365 support and could potentially get by on a reduced SLA for a lower cost.
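To make the trade-off between SLA levels concrete, the following minimal sketch converts an availability percentage into the downtime it permits. The function name and the 365-day year are illustrative assumptions; actual SLA terms (measurement windows, planned maintenance exclusions) vary by provider.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_percent: float, period_days: float = 365.0) -> float:
    """Return the downtime, in minutes, permitted by an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * period_days * MINUTES_PER_DAY


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

At 99.999%, the budget is a little over five minutes per year, whereas 99.9% allows nearly nine hours, which is the kind of gap a reduced, lower-cost SLA would trade on.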

Given that HPC clusters are generally heavy on power consumption and cooling requirements, leveraging service providers is a good way of potentially locating the cluster in a geographical area that benefits from lower power costs and naturally supported cooling due to climate. When considering the location of the data center, it is worth analyzing not only whether these costs are seasonal but also what the sovereignty requirements of the data sets are and how they would be affected.

How to solve cloud challenges

Typical challenges with an HPC Cloud include bandwidth requirements, high latency, and the costs of transferring data.

The higher the network latency, the slower data transfers to and from the cloud infrastructure will be. The same is true for the responsiveness of remote desktop connections.

There are several options to address the latency issue. For example, the routing path can be optimized to reduce the number of hops, hence reducing the latency. Transfer tools such as Aspera, Signiant, or GridFTP, which are designed to work well over high-latency links (for example, through UDP-based transport or parallel streams), can also be used.
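The effect of latency on transfers can be illustrated with the standard bound for a single window-based stream, which can send at most one window of data per round trip. The sketch below is a simplified model under that assumption; the function name and the 64 KiB window are illustrative and do not describe any particular product.

```python
def max_single_stream_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on one stream's throughput: one window per round trip."""
    if window_bytes <= 0 or rtt_ms <= 0:
        raise ValueError("window size and round-trip time must be positive")
    bits_per_round_trip = window_bytes * 8
    round_trips_per_second = 1000.0 / rtt_ms
    return bits_per_round_trip * round_trips_per_second / 1e6


if __name__ == "__main__":
    window = 64 * 1024  # assumed 64 KiB window
    for rtt in (1.0, 10.0, 100.0):
        print(f"RTT {rtt:>5.1f} ms -> at most {max_single_stream_mbps(window, rtt):.1f} Mb/s")
```

With a fixed window, a tenfold increase in round-trip time cuts the achievable rate tenfold regardless of link capacity, which is why shorter routing paths, larger windows, parallel streams, or protocols that do not wait on per-window acknowledgments help on long-distance links.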

Another typical challenge is the handling of storage. It is important to look for a provider that separates each tenant’s storage into logically, and sometimes physically, separate domains, depending on the handling requirements of the data set. Then there is the provision of scratch storage, which usually needs to be high performance but also secure. Service providers will often provide dedicated storage clusters based on technology such as BeeGFS or Lustre, where the cluster includes only the tenant’s nodes, is accessible only by the tenant, and is therefore fully isolated. It is also possible to leverage gateways that could be virtualized within the tenant’s environment and so provide access to a shared pool in a secure way. Again, the requirements of the data and the governance around it will usually dictate the approach, and most service providers should be capable of accommodating those challenges.

Network isolation and separation are other key elements to be considered in an HPC Cloud environment. In order to handle them, every cluster has its own VLAN and subnet. This allows the use of Access Control Lists (ACLs) to restrict all access between the different tenant subnets. The provider could also potentially use VXLAN to provide a greater range of subnets and overlapping address space options between tenants. This eliminates the ability for tenants to communicate with other resources on different subnets. It is also possible to leverage software-defined networking techniques to further segregate the network into either VLANs or tunnels. This allows separate component networks and users to be given same-subnet connectivity to service endpoints and reduces the latency of routing. Each service provider will have different options in this space, and the requirements should be discussed in advance in order to determine what is possible.
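The per-cluster VLAN and subnet scheme can be sketched as follows. This is a minimal illustration using Python's standard `ipaddress` module; the address range, VLAN numbering, tenant names, and function names are all hypothetical, and a real provider would enforce the rule in switch or firewall ACLs rather than in application code.

```python
import ipaddress


def allocate_tenant_networks(tenants, supernet="10.0.0.0/16", prefix=24, first_vlan=100):
    """Give every tenant cluster its own VLAN ID and its own subnet."""
    subnets = list(ipaddress.ip_network(supernet).subnets(new_prefix=prefix))
    if len(tenants) > len(subnets):
        raise ValueError("not enough subnets for all tenants")
    if first_vlan < 1 or first_vlan + len(tenants) - 1 > 4094:
        raise ValueError("VLAN IDs must stay within 1-4094")
    return {
        tenant: {"vlan": first_vlan + index, "subnet": subnets[index]}
        for index, tenant in enumerate(tenants)
    }


def is_allowed(plan, src_ip, dst_ip):
    """Deny-by-default rule: allow traffic only inside a single tenant subnet."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    return any(src in entry["subnet"] and dst in entry["subnet"] for entry in plan.values())


if __name__ == "__main__":
    plan = allocate_tenant_networks(["tenant-a", "tenant-b"])
    for tenant, entry in plan.items():
        print(tenant, "VLAN", entry["vlan"], "subnet", entry["subnet"])
    print(is_allowed(plan, "10.0.0.5", "10.0.0.9"))   # same tenant subnet
    print(is_allowed(plan, "10.0.0.5", "10.0.1.9"))   # crosses tenant subnets
```

Because every tenant maps to exactly one subnet, the access rule reduces to a single membership test, which keeps the isolation policy easy to audit.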

If the clusters are using Intel® Omni-Path communication, then Omni-Path partitioning can be used for security and segregation here. Access to this network can then be limited to a single point such as the bastion or login node, where audit logs can be stored and analyzed to detect intrusion or otherwise suspicious activities.

As described in Scenario 2, service providers that implement and are certified to the ISO 27001 standard put security at the very heart of their solutions and provide proactive security monitoring.

Building blocks

Scenarios described in the previous sections are implemented through an Infrastructure-as-a-Service (IaaS) solution built for purpose for HPC.

HPE HPC IaaS allows HPC cluster provisioning, utilization and, if needed, release seamlessly on-premises, off-premises, and in public clouds.

Figure 5 shows the principle of the HPE IaaS, which is to integrate classical HPC components (outlined on the left in dark purple) into a standard cloud IaaS framework (shown in blue). Both HPC and IaaS components sit on top of hardware components: compute, storage, and network.

[Figure: HPC components (visualization, tooling/logs, scheduler, login nodes, deployment, monitoring) and HPC deployment and management scripts alongside a Cloud Management Platform (orchestrator/workflows, billing, uCMDB, HPE Flexible Capacity). Below them sit compute management (bare metal, virtualization, containers, accelerators), storage management (object, scratch, block, file), and network management (Intel Omni-Path, Ethernet, InfiniBand), over a hardware management aggregator and a global pool of hardware resources.]

Figure 5. HPC IaaS building blocks


Products and design principles

Figure 6 shows the software products used by HPE to build its HPC IaaS v1.

Micro Focus Cloud Service Automation (CSA) provides the cloud management engine and the marketplace portal. The content is organized in catalogs describing the available services (infrastructure and applications), and these catalogs are independent of the delivery mode. This allows companies to have a high level of flexibility and decide where the resources should be created and consumed.

The orchestrator/workflow component is based on Micro Focus Operation Orchestration (OO); it interfaces with infrastructure managers like HPE Insight Cluster Management Utility (CMU) and can also integrate other infrastructure managers such as OpenStack®. This flexibility allows for easy integration of new environments in the HPC IaaS. The bottom row of the graph represents the supply layer technology.

Using the Cloud Management Platform, HPC clusters can be deployed anywhere, on-premises, off-premises, or in public cloud.


Other Cloud Management Platforms can be integrated in lieu of Micro Focus CSA/OO with no impact on the design principles.

[Figure: Micro Focus CSA (consumer portal, service catalogs, service designer, lifecycle controller; service offers and service designs), Micro Focus OO (designer, debugger, dashboard and reporting, events, API, administration, engine, project and packaging, flow execution, scheduler), and HPE Insight CMU.]

Figure 6. Commercial and open source products selected to build the HPC IaaS

Micro Focus Cloud Service Automation

Micro Focus CSA offers a flexible cloud management solution for any public, private, or hybrid cloud environment. It completely automates service lifecycle management across heterogeneous environments, delivering an end-to-end solution from infrastructure to applications, providing a unified, consistent experience. It is also possible to use the open architecture to extend the solution to integrate with third-party and complementary HPE products.


As per Figure 7, the block labeled Service offers and service designs is made of three elements: consumer portals, service catalogs, and service designer. The consumer portals are the customers’ endpoints where they can purchase services by browsing the service catalogs. The portal is multitenant: each organization has its own catalog linked to specific offers. The service designer is not accessible to the end customer but only to the creator of the service offer. This is where the developer describes the architecture of the offer, the different components, and the links between them. The components are theoretical objects and therefore need to be bound to physical existing resources.

Micro Focus CSA benefits:

• Micro Focus CSA supports different types of cloud (public, private, and hybrid).

• Micro Focus CSA is open to external cloud providers, such as Microsoft® Azure, Amazon Web Services, and Cloud28+ partners.

Micro Focus Operation Orchestration

The orchestrator is the module responsible for triggering actions, for example, deploying a complete HPC cluster or resetting a server. The customer initiates these actions through the cloud portal.

Micro Focus OO benefits:

• Micro Focus OO offers an out-of-the-box library with over 7000 workflows available, as well as a large and active development community.

• Micro Focus OO is a mature product used by many Fortune 500 companies.

HPE Insight CMU for Hybrid HPC

HPE Insight CMU is a system management solution offering provisioning, monitoring, and management of cluster systems:

• Provisioning consists of populating the cluster with nodes (typically a management node and compute nodes), configuring the network parameters of the cluster, and deploying OS images on the nodes.

• Monitoring consists of keeping an eye on the cluster’s activity through visualization tools that track different metrics, such as CPU load, Ethernet throughput, memory used, and more.

• Management offers a set of tools to ease the users’ experience when controlling their cluster: broadcasting a command to a set of chosen nodes, comparing the output of a command across a given set of nodes, and basic power and boot actions on the nodes.

[Figure: Micro Focus CSA, with its service offers and service designs block made up of the consumer portal, service catalogs, and service designer.]

Figure 7. Micro Focus CSA


There are three types of groups within HPE Insight CMU: Network Groups, Image Groups, and Custom Groups.

• Network Groups correspond to network switches that connect to a group of nodes.

• Image Groups represent each disk image that has been captured; each is made up of the nodes for that image and those already cloned with that specific image.

• Custom Groups are customizable for the customer’s own use.

HPE Insight CMU cloning engine is used to deploy applications for the HPC Cloud stack. The cloning operation copies the complete contents of the golden image (the node from which the image is taken as a reference) to other nodes. The copied image deployed on the other nodes is the same except for the following cases:

• HPE Insight CMU updates the hostname of the node.

• HPE Insight CMU updates the IP address of the network used for cloning.

• HPE Insight CMU updates the compute node default gateway.
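The cloning behavior described above can be pictured with the following minimal sketch, which copies a reference configuration and changes only the three per-node settings. It is an illustration of the idea only: the dictionary layout, key names, and function names are hypothetical and are not the HPE Insight CMU interface.

```python
from copy import deepcopy

# Settings that differ per node after cloning (hypothetical key names).
PER_NODE_KEYS = ("hostname", "clone_network_ip", "default_gateway")


def clone_image(golden: dict, hostname: str, clone_network_ip: str, default_gateway: str) -> dict:
    """Copy the golden image content and update only the per-node settings."""
    image = deepcopy(golden)
    image.update(
        hostname=hostname,
        clone_network_ip=clone_network_ip,
        default_gateway=default_gateway,
    )
    return image


def differing_keys(first: dict, second: dict) -> list:
    """List the keys whose values differ between two images."""
    return sorted(key for key in first.keys() | second.keys() if first.get(key) != second.get(key))


if __name__ == "__main__":
    golden = {
        "hostname": "golden-node",
        "clone_network_ip": "10.0.0.10",
        "default_gateway": "10.0.0.1",
        "packages": ["openmpi", "scheduler-client"],
    }
    node = clone_image(golden, "node02", "10.0.0.12", "10.0.0.254")
    print(differing_keys(golden, node))
```

Everything outside `PER_NODE_KEYS`, such as the installed packages in this example, is carried over unchanged, which is what makes hosted and burst nodes consistent with one another.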

Before performing a cloning operation, a certain number of prerequisites must be satisfied:

• A valid Image Group must be created for cloning.

• A backup must be made of the Image Group of the node; this image will be used to clone other nodes.

• The nodes to be cloned must belong to the Image Group created previously.

• The nodes to be cloned must belong to a Network Group.

• The Image Group created must have an image that is compatible with the hardware of the nodes to be cloned.

• Nodes must be ready to be powered on by the management card.
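These prerequisites lend themselves to an automated pre-flight check. The sketch below models them with plain data classes; all class, field, and function names are hypothetical stand-ins chosen for illustration and do not reflect the HPE Insight CMU data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImageGroup:
    name: str
    has_backup: bool                 # a backup image exists for this group
    compatible_hardware: frozenset   # hardware types the image supports


@dataclass
class Node:
    name: str
    image_group: Optional[str]
    network_group: Optional[str]
    hardware: str
    power_manageable: bool           # can be powered on by the management card


def cloning_blockers(group: ImageGroup, nodes: List[Node]) -> List[str]:
    """Return one message per unmet prerequisite; an empty list means ready."""
    blockers = []
    if not group.has_backup:
        blockers.append(f"image group {group.name} has no backup image")
    for node in nodes:
        if node.image_group != group.name:
            blockers.append(f"{node.name} is not in image group {group.name}")
        if node.network_group is None:
            blockers.append(f"{node.name} is not in a network group")
        if node.hardware not in group.compatible_hardware:
            blockers.append(f"{node.name} hardware is not compatible with the image")
        if not node.power_manageable:
            blockers.append(f"{node.name} cannot be powered on by its management card")
    return blockers


if __name__ == "__main__":
    group = ImageGroup("compute-image", True, frozenset({"type-a"}))
    nodes = [
        Node("node01", "compute-image", "switch-1", "type-a", True),
        Node("node02", None, "switch-1", "type-b", False),
    ]
    for message in cloning_blockers(group, nodes):
        print(message)
```

Returning the full list of blockers, rather than stopping at the first, lets an operator fix every issue before retrying the cloning operation.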

HPE Insight CMU benefits:

• HPE Insight CMU helps customers to speed time to production, ease cluster operations, and address issues faster.

• HPE Insight CMU has a 15-year track record of deployment—from small clusters to Top 500 sites with thousands of nodes.

• HPE Insight CMU is an HPC product developed and supported by HPE.

Deployment scripts

The process is as follows:

• Micro Focus CSA interfaces with Micro Focus OO.

• Micro Focus OO calls various deployment scripts.

• The scripts will then return data and exit codes to Micro Focus OO for notification to Micro Focus CSA.

The goal is to propose a modular architecture that is easily reusable and adaptable. A set of playbooks implements simple functions. For example, there is a playbook to allocate nodes, another to allocate a VLAN, another to create storage on NFS, another to create a storage partition, and so on.
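The flow described above, in which an orchestrator calls small single-purpose scripts and collects their data and exit codes, can be sketched as follows. This is an assumption-laden illustration: the step names mirror the examples in the text, but `run_step`, `run_pipeline`, and `fake_step` are hypothetical, and the stand-in commands simply print JSON instead of running real playbooks.

```python
import json
import subprocess
import sys
from typing import Dict, List, Tuple


def run_step(name: str, command: List[str]) -> Dict:
    """Run one deployment step; report its exit code and any JSON it printed."""
    result = subprocess.run(command, capture_output=True, text=True)
    try:
        data = json.loads(result.stdout) if result.stdout.strip() else {}
    except json.JSONDecodeError:
        data = {"raw": result.stdout}
    return {"step": name, "exit_code": result.returncode, "data": data}


def run_pipeline(steps: List[Tuple[str, List[str]]]) -> List[Dict]:
    """Run steps in order and stop at the first non-zero exit code."""
    reports = []
    for name, command in steps:
        report = run_step(name, command)
        reports.append(report)
        if report["exit_code"] != 0:
            break
    return reports


def fake_step(payload: Dict, exit_code: int = 0) -> List[str]:
    """Stand-in for a real script: prints `payload` as JSON, then exits."""
    code = (f"import json, sys; print(json.dumps({payload!r})); "
            f"sys.exit({exit_code})")
    return [sys.executable, "-c", code]


steps = [
    ("allocate_nodes", fake_step({"nodes": ["n1", "n2"]})),
    ("allocate_vlan", fake_step({"vlan": 120})),
    ("create_nfs_storage", fake_step({"error": "quota"}, exit_code=3)),
    ("create_storage_partition", fake_step({})),
]
reports = run_pipeline(steps)
for report in reports:
    print(report)
```

Because each step is an independent command with a uniform result shape, steps can be reordered or reused in other workflows without changing the orchestrating code.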


Provisioning resources with the Hybrid HPC platform

HPE and Advania have partnered to create a demo How-to video that shows how easy it is to provision HPC resources from a service provider's HPC Cloud offering using the dedicated self-service portal (shown in Figure 9).


[Figure 8 diagram. Labels: HPC advisory services and managed services; HPC Cloud Management Platform (Micro Focus CSA/OO); Other Cloud Management Platform (new cloud stack, custom stack, and so on); Application program interface (API); Deployment & management scripts; DB; HPE Insight CMU; Partner's integration; Containers; Virtualization; Service provider tools; Server; Network; Storage; Public cloud]

Figure 8. Positioning the deployment scripts within the IaaS reference architecture

Figure 9. Service provider marketplace self-service portal


The video presents the service provider marketplace portal and how a customer can easily subscribe to an HPC offering. The customer selects an offering in a catalog and subscribes by selecting a few options: number of servers, type of servers, type of storage required, and so on.

An automation engine runs in the background to build the requested infrastructure. Upon completion, the customer receives information to access the new infrastructure (shown in Figure 10).
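To make the subscription flow concrete, the sketch below turns the options a customer selects into an ordered list of provisioning steps. It is a hypothetical illustration: the `Subscription` type, the `plan_steps` function, and the sample option values are assumptions, not part of the portal or of Micro Focus CSA.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Subscription:
    """Options selected in the catalog (field names are illustrative)."""
    server_count: int
    server_type: str
    storage_type: str


def plan_steps(sub: Subscription) -> List[str]:
    """Translate a subscription into the steps an automation engine might run."""
    if sub.server_count < 1:
        raise ValueError("at least one server is required")
    return [
        f"allocate {sub.server_count} x {sub.server_type} nodes",
        "allocate a VLAN",
        f"create {sub.storage_type} storage",
        "send access details to the customer",
    ]


request = Subscription(server_count=8, server_type="compute", storage_type="NFS")
for step in plan_steps(request):
    print(step)
```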


Figure 10. Service provider marketplace portal service details

Being able to provision HPC resources without dependency on the service or hosting provider is an important attribute of Hybrid HPC. The HPC IaaS offering ensures that time zone limitations or late replies are eliminated, resulting in a zero-wait experience for customers.

Next steps: Open APIs and the Hybrid HPC platform

One of the advantages of a cloud platform over traditional infrastructure is the ability to manipulate it using APIs. These APIs can be exposed at many of the architectural levels, from user-facing portals such as Micro Focus CSA to the IaaS platform such as OpenStack. In moving forward with the Hybrid HPC platform, HPE is looking to bring those open standards together in order to allow flexibility of choice and a common interface to the environments being leveraged. The OpenStack architecture and the modularity of the Hybrid HPC platform allow the two to be blended. This combination provides integration points with the following:

• The provisioning of compute resources

– Virtual

– Bare metal

• The deployment of container clusters

• The deployment of noncompute resources

– Storage

– Software-defined networking


The benefit of adding an OpenStack-based IaaS into Hybrid HPC is that the breadth of technical advancement and integration offered by a wide-ranging open source community becomes immediately accessible to and consumable by that platform. For example, the plug-ins and drivers that the community provides for third-party hardware and software allow the catalog of services to be extended easily. The multitenancy-based approach and integration with third-party authentication and authorization systems such as Lightweight Directory Access Protocol (LDAP) and Security Assertion Markup Language (SAML) provide the security required for a multitenant platform. The metrics provided by standard services such as Ceilometer, Monasca, and Panko make it possible to instrument aspects of the platform and to take action when events occur or thresholds are crossed.
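The idea of acting on metrics when thresholds are crossed can be sketched in a few lines. This is a generic illustration only: it does not use the Ceilometer, Monasca, or Panko APIs, and the `Alarm` type, the `evaluate` function, and the metric names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Alarm:
    metric: str                            # name of the metric to watch
    threshold: float                       # fire when the sample exceeds this
    action: Callable[[str, float], str]    # what to do when the alarm fires


def evaluate(alarms: List[Alarm], samples: Dict[str, float]) -> List[str]:
    """Run the action of every alarm whose metric exceeds its threshold."""
    fired = []
    for alarm in alarms:
        value = samples.get(alarm.metric)
        if value is not None and value > alarm.threshold:
            fired.append(alarm.action(alarm.metric, value))
    return fired


def scale_out(metric: str, value: float) -> str:
    return f"scale out: {metric}={value}"


def notify(metric: str, value: float) -> str:
    return f"notify operator: {metric}={value}"


alarms = [
    Alarm("cpu_util", 90.0, scale_out),
    Alarm("disk_used_pct", 80.0, notify),
]
samples = {"cpu_util": 95.5, "disk_used_pct": 42.0}
print(evaluate(alarms, samples))
```

In a real deployment the samples would come from the platform's telemetry service and the actions would call the orchestration layer; here both are plain Python stand-ins.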


[Figure 11 diagram. Labels: Cloud Management Platform; OpenStack; OpenStack control plane; Controller hardware; HPE OneView/HPE Insight CMU/Redfish; Physical compute resources; Ironic; Magnum; Hypervisor; Operating system; VM workloads; HPC schedulers; HPC jobs; Deep/machine learning; CICD pipeline; TensorFlow; Kubernetes; Docker; CoreOS; Docker registry; Prometheus]

Figure 11. An example of HPC and AI co-existing in an OpenStack environment


Figure 11 shows an example deployment in which the OpenStack control services are used to deploy a traditional HPC cluster (virtualized in the example, although bare metal is also possible) alongside a containerized AI cluster running the TensorFlow framework. These services could be logically separated while sharing data pools and network connectivity, which would allow a pipeline through both mechanisms for the same data sets where required. Because the two technologies together allow raw data to be processed by simulation and the results of those experiments to be interpreted using AI, platforms such as this will become ever more useful.

From a costing perspective, this also offers flexibility in both domains and allows each area to scale as the need justifies, optimizing costs for HPC users.

Using filters and schedulers within the IaaS environment allows the appropriate hardware to be selected for the requested task. One example might be providing GPU hardware to the TensorFlow workloads and high-memory CPU nodes to the classic HPC environment, which might not be able to leverage GPUs in the same manner.
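Filter-based hardware selection can be illustrated with a short sketch. It shows only the general technique of passing candidate hosts through a chain of predicates; it is not the OpenStack scheduler API, and the `Host` type, the filter functions, and the sample inventory are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Host:
    name: str
    gpus: int
    memory_gb: int


HostFilter = Callable[[Host], bool]


def select_hosts(hosts: List[Host], filters: List[HostFilter]) -> List[Host]:
    """Keep only the hosts that pass every filter."""
    return [h for h in hosts if all(f(h) for f in filters)]


def has_gpu(host: Host) -> bool:
    return host.gpus > 0


def min_memory(min_gb: int) -> HostFilter:
    """Build a filter that requires at least `min_gb` of memory."""
    return lambda host: host.memory_gb >= min_gb


# Made-up inventory for the sketch.
inventory = [
    Host("gpu01", gpus=4, memory_gb=256),
    Host("cpu01", gpus=0, memory_gb=1024),
    Host("cpu02", gpus=0, memory_gb=128),
]

tensorflow_hosts = select_hosts(inventory, [has_gpu])
hpc_hosts = select_hosts(inventory, [min_memory(512)])
print([h.name for h in tensorflow_hosts])
print([h.name for h in hpc_hosts])
```

Keeping each requirement as a separate filter means a new hardware constraint can be added without touching the selection logic.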

Reference and additional resources

This section lists the resources used in this document. The following documentation provides more background or detail on product features that are described only at a high level earlier in this document.


Product | Document | Link
HPE Datacenter Care | HPE Datacenter Care information | hpe.com/us/en/services/datacenter-care-options.html
HPE Datacenter Care | HPE Datacenter Care for Hyperscale documentation | hpe.com/h20195/v2/Getdocument.aspx?docname=4AA4-4708ENW
HPE Datacenter Care | HPE Datacenter Care for Hyperscale addendum documentation | hpe.com/h20195/v2/Getdocument.aspx?docname=4AA6-3460ENW
HPE Datacenter Care | HPE Datacenter Care for Hyperscale video | youtube.com/watch?v=F2zP2kiNUeI
HPE Partner Ready Program | HPE Partner Ready for Service Providers Program | h22168.www2.hpe.com/us/en/partner-ready-sp/index.aspx
HPE Partner Ready Program | Partner Ready for Service Providers Program solution brief | hpe.com/h20195/v2/getdocument.aspx?docname=4AA5-7800ENW
HPE Flexible Capacity | HPE Flexible Capacity information | hpe.com/us/en/services/flexible-capacity.html
HPE Flexible Capacity | HPE Flexible Capacity: Go hybrid and get the best of both worlds | hpe.com/h20195/v2/Getdocument.aspx?docname=4AA4-4248ENW
Hewlett Packard Labs | The Who, What, Why and How of High Performance Computing Applications in the Cloud | hpl.hp.com/techreports/2013/HPL-2013-49.pdf
Cloud28+ | Open community of cloud service providers, cloud resellers, ISVs, integrators, government entities, and so on | cloud28plus.com
UberCloud Compendium | UberCloud Compendium information | theubercloud.com/ubercloud-compendium-2015
Hybrid Cloud Security | Hybrid Cloud Security for Dummies guide documentation | hpe.com/emea_europe/en/resources/cloud/cloud-security-for-dummies.html
Demo How-to video | Provisioning resources with HPE and Advania demo video information | advania.com/hpc/news-resources/view/2017/04/28/DEMO-Video-How-to-provision-in-Advania-HPCaaS/?re=true&nid=339055c5-2c05-11e7-9408-005056bc0bdb
HPE community cloud | Example of community cloud where HPE is participating | virtualfortknox.de/en/
NIST | Official cloud computing definition documentation | nvlpubs.nist.gov/nistpubs/Legacy/SP/nistspecialpublication800-145.pdf
Hybrid Cloud | HPE Hybrid Cloud Solutions landing page | hpe.com/us/en/solutions/cloud.html
Micro Focus CSA | Cloud Service Automation information | software.microfocus.com/en-us/software/cloud-service-automation

Table 2. List of relevant documents and product documentation


Learn more at hpe.com/us/en/solutions/hpc-high-performance-computing

© Copyright 2017 Hewlett Packard Enterprise Development LP. The information contained herein is subject to change without notice. The only warranties for Hewlett Packard Enterprise products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. Hewlett Packard Enterprise shall not be liable for technical or editorial errors or omissions contained herein.

Intel is a trademark of Intel Corporation in the U.S. and other countries. Microsoft is either a registered trademark or trademark of Microsoft Corporation in the United States and/or other countries. The OpenStack Word Mark is either a registered trademark/service mark or trademark/service mark of the OpenStack Foundation, in the United States and other countries and is used with the OpenStack Foundation’s permission. We are not affiliated with, endorsed or sponsored by the OpenStack Foundation or the OpenStack community. ITIL® is a registered trade mark of AXELOS Limited. All other third-party trademark(s) is/are property of their respective owner(s).

a00029377ENW, November 2017
