SUSE Best Practices

SUSE® Enterprise Storage on Ampere® eMAG Reference Architecture

SUSE Enterprise Storage 6, Ampere eMAG

Bryan Gartner, Senior Technology Strategist, SUSE


The objective of this document is to present a step-by-step guide on how to implement SUSE Enterprise Storage® (6) on the Ampere® eMAG platform. It is suggested that the document be read in its entirety, along with the supplemental appendix information, before attempting the process.

Disclaimer: Documents published as part of the SUSE Best Practices series have been contributed voluntarily by SUSE employees and third parties. They are meant to serve as examples of how particular actions can be performed. They have been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. SUSE cannot verify that actions described in these documents do what is claimed or whether actions described have unintended consequences. SUSE LLC, its affiliates, the authors, and the translators may not be held liable for possible errors or the consequences thereof.

Publication Date: 2020-07-27

Contents

1 Introduction
2 Target Audience
3 Business Value
4 Hardware & Software
5 Requirements
6 Architectural Overview
7 Component Model
8 Deployment
9 Conclusion
10 Appendix A: Bill of Materials
11 Appendix B: policy.cfg
12 Appendix C: Network Switch Configuration
13 Appendix D: OS Networking Configuration
14 Resources
15 Legal Notice
16 GNU Free Documentation License

1 Introduction

This guide presents step-by-step instructions for implementing SUSE Enterprise Storage (6) on the Ampere eMAG platform. It is suggested that the document be read in its entirety, along with the supplemental appendix information, before attempting the process.

The deployment presented in this guide aligns with architectural best practices and will support the implementation of all currently supported protocols as identified in the SUSE Enterprise Storage documentation.

Upon completion of the steps in this document, a working SUSE Enterprise Storage (6) cluster will be operational as described in the SUSE Enterprise Storage Deployment Guide (https://documentation.suse.com/ses/6/single-html/ses-deployment/#book-storage-deployment).

2 Target Audience

This reference guide is targeted at administrators who deploy software defined storage solutions within their data centers and make that storage available to end users. By following this document, as well as those referenced herein, the administrator should have a full view of the SUSE Enterprise Storage architecture, deployment and administrative tasks, with a specific set of recommendations for deployment of the hardware and networking platform.

3 Business Value

SUSE Enterprise Storage

SUSE Enterprise Storage delivers a highly scalable, resilient, self-healing storage system designed for large scale environments ranging from hundreds of terabytes to petabytes. This software defined storage product can reduce IT costs by leveraging industry standard servers to present unified storage servicing block, file, and object protocols. Having storage that can meet the current needs and requirements of the data center while supporting topologies and protocols demanded by new web-scale applications enables administrators to support the ever-increasing storage requirements of the enterprise with ease.

Ampere eMAG

The Ampere eMAG server is a high-performance, power-efficient, data-center-class platform featuring 32 Ampere-designed 64-bit Armv8 cores running at up to 3.3 GHz. Designed for cloud data center workloads, the eMAG server is ideal for scalable-performance applications like the SUSE Enterprise Storage stack. The server processor has the following features:

32 Ampere Armv8 64-bit CPU cores at 3.3 GHz Sustained - SBSA Level 3

32 KB L1 I-cache, 32KB L1 D-cache per core

Shared 256 KB L2 cache per 2 cores

32MB globally shared L3 cache

8x 72-bit DDR4-2667 channels

ECC, ChipKill, and DDR4 RAS features

Up to 16 DIMMs and 1TB/socket

42 lanes of PCIe Gen 3, with 8 controllers

TDP: 75-125W

Also included in this configuration are the following key peripherals and infrastructure components that can be used to build a very high performance Ceph based storage cluster:

Micron

Enterprise IT and cloud managers want the fast, low latency and consistent performance of NVMe storage that won’t break the budget.

The 7300 NVMe SSDs leverage the low power consumption and price-performance efficiencies of 3D NAND technology and deliver fast NVMe IOPS and GB/s for a wide array of workloads.

Built with the innovative 96-layer 3D TLC NAND, the 5300 series combines the latest in NAND technology and a proven architecture to provide performance upgrades now and a path forward for moving to an all-flash future. The 5300’s high capacity, added security, and enhanced endurance enable strong performance.

NVIDIA

System Network Interface Card

The MCX653105A-HDAT ConnectX-6 VPI Adapter is the world’s first 200Gb/s capable HDR InfiniBand and Ethernet network adapter card, offering industry-leading performance, smart offloads and in-network computing, leading to the highest return on investment for high-performance computing, cloud, web 2.0, storage and machine learning applications.

Network Switch

Spectrum-2 MSN3700C is a 1U 32-port 100GbE spine that can also be used as a high density 10/25GbE leaf when used with splitter cables. SN3700C allows for maximum flexibility, with ports spanning from 1GbE to 100GbE and port density that enables full rack connectivity to any server at any speed, and a variety of blocking ratios. SN3700C ports are fully splittable to up to 128 x 10/25GbE ports.

Broadcom

The high-port 9500-16i Tri-Mode, PCIe Gen 4.0 HBA is ideal for increased connectivity and maximum performance for enterprise data center flexibility. With increased bandwidth and IOPS performance compared to previous generations, the 9500-16i adapter delivers the performance and scalability needed by critical applications.

Connects up to 1024 SAS/SATA devices or 32 NVMe devices

Provides maximum connectivity and performance for high-end servers and applications

Support critical applications with the bandwidth of PCIe® 4.0 connectivity

Universal Bay Management (UBM) Ready

4 Hardware & Software

The recommended architecture for SUSE Enterprise Storage on Ampere eMAG leverages two models of Ampere servers. The role and functionality of each type of system within the SUSE Enterprise Storage environment will be explained in more detail in the architectural overview section.

STORAGE NODES:

Ampere eMAG 32 Core 2U Servers (Lenovo HR350A)

ADMIN, MONITOR, AND PROTOCOL GATEWAYS:

Ampere eMAG 32 Core 1U Servers (Lenovo HR330A)

SWITCHES:

NVIDIA Spectrum-2 MSN3700C 100Gb

SOFTWARE:

SUSE Enterprise Storage (6)

SUSE Linux Enterprise Server 15 SP1

TIP

Please note that limited use SUSE Linux Enterprise Server operating system subscriptions are provided with SUSE Enterprise Storage as part of the subscription entitlement.

5 Requirements

Enterprise storage systems require reliability, manageability, and serviceability. The legacy storage players have established a high threshold for each of these areas and now expect the software defined storage solutions to offer the same. Focusing on these areas helps SUSE make open source technology enterprise consumable. When combined with highly reliable and manageable hardware from Ampere, the result is a solution that meets the customer’s expectation.

5.1 Functional Requirements

A SUSE Enterprise Storage solution is:

Simple to set up and deploy, within the documented guidelines of system hardware, networking and environmental prerequisites.

Adaptable to the physical and logical constraints needed by the business, both initially and as needed over time for performance, security, and scalability concerns.

Resilient to changes in physical infrastructure components, caused by failure or required maintenance.

Capable of providing optimized object and block services to client access nodes, either directly or through gateway services.

6 Architectural Overview

This architecture overview section complements the SUSE Enterprise Storage Technical Overview (https://www.suse.com/docrep/documents/1mdg7eq2kz/suse_enterprise_storage_technical_overview_wp.pdf) document available online, which presents the concepts behind software defined storage and Ceph as well as a quick start guide (non-platform specific).

6.1 Solution Architecture

SUSE Enterprise Storage provides unified block, file, and object access based on Ceph. Ceph is a distributed storage solution designed for scalability, reliability and performance. A critical component of Ceph is the RADOS object storage. RADOS enables a number of storage nodes to function together to store and retrieve data from the cluster using object storage techniques. The result is a storage solution that is abstracted from the hardware. Ceph supports both native and traditional client access. The native clients are aware of the storage topology and communicate directly with the storage daemons over the public network, resulting in horizontally scaling performance. Non-native protocols, such as iSCSI, S3, and NFS, require the use of gateways. While these gateways may be thought of as a limiting factor, the iSCSI and S3 gateways can scale horizontally using load balancing techniques.

FIGURE 1: CEPH ARCHITECTURE

In addition to the required network infrastructure, the minimum SUSE Enterprise Storage cluster comprises one administration server (physical or virtual), four object storage device nodes (OSDs), and three monitor nodes (MONs).

SPECIFIC TO THIS IMPLEMENTATION:

One system is deployed as the administrative host server. The administration host is the Salt master and hosts the SUSE Enterprise Storage Administration Interface, openATTIC, the central management system that supports the cluster.

Three systems are deployed as monitor (MON) nodes. Monitor nodes maintain information about the cluster health state, a map of the other monitor nodes and a CRUSH map. They also keep a history of changes performed to the cluster.

Additional servers may be deployed as iSCSI gateway nodes. iSCSI is a storage area network (SAN) protocol that allows clients (called initiators) to send SCSI commands to SCSI storage devices (targets) on remote servers. This protocol is utilized for block-based connectivity to environments such as Microsoft Windows, VMware, and traditional UNIX. These systems may be scaled horizontally through client usage of multi-path technology.

The RADOS gateway provides S3 and Swift based access methods to the cluster. These nodes are generally situated behind a load balancer infrastructure to provide redundancy and scalability. It is important to note that the load generated by the RADOS gateway can consume a significant amount of compute and memory resources, so the minimum recommended configuration contains 6-8 CPU cores and 32GB of RAM.

SUSE Enterprise Storage requires a minimum of four systems as storage nodes. The storage nodes contain individual storage devices that are each assigned an Object Storage Daemon (OSD). The OSD assigned to the device stores data and manages the data replication and rebalancing processes. OSDs also communicate with the monitor (MON) nodes and provide them with the state of the other OSDs.

6.2 Networking Architecture

A software-defined solution is only as reliable as its slowest and least redundant component. This makes it important to design and implement a robust, high performance storage network infrastructure. From a network perspective for Ceph, this translates into:

Separation of cluster (backend) and client-facing (public) network traffic. This isolates Ceph OSD replication activities from Ceph clients. This may be achieved through separate physical networks or through use of VLANs.

Redundancy and capacity in the form of bonded network interfaces connected to switches.

The following figure shows the logical layout of the traditional Ceph cluster implementation.

FIGURE 2: CEPH NETWORK ARCHITECTURE

7 Component Model

The preceding sections provided information on both the overall Ampere hardware and the Ceph software architecture. In this section, the focus is on the SUSE components: SUSE Linux Enterprise Server (SLES), SUSE Enterprise Storage (SES), and the Repository Mirroring Tool (RMT).

COMPONENT OVERVIEW (SUSE)

SUSE Linux Enterprise Server - A world class secure, open source server operating system, equally adept at powering physical, virtual, or cloud-based mission-critical workloads. Service Pack 1 further raises the bar in helping organizations to accelerate innovation, enhance system reliability, meet tough security requirements and adapt to new technologies.

Repository Mirroring Tool (RMT) for SLES - allows enterprise customers to optimize the management of SUSE Linux Enterprise (and extensions such as SUSE Enterprise Storage) software updates and subscription entitlements. It establishes a proxy system for SUSE Customer Center (SCC) with repository and registration targets.

SUSE Enterprise Storage - Provided as an extension on top of SUSE Linux Enterprise Server, this intelligent software-defined storage solution, powered by Ceph technology with enterprise engineering and support from SUSE, enables customers to transform enterprise infrastructure to reduce costs while providing unlimited scalability.

8 Deployment

This deployment section should be seen as a supplement to the online documentation (https://www.suse.com/documentation/), specifically the SUSE Enterprise Storage (6) Deployment Guide (https://documentation.suse.com/ses/6/single-html/ses-deployment/#book-storage-deployment) and the SUSE Linux Enterprise Server Administration Guide (https://documentation.suse.com/sles/15-SP1/single-html/SLES-admin/#book-sle-admin). It is assumed that a Repository Mirroring Tool server exists within the environment. If not, please follow the information in Repository Mirroring Tool (RMT) for SLES (https://documentation.suse.com/sles/15-SP1/single-html/SLES-rmt/#book-rmt) to make one available. The emphasis is on specific design and configuration choices.

8.1 Network Deployment Overview

The following considerations for the network configuration should be attended to:

Ensure that all network switches are updated with consistent firmware versions.

Specific configuration for this deployment can be found in Appendix C: Network Switch Configuration and Appendix D: OS Networking Configuration.

Network IP addressing and IP ranges need proper planning. In optimal environments, a single storage subnet should be used for all SUSE Enterprise Storage nodes on the primary network, with a separate, single subnet for the cluster network. Depending on the size of the installation, ranges larger than /24 may be required. When planning the network, current as well as future growth should be taken into consideration.

Set up DNS A records for all nodes. Decide on subnets and VLANs and configure the switch ports accordingly.

Ensure that you have access to a valid, reliable NTP service, as this is a critical requirement for all nodes. If no such service is available, it is recommended to use the admin node as the time source.
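The subnet sizing point in the list above comes down to simple arithmetic: an IPv4 prefix of length p leaves 2^(32-p) - 2 usable host addresses. The following is a minimal sketch of that calculation; the function name usable_hosts is hypothetical and only serves as an illustration.

```shell
# Print the number of usable host addresses in an IPv4 subnet,
# given its prefix length (network and broadcast addresses excluded).
# usable_hosts is a hypothetical helper name, not a SUSE tool.
usable_hosts() {
    local prefix="$1"
    if [ "$prefix" -lt 1 ] || [ "$prefix" -gt 30 ]; then
        echo "prefix must be between 1 and 30" >&2
        return 1
    fi
    echo $(( (1 << (32 - prefix)) - 2 ))
}
```

For example, usable_hosts 24 reports the capacity of a /24, and usable_hosts 23 shows how much room the next larger range provides for future growth.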

Function  Hostname             Primary Network (VLAN)  Cluster Network (VLAN)
Admin     amp-admin.suse.lab   172.16.227.60           N/A
Monitor   amp-mon1.suse.lab    172.16.227.61           N/A
Monitor   amp-mon2.suse.lab    172.16.227.62           N/A
Monitor   amp-mon3.suse.lab    172.16.227.63           N/A
Gateway   amp-gw1.suse.lab     172.16.227.64           N/A
Gateway   amp-gw2.suse.lab     172.16.227.65           N/A
OSD       amp-osd1.suse.lab    172.16.227.59           172.16.220.59
OSD       amp-osd2.suse.lab    172.16.227.58           172.16.220.58
OSD       amp-osd3.suse.lab    172.16.227.57           172.16.220.57
OSD       amp-osd4.suse.lab    172.16.227.56           172.16.220.56
OSD       amp-osd5.suse.lab    172.16.227.55           172.16.220.55
OSD       amp-osd6.suse.lab    172.16.227.54           172.16.220.54
OSD       amp-osd7.suse.lab    172.16.227.53           172.16.220.53
OSD       amp-osd8.suse.lab    172.16.227.52           172.16.220.52
OSD       amp-osd9.suse.lab    172.16.227.51           172.16.220.51
OSD       amp-osd10.suse.lab   172.16.227.50           172.16.220.50
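An address plan like the one above is easy to get wrong when maintained by hand. The following is a minimal sketch, assuming the plan is available as plain "hostname ip" lines (a hypothetical format chosen for this example), that flags any address assigned to more than one host.

```shell
# Print every IP address that is assigned to more than one host.
# Input (stdin): one "hostname ip" pair per line.
# find_duplicate_ips is a hypothetical helper name used for illustration.
find_duplicate_ips() {
    awk 'NF >= 2 { count[$2]++ }
         END { for (ip in count) if (count[ip] > 1) print ip }' | sort
}
```

Piping one network column of the plan through find_duplicate_ips prints nothing when every address is unique.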

8.2 Operating System Installation

Several key tasks must be performed correctly during the operating system installation.

During the SUSE Linux Enterprise installation, be sure to register the system with an update server. Ideally, this is a local RMT server, which will reduce the time required for updates to be downloaded and applied to all nodes. By updating the nodes during installation, the system will deploy with the most up-to-date packages available, helping to ensure the best experience possible.

To speed installation, on the System Role screen, it is suggested to select Text Mode. The resulting installation is a text mode server that is an appropriate base OS for SUSE Enterprise Storage.

The next item is to ensure that the operating system is installed on the correct device. Especially on OSD nodes, the system may not choose the right drive by default. The proper way to ensure the right device is being used is to select Create Partition Setup on the Suggested Partitioning screen. This will then display a list of devices, allowing selection of the correct boot device. Next select Edit Proposal Settings and unselect the Propose Separate Home Partition checkbox.

Ensure that NTP is configured to point to a valid, physical NTP server. This is critical for SUSE Enterprise Storage to function properly, and failure to do so can result in an unhealthy or non-functional cluster.

8.3 SUSE Enterprise Storage Installation & Configuration

8.3.1 Software Deployment Configuration (DeepSea and Salt)

Salt, along with DeepSea, is a stack of components that help deploy and manage server infrastructure. It is very scalable, fast, and relatively easy to get running.

There are three key Salt imperatives that need to be followed:

The Salt master is the host that controls the entire cluster deployment. Ceph itself should NOT be running on the master, as all resources should be dedicated to Salt master services. In our scenario, we used the Admin host as the Salt master.

Salt minions are nodes controlled by the Salt master. OSD, monitor, and gateway nodes are all Salt minions in this installation.

Salt minions need to correctly resolve the Salt master’s host name over the network. This can be achieved through configuring unique host names per interface (e.g. osd1-cluster.suse.lab and osd1-public.suse.lab) in DNS and/or local /etc/hosts files.
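The per-interface host names mentioned above can be generated rather than typed by hand. The following is a minimal sketch, assuming the -public and -cluster suffix convention from the example names; the function name hosts_entries is hypothetical, and the output is only printed, not written to /etc/hosts.

```shell
# Print two /etc/hosts lines for one node: one for its public (primary)
# address and one for its cluster address.
# hosts_entries is a hypothetical helper name used for illustration.
hosts_entries() {
    local short="$1" domain="$2" public_ip="$3" cluster_ip="$4"
    printf '%s\t%s-public.%s %s-public\n' "$public_ip" "$short" "$domain" "$short"
    printf '%s\t%s-cluster.%s %s-cluster\n' "$cluster_ip" "$short" "$domain" "$short"
}
```

For example, hosts_entries osd1 suse.lab 172.16.227.59 172.16.220.59 prints the two lines for the first OSD node; after review, the output can be appended to /etc/hosts on each node.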

DeepSea consists of a series of Salt files to automate the deployment and management of a Ceph cluster. It consolidates the administrator’s decision making in a single location around cluster assignment, role assignment and profile assignment. DeepSea collects each set of tasks into a goal or stage.

The following steps, performed in order, will be used for this reference implementation:

1. Install DeepSea on the Salt master which is the Admin node:

zypper in deepsea

2. Start the salt-master service and enable it:

systemctl start salt-master.service
systemctl enable salt-master.service

3. Install the salt-minion on all cluster nodes (including the Admin):

zypper in salt-minion

4. Configure all minions to connect to the Salt master by modifying the entry for master in /etc/salt/minion:

master: sesadmin.domain.com

5. Start the salt-minion service and enable it:

systemctl start salt-minion.service
systemctl enable salt-minion.service

6. Accept all Salt keys on the Salt master and verify their acceptance:

salt-key --accept-all
salt-key --list-all

7. Select the nodes to participate in the cluster:

salt '*' grains.append deepsea default

8. If the OSD nodes were used in a prior installation, zap ALL the OSD disks (ceph-disk zap <DISK>)

9. At this point, the cluster can be deployed.

a. Prepare the cluster:

salt-run state.orch ceph.stage.prep

b. Run the discover stage to collect data from all minions and create configuration frag-ments:

salt-run state.orch ceph.stage.discovery

c. A proposal for the storage layout needs to be generated at this time. For the hardwareconfiguration used for this work, the following command was utilized:

salt-run proposal.populate name=default target='amp-osd*'

The result of the above command is a deployment proposal for the disks that places the RocksDB, Write-Ahead Log (WAL), and data on the same device.

d. A /srv/pillar/ceph/proposals/policy.cfg file needs to be created to instruct Salt on the location and configuration files to use for the different components that make up the Ceph cluster (Salt master, admin, monitor, and OSDs).

See Appendix B for the policy.cfg file used in the installation.

e. Next, proceed with the configuration stage to parse the policy.cfg file and merge the included files into the final form:

salt-run state.orch ceph.stage.configure

f. The last two steps manage the actual deployment. Deploy monitors and OSD daemons first:

salt-run state.orch ceph.stage.deploy

Note

The command can take some time to complete, depending on the size of the cluster.

g. Check for successful completion via:

ceph -s

h. Finally, deploy the services-gateways (iSCSI, RADOS, and openATTIC to name a few):

salt-run state.orch ceph.stage.services
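Because each stage depends on the previous one having succeeded, the stage commands above can be wrapped in a small helper that stops at the first failure. The following is a minimal sketch; run_stages and the DRY_RUN variable are hypothetical names introduced for this example and are not part of DeepSea, while the salt-run invocations are the ones listed in the steps.

```shell
# Run the named DeepSea stages in order and stop at the first failure.
# Set DRY_RUN=1 to print the commands instead of executing them.
# run_stages and DRY_RUN are hypothetical names, not part of DeepSea.
run_stages() {
    local stage
    for stage in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "salt-run state.orch ceph.stage.${stage}"
        else
            salt-run state.orch "ceph.stage.${stage}" || {
                echo "stage ${stage} failed, stopping" >&2
                return 1
            }
        fi
    done
}
```

A first deployment would call run_stages prep discovery, then generate the proposal and create policy.cfg as described above, and finally call run_stages configure deploy services.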

8.3.2 Post-deployment quick test

The steps below can be used (regardless of the deployment method) to validate the overall cluster health:

ceph status
ceph osd pool create test 1024
rados bench -p test 300 write --no-cleanup
rados bench -p test 300 seq

Once the tests are complete, you can remove the test pool via:

ceph tell 'mon.*' injectargs --mon-allow-pool-delete=true
ceph osd pool delete test test --yes-i-really-really-mean-it
ceph tell 'mon.*' injectargs --mon-allow-pool-delete=false

8.4 Deployment Considerations

Some final considerations apply before deploying your own version of a SUSE Enterprise Storage cluster, based on Ceph. As previously stated, please refer to the Administration and Deployment Guides.

With the default replication setting of 3, remember that the client-facing network will have about half or less of the traffic of the backend network. This is especially true when component failures occur or rebalancing happens on the OSD nodes. For this reason, it is important not to under-provision this critical cluster and service resource.

It is important to maintain the minimum number of monitor nodes at three. As the cluster increases in size, it is best to increment in pairs, keeping the total number of MON nodes as an odd number. However, only very large or very distributed clusters would likely need beyond the 3 MON nodes cited in this reference implementation. For performance reasons, it is recommended to use distinct nodes for the MON roles, so that the OSD nodes can be scaled as capacity requirements dictate.

As described in this implementation guide and the SUSE Enterprise Storage documentation, a minimum of four OSD nodes is recommended, with the default replication setting of 3. This will ensure cluster operation, even with the loss of a complete OSD node. Generally speaking, performance of the overall cluster increases as more properly configured OSD nodes are added.
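Two of the considerations above reduce to short calculations. With a replication setting of 3, each client write reaches the primary OSD once over the public network and is then copied to the two other replicas over the cluster network, which is why backend write traffic is roughly twice the client-facing traffic. Monitors need a strict majority to form a quorum, which is why an even count adds no extra fault tolerance. The following is a minimal sketch of both calculations; the function names are hypothetical and the figures ignore recovery and rebalancing traffic.

```shell
# Replica traffic on the cluster network caused by client writes:
# every write is copied to (replicas - 1) additional OSDs.
# Both function names are hypothetical, for illustration only.
backend_write_mb() {
    local client_mb="$1" replicas="$2"
    echo $(( client_mb * (replicas - 1) ))
}

# Number of monitor failures a cluster can tolerate while still
# keeping a strict majority (quorum) of monitors.
mon_failures_tolerated() {
    local mons="$1"
    local quorum=$(( mons / 2 + 1 ))
    echo $(( mons - quorum ))
}
```

Comparing mon_failures_tolerated 3 with mon_failures_tolerated 4 shows that a fourth monitor does not increase the number of failures tolerated, whereas a fifth does.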

9 Conclusion

The Ampere eMAG servers provide a strong capacity-oriented platform for enterprise, HPC or cloud Ceph-based storage clusters. In addition to the strong raw performance demonstrated by this configuration, as characterized in industry standard benchmarks like the IO500 workload, the Ampere systems provide a very compelling value proposition when combining their high performance with the ultra-efficient power profile and the lighter than expected acquisition cost of the cluster. These features, combined with the access flexibility and reliability of SUSE Enterprise Storage and industry leading support from Ampere, allow any business to proceed confidently with a solution that addresses many storage use cases driven by the exponential growth in storage capacity and performance currently facing the industry.

10 Appendix A: Bill of Materials

Role                  Qty  Component           Notes
Admin, monitor, and   6    Ampere 1U Servers   Configuration:
protocol gateways          (Lenovo HR330A)     1x Ampere eMAG 8180 32 Core 3.3GHz
                                               32GB DRAM (4x8 DIMM 2667)
                                               2x Micron 7300 PRO NVMe M.2 480GB
                                               1x NVIDIA MCX653105A-HDAT ConnectX-6 VPI Adapter

OSD Nodes             10   Ampere 2U Servers   Configuration:
                           (Lenovo HR350A)     1x Ampere eMAG 8180 32 Core 3.3GHz
                                               128GB DRAM (8x16 DIMM 2667)
                                               2x Micron 240GB NVMe M.2
                                               4x Micron 7300 PRO NVMe M.2 480GB
                                               1x Broadcom BRCM 9500-16i HBA
                                               1x NVIDIA MCX653105A-HDAT ConnectX-6 VPI Adapter

Network Switch        2    NVIDIA Spectrum-2   Updated with latest OS image
                           MSN3700C Switch

11 Appendix B: policy.cfg

cluster-ceph/cluster/*.sls
role-master/cluster/amp-admin*.sls
role-admin/cluster/amp-admin*.sls
role-mon/cluster/amp-mon*.sls
role-mgr/cluster/amp-mon*.sls
role-storage/cluster/amp-osd*.sls
role-mds/cluster/amp-[mo]*.sls
role-grafana/cluster/amp-admin*.sls
role-prometheus/cluster/amp-admin*.sls
config/stack/default/global.yml
config/stack/default/ceph/cluster.yml

12 Appendix C: Network Switch Configuration

The switch uplinks are configured with a LAG. The load generation nodes are blade servers connected with 16 10Gb Ethernet ports bonded in two LACP bonds, one to each upstream switch. The cluster network carries back-end traffic and uses VLAN 220.

##
## Active saved database "c3-mellanox-s3700"
## Generated at 2020/07/13 20:53:19 +0000
## Hostname: switch-6bdea0
## Product release: 3.9.0914
##

##
## Running-config temporary prefix mode setting
##
no cli default prefix-modes enable

##
## Interface Ethernet configuration
##
   interface port-channel 28
   interface port-channel 30
   fae interface ethernet 1/1 speed 100G no-autoneg
   fae interface ethernet 1/2 speed 100G no-autoneg
   fae interface ethernet 1/3 speed 100G no-autoneg
   fae interface ethernet 1/4 speed 100G no-autoneg
   fae interface ethernet 1/5 speed 100G no-autoneg
   fae interface ethernet 1/6 speed 100G no-autoneg
   fae interface ethernet 1/7 speed 100G no-autoneg
   fae interface ethernet 1/8 speed 100G no-autoneg
   fae interface ethernet 1/9 speed 100G no-autoneg
   fae interface ethernet 1/10 speed 100G no-autoneg
   fae interface ethernet 1/11 speed 100G no-autoneg
   fae interface ethernet 1/12 speed 100G no-autoneg
   fae interface ethernet 1/13 speed 100G no-autoneg
   fae interface ethernet 1/14 speed 100G no-autoneg
   fae interface ethernet 1/15 speed 100G no-autoneg
   fae interface ethernet 1/16 speed 100G no-autoneg
   fae interface ethernet 1/30 speed 100G no-autoneg
   interface ethernet 1/1-1/16 mtu 9216 force
   interface ethernet 1/28-1/30 mtu 9216 force
   interface port-channel 28 mtu 9216 force
   interface ethernet 1/1-1/16 switchport mode hybrid
   interface ethernet 1/28-1/29 channel-group 28 mode on
   interface ethernet 1/30-1/32 switchport mode hybrid
   interface port-channel 28 switchport mode hybrid
   interface port-channel 28 description uplink LACP

##
## LAG configuration
##
   lacp
   interface port-channel 28 lacp-individual enable force
   port-channel load-balance ethernet source-destination-mac

##
## VLAN configuration
##
   vlan 197
   vlan 220-2227
   interface ethernet 1/1-1/16 switchport access vlan 197
   interface ethernet 1/1-1/16 switchport hybrid allowed-vlan all
   interface ethernet 1/30-1/32 switchport hybrid allowed-vlan all
   interface port-channel 28 switchport hybrid allowed-vlan all
   vlan 197 name "pxe"
   vlan 220 name "storage"
   vlan 227 name "storage2"

13 Appendix D: OS Networking Configuration

Each host is configured with an active-passive bond. This alleviates the need for switch-based configuration to support the bonding and still provides sufficient bandwidth for all IO requests.

/etc/sysconfig/network # cat ifcfg-eth0
BOOTPROTO='dhcp'
STARTMODE='auto'
#
/etc/sysconfig/network # cat ifcfg-vlan227
BOOTPROTO='static'
BROADCAST=''
ETHERDEVICE='eth0'
ETHTOOL_OPTIONS=''
IPADDR='172.16.227.50/24'
MTU=''
NAME=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='auto'
VLAN_ID='227'
#
/etc/sysconfig/network # cat ifcfg-vlan220
BOOTPROTO='static'
BROADCAST=''
ETHERDEVICE='eth0'
ETHTOOL_OPTIONS=''
IPADDR='172.16.220.50/24'
MTU=''
NAME=''
NETWORK=''
REMOTE_IPADDR=''
STARTMODE='auto'
VLAN_ID='220'
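Since the VLAN interface files in this appendix differ only in VLAN ID and address, they lend themselves to being generated per node. The following is a minimal sketch that prints the body of such a file using only keys that appear in the files shown in this appendix; the function name make_vlan_ifcfg is hypothetical, and the empty-valued keys are omitted on the assumption that they are optional.

```shell
# Print the body of an ifcfg-vlan<ID> file for a static VLAN interface
# on top of a parent device (default: eth0).
# make_vlan_ifcfg is a hypothetical helper name used for illustration.
make_vlan_ifcfg() {
    local vlan_id="$1" ipaddr="$2" parent="${3:-eth0}"
    cat <<EOF
BOOTPROTO='static'
ETHERDEVICE='${parent}'
IPADDR='${ipaddr}'
STARTMODE='auto'
VLAN_ID='${vlan_id}'
EOF
}
```

For example, the output of make_vlan_ifcfg 220 172.16.220.50/24 could be redirected to /etc/sysconfig/network/ifcfg-vlan220 after review.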

14 Resources

SUSE Enterprise Storage Technical Overview https://www.suse.com/docrep/documents/1mdg7eq2kz/suse_enterprise_storage_technical_overview_wp.pdf

SUSE Enterprise Storage (6) Deployment Guide https://documentation.suse.com/ses/6/single-html/ses-deployment/#book-storage-deployment

SUSE Linux Enterprise Server 15 SP1 Administration Guide https://documentation.suse.com/sles/15-SP1/single-html/SLES-admin/#book-sle-admin

Repository Mirroring Tool https://documentation.suse.com/sles/15-SP1/single-html/SLES-rmt/#book-rmt

Armv8 https://developer.arm.com/architectures/cpu-architecture/a-profile

Ampere https://amperecomputing.com/

Micron operating system and storage drives https://www.micron.com/products/ssd/product-lines/5300 https://www.micron.com/products/ssd/product-lines/7300

Broadcom BRCM 9500-16i HBA https://www.broadcom.com/products/storage/host-bus-adapters/sas-nvme-9500-16i

NVIDIA System Network Interface Card MCX653105A-HDAT ConnectX-6 VPI Adapter https://store.mellanox.com/products/mellanox-mcx653105a-hdat-sp-single-pack-connectx-6-vpi-adapter-card-hdr-ib-and-200gbe-single-port-qsfp56-pcie4-0-x16-tall-bracket.html

NVIDIA Network Switch Spectrum-2 MSN3700C https://www.mellanox.com/products/ethernet-switches/sn3000

15 Legal Notice

Copyright © 2006–2022 SUSE LLC and contributors. All rights reserved.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or (at your option) version 1.3; with the Invariant Section being this copyright notice and license. A copy of the license version 1.2 is included in the section entitled "GNU Free Documentation License".

SUSE, the SUSE logo and YaST are registered trademarks of SUSE LLC in the United States and other countries. For SUSE trademarks, see https://www.suse.com/company/legal/.

Linux is a registered trademark of Linus Torvalds. All other names or trademarks mentioned in this document may be trademarks or registered trademarks of their respective owners.

Documents published as part of the SUSE Best Practices series have been contributed voluntarily by SUSE employees and third parties. They are meant to serve as examples of how particular actions can be performed. They have been compiled with utmost attention to detail. However, this does not guarantee complete accuracy. SUSE cannot verify that actions described in these documents do what is claimed or whether actions described have unintended consequences. SUSE LLC, its affiliates, the authors, and the translators may not be held liable for possible errors or the consequences thereof.

Below we draw your attention to the license under which the articles are published.

16 GNU Free Documentation License

Copyright © 2000, 2001, 2002 Free Software Foundation, Inc. 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

0. PREAMBLE

The purpose of this License is to make a manual, textbook, or other functional and useful document "free" in the sense of freedom: to assure everyone the effective freedom to copy and redistribute it, with or without modifying it, either commercially or noncommercially. Secondarily, this License preserves for the author and publisher a way to get credit for their work, while not being considered responsible for modifications made by others.

This License is a kind of "copyleft", which means that derivative works of the document must themselves be free in the same sense. It complements the GNU General Public License, which is a copyleft license designed for free software.

We have designed this License in order to use it for manuals for free software, because free software needs free documentation: a free program should come with manuals providing the same freedoms that the software does. But this License is not limited to software manuals; it can be used for any textual work, regardless of subject matter or whether it is published as a printed book. We recommend this License principally for works whose purpose is instruction or reference.

1. APPLICABILITY AND DEFINITIONS

This License applies to any manual or other work, in any medium, that contains a notice placed by the copyright holder saying it can be distributed under the terms of this License. Such a notice grants a world-wide, royalty-free license, unlimited in duration, to use that work under the conditions stated herein. The "Document", below, refers to any such manual or work. Any member of the public is a licensee, and is addressed as "you". You accept the license if you copy, modify or distribute the work in a way requiring permission under copyright law.

A "Modified Version" of the Document means any work containing the Document or a portion of it, either copied verbatim, or with modifications and/or translated into another language.

A "Secondary Section" is a named appendix or a front-matter section of the Document that deals exclusively with the relationship of the publishers or authors of the Document to the Document’s overall subject (or to related matters) and contains nothing that could fall directly within that overall subject. (Thus, if the Document is in part a textbook of mathematics, a Secondary Section may not explain any mathematics.) The relationship could be a matter of historical connection with the subject or with related matters, or of legal, commercial, philosophical, ethical or political position regarding them.

The "Invariant Sections" are certain Secondary Sections whose titles are designated, as being those of Invariant Sections, in the notice that says that the Document is released under this License. If a section does not fit the above definition of Secondary then it is not allowed to be designated as Invariant. The Document may contain zero Invariant Sections. If the Document does not identify any Invariant Sections then there are none.

The "Cover Texts" are certain short passages of text that are listed, as Front-Cover Texts or Back-Cover Texts, in the notice that says that the Document is released under this License. A Front-Cover Text may be at most 5 words, and a Back-Cover Text may be at most 25 words.

A "Transparent" copy of the Document means a machine-readable copy, represented in a format whose specification is available to the general public, that is suitable for revising the document straightforwardly with generic text editors or (for images composed of pixels) generic paint programs or (for drawings) some widely available drawing editor, and that is suitable for input to text formatters or for automatic translation to a variety of formats suitable for input to text formatters. A copy made in an otherwise Transparent file format whose markup, or absence of markup, has been arranged to thwart or discourage subsequent modification by readers is not Transparent. An image format is not Transparent if used for any substantial amount of text. A copy that is not "Transparent" is called "Opaque".

Examples of suitable formats for Transparent copies include plain ASCII without markup, Texinfo input format, LaTeX input format, SGML or XML using a publicly available DTD, and standard-conforming simple HTML, PostScript or PDF designed for human modification. Examples of transparent image formats include PNG, XCF and JPG. Opaque formats include proprietary formats that can be read and edited only by proprietary word processors, SGML or XML for which the DTD and/or processing tools are not generally available, and the machine-generated HTML, PostScript or PDF produced by some word processors for output purposes only.

The "Title Page" means, for a printed book, the title page itself, plus such following pages as are needed to hold, legibly, the material this License requires to appear in the title page. For works in formats which do not have any title page as such, "Title Page" means the text near the most prominent appearance of the work’s title, preceding the beginning of the body of the text.

A section "Entitled XYZ" means a named subunit of the Document whose title either is precisely XYZ or contains XYZ in parentheses following text that translates XYZ in another language. (Here XYZ stands for a specific section name mentioned below, such as "Acknowledgements", "Dedications", "Endorsements", or "History".) To "Preserve the Title" of such a section when you modify the Document means that it remains a section "Entitled XYZ" according to this definition.

The Document may include Warranty Disclaimers next to the notice which states that this License applies to the Document. These Warranty Disclaimers are considered to be included by reference in this License, but only as regards disclaiming warranties: any other implication that these Warranty Disclaimers may have is void and has no effect on the meaning of this License.

2. VERBATIM COPYING

You may copy and distribute the Document in any medium, either commercially or noncommercially, provided that this License, the copyright notices, and the license notice saying this License applies to the Document are reproduced in all copies, and that you add no other conditions whatsoever to those of this License. You may not use technical measures to obstruct or control the reading or further copying of the copies you make or distribute. However, you may accept compensation in exchange for copies. If you distribute a large enough number of copies you must also follow the conditions in section 3.

You may also lend copies, under the same conditions stated above, and you may publicly display copies.

3. COPYING IN QUANTITY

If you publish printed copies (or copies in media that commonly have printed covers) of the Document, numbering more than 100, and the Document’s license notice requires Cover Texts, you must enclose the copies in covers that carry, clearly and legibly, all these Cover Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on the back cover. Both covers must also clearly and legibly identify you as the publisher of these copies. The front cover must present the full title with all words of the title equally prominent and visible. You may add other material on the covers in addition. Copying with changes limited to the covers, as long as they preserve the title of the Document and satisfy these conditions, can be treated as verbatim copying in other respects.

If the required texts for either cover are too voluminous to fit legibly, you should put the first ones listed (as many as fit reasonably) on the actual cover, and continue the rest onto adjacent pages.

If you publish or distribute Opaque copies of the Document numbering more than 100, you must either include a machine-readable Transparent copy along with each Opaque copy, or state in or with each Opaque copy a computer-network location from which the general network-using public has access to download using public-standard network protocols a complete Transparent copy of the Document, free of added material. If you use the latter option, you must take reasonably prudent steps, when you begin distribution of Opaque copies in quantity, to ensure that this Transparent copy will remain thus accessible at the stated location until at least one year after the last time you distribute an Opaque copy (directly or through your agents or retailers) of that edition to the public.

It is requested, but not required, that you contact the authors of the Document well before redistributing any large number of copies, to give them a chance to provide you with an updated version of the Document.

4. MODIFICATIONS

You may copy and distribute a Modified Version of the Document under the conditions of sections 2 and 3 above, provided that you release the Modified Version under precisely this License, with the Modified Version filling the role of the Document, thus licensing distribution and modification of the Modified Version to whoever possesses a copy of it. In addition, you must do these things in the Modified Version:

A. Use in the Title Page (and on the covers, if any) a title distinct from that of the Document, and from those of previous versions (which should, if there were any, be listed in the History section of the Document). You may use the same title as a previous version if the original publisher of that version gives permission.

B. List on the Title Page, as authors, one or more persons or entities responsible for authorship of the modifications in the Modified Version, together with at least five of the principal authors of the Document (all of its principal authors, if it has fewer than five), unless they release you from this requirement.

C. State on the Title page the name of the publisher of the Modified Version, as the publisher.

D. Preserve all the copyright notices of the Document.

E. Add an appropriate copyright notice for your modifications adjacent to the other copyright notices.

F. Include, immediately after the copyright notices, a license notice giving the public permission to use the Modified Version under the terms of this License, in the form shown in the Addendum below.

G. Preserve in that license notice the full lists of Invariant Sections and required Cover Texts given in the Document’s license notice.

H. Include an unaltered copy of this License.

I. Preserve the section Entitled "History", Preserve its Title, and add to it an item stating at least the title, year, new authors, and publisher of the Modified Version as given on the Title Page. If there is no section Entitled "History" in the Document, create one stating the title, year, authors, and publisher of the Document as given on its Title Page, then add an item describing the Modified Version as stated in the previous sentence.

J. Preserve the network location, if any, given in the Document for public access to a Transparent copy of the Document, and likewise the network locations given in the Document for previous versions it was based on. These may be placed in the "History" section. You may omit a network location for a work that was published at least four years before the Document itself, or if the original publisher of the version it refers to gives permission.

K. For any section Entitled "Acknowledgements" or "Dedications", Preserve the Title of the section, and preserve in the section all the substance and tone of each of the contributor acknowledgements and/or dedications given therein.

L. Preserve all the Invariant Sections of the Document, unaltered in their text and in their titles. Section numbers or the equivalent are not considered part of the section titles.

M. Delete any section Entitled "Endorsements". Such a section may not be included in the Modified Version.

N. Do not retitle any existing section to be Entitled "Endorsements" or to conflict in title with any Invariant Section.

O. Preserve any Warranty Disclaimers.

If the Modified Version includes new front-matter sections or appendices that qualify as Secondary Sections and contain no material copied from the Document, you may at your option designate some or all of these sections as invariant. To do this, add their titles to the list of Invariant Sections in the Modified Version’s license notice. These titles must be distinct from any other section titles.

You may add a section Entitled "Endorsements", provided it contains nothing but endorsements of your Modified Version by various parties--for example, statements of peer review or that the text has been approved by an organization as the authoritative definition of a standard.

You may add a passage of up to five words as a Front-Cover Text, and a passage of up to 25 words as a Back-Cover Text, to the end of the list of Cover Texts in the Modified Version. Only one passage of Front-Cover Text and one of Back-Cover Text may be added by (or through arrangements made by) any one entity. If the Document already includes a cover text for the same cover, previously added by you or by arrangement made by the same entity you are acting on behalf of, you may not add another; but you may replace the old one, on explicit permission from the previous publisher that added the old one.

The author(s) and publisher(s) of the Document do not by this License give permission to use their names for publicity for or to assert or imply endorsement of any Modified Version.

5. COMBINING DOCUMENTS

You may combine the Document with other documents released under this License, under the terms defined in section 4 above for modified versions, provided that you include in the combination all of the Invariant Sections of all of the original documents, unmodified, and list them all as Invariant Sections of your combined work in its license notice, and that you preserve all their Warranty Disclaimers.

The combined work need only contain one copy of this License, and multiple identical Invariant Sections may be replaced with a single copy. If there are multiple Invariant Sections with the same name but different contents, make the title of each such section unique by adding at the end of it, in parentheses, the name of the original author or publisher of that section if known, or else a unique number. Make the same adjustment to the section titles in the list of Invariant Sections in the license notice of the combined work.

In the combination, you must combine any sections Entitled "History" in the various original documents, forming one section Entitled "History"; likewise combine any sections Entitled "Acknowledgements", and any sections Entitled "Dedications". You must delete all sections Entitled "Endorsements".

6. COLLECTIONS OF DOCUMENTS

You may make a collection consisting of the Document and other documents released under this License, and replace the individual copies of this License in the various documents with a single copy that is included in the collection, provided that you follow the rules of this License for verbatim copying of each of the documents in all other respects.

You may extract a single document from such a collection, and distribute it individually under this License, provided you insert a copy of this License into the extracted document, and follow this License in all other respects regarding verbatim copying of that document.

7. AGGREGATION WITH INDEPENDENT WORKS

A compilation of the Document or its derivatives with other separate and independent documents or works, in or on a volume of a storage or distribution medium, is called an "aggregate" if the copyright resulting from the compilation is not used to limit the legal rights of the compilation’s users beyond what the individual works permit. When the Document is included in an aggregate, this License does not apply to the other works in the aggregate which are not themselves derivative works of the Document.

If the Cover Text requirement of section 3 is applicable to these copies of the Document, then if the Document is less than one half of the entire aggregate, the Document’s Cover Texts may be placed on covers that bracket the Document within the aggregate, or the electronic equivalent of covers if the Document is in electronic form. Otherwise they must appear on printed covers that bracket the whole aggregate.

8. TRANSLATION

Translation is considered a kind of modification, so you may distribute translations of the Document under the terms of section 4. Replacing Invariant Sections with translations requires special permission from their copyright holders, but you may include translations of some or all Invariant Sections in addition to the original versions of these Invariant Sections. You may include a translation of this License, and all the license notices in the Document, and any Warranty Disclaimers, provided that you also include the original English version of this License and the original versions of those notices and disclaimers. In case of a disagreement between the translation and the original version of this License or a notice or disclaimer, the original version will prevail.

If a section in the Document is Entitled "Acknowledgements", "Dedications", or "History", the requirement (section 4) to Preserve its Title (section 1) will typically require changing the actual title.

9. TERMINATION

You may not copy, modify, sublicense, or distribute the Document except as expressly provided for under this License. Any other attempt to copy, modify, sublicense or distribute the Document is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

10. FUTURE REVISIONS OF THIS LICENSE

The Free Software Foundation may publish new, revised versions of the GNU Free Documentation License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. See http://www.gnu.org/copyleft/.

Each version of the License is given a distinguishing version number. If the Document specifies that a particular numbered version of this License "or any later version" applies to it, you have the option of following the terms and conditions either of that specified version or of any later version that has been published (not as a draft) by the Free Software Foundation. If the Document does not specify a version number of this License, you may choose any version ever published (not as a draft) by the Free Software Foundation.

ADDENDUM: How to use this License for your documents

Copyright (c) YEAR YOUR NAME. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.

If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts, replace the “with…Texts.” line with this:

with the Invariant Sections being LIST THEIR TITLES, with the Front-Cover Texts being LIST, and with the Back-Cover Texts being LIST.

If you have Invariant Sections without Cover Texts, or some other combination of the three, merge those two alternatives to suit the situation.

If your document contains nontrivial examples of program code, we recommend releasing these examples in parallel under your choice of free software license, such as the GNU General Public License, to permit their use in free software.
