Upload
others
View
0
Download
0
Embed Size (px)
Citation preview
REFERENCE ARCHITECTURE
Containers as a Service Integrate Kubernetes and Pure Storage solutions.
Contents Introduction ........................................................................................................................................................................................ 4
Containers as a Service (CaaS) ......................................................................................................................................................... 4
CaaS Defined ........................................................................................................................................................................................................................... 4
Components, Prerequisites, and Configuration ............................................................................................................................... 4
Pure Storage FlashArray ...................................................................................................................................................................................................... 4
Purity for FlashArray (Purity//FA 5) ..................................................................................................................................................................................... 5
Technical Specifications* ..................................................................................................................................................................................................... 5
Pure Storage FlashBlade ................................................................................................................................................................... 6
Purpose-Built for Modern Analytics .................................................................................................................................................................................. 6
The FlashBlade Difference .................................................................................................................................................................................................. 7
Purity for FlashBlade (Purity//FB) ........................................................................................................................................................................................ 8
Pure1 ................................................................................................................................................................................................... 8
Pure1 Manage .......................................................................................................................................................................................................................... 8
Pure1 Analyze ........................................................................................................................................................................................................................... 8
Pure1 Support ........................................................................................................................................................................................................................... 8
Pure1 Meta ................................................................................................................................................................................................................................. 8
Evergreen™ Storage ........................................................................................................................................................................... 8
Pure Service Orchestrator ................................................................................................................................................................. 9
Kubernetes ......................................................................................................................................................................................... 9
High-level Design ............................................................................................................................................................................ 10
Software Version Details ................................................................................................................................................................. 10
Compute ............................................................................................................................................................................................ 11
Networking ........................................................................................................................................................................................ 11
Deployment ............................................................................................................................................................................................................................. 11
Persistent Storage ................................................................................................................................................................................................................ 12
Pure Service Orchestrator Installation ............................................................................................................................................ 12
PSO Installation (using Helm) ............................................................................................................................................................................................ 12
Validate Correct Plugin Installation ................................................................................................................................................................................. 14
Scaling Backend Storage .................................................................................................................................................................................................. 15
Simple Pod Deployment with a Pure Storage FlashArray Persistent Volume .................................................................................................. 15
Persistent Volume Claim ................................................................................................................................................................. 15
REFERENCE ARCHITECTURE
3
Application Deployment ................................................................................................................................................................... 17
Simple Multiple Pod Deployment with a Pure Storage FlashBlade Persistent Volume ................................................................................ 19
Persistent Volume Claim ................................................................................................................................................................. 20
Nginx Application Pod ..................................................................................................................................................................... 21
Additional Application Pod .............................................................................................................................................................. 23
Adding Nodes to a Kubernetes Cluster ....................................................................................................................................................................... 24
Conclusion ....................................................................................................................................................................................... 25
Appendix: Application Examples .................................................................................................................................................... 26
MongoDB ............................................................................................................................................................................................................................... 26
WordPress .............................................................................................................................................................................................................................. 28
About the Author ............................................................................................................................................................................. 31
4
Introduction
This document provides a practical reference implementation to help integrate Pure Storage® products into the deployment of a bare-metal Kubernetes infrastructure. You can easily scale this underlying infrastructure to whatever size is required. This document assumes that you understand how to deploy a bare-metal Kubernetes solution and provides details only for Pure Storage integration pieces. Find links to details on Kubernetes deployments in the Appendix.
Containers as a Service (CaaS)
CaaS Defined
Containers as a Service is an implementation of container-based virtualization, where container engines, underlying
compute servers, and orchestration toolsets are made available to users from a provider. The providers range from the big
three cloud providers, down to private, on-premises, company-owned solutions.
CaaS can provide users with an architecture to enable DevOps teams the agility to automate ‘code check-in and go-live’
process for containerized solutions, which can significantly reduce the time to deploy and time to go-live into production for
these applications. CaaS is for deploying applications where there is a requirement for more control over the components
of applications and a requirement for developers to have a greater understanding of the build and run processes required
by the application. For example, in a CaaS environment a developer who has written an application in, say, Python, needs
to understand how to create an empty container image with a base filesystem and then move the code into the container
locally. You might then have a requirement to compile the code, download dependencies, and finally create a Docker
image. Only when the image has been created can it be used in the CaaS platform.
Components, Prerequisites, and Configuration
Pure Storage FlashArray
The Pure Storage FlashArray™ family delivers purpose-built, software-
defined all-flash power and reliability for businesses of every size.
FlashArray is all-flash enterprise storage that is up to 10X faster, more
space and power-efficient, more reliable, and far simpler than other
available solutions. Critically, FlashArray also costs less, with a TCO that’s
typically 50% lower than traditional performance disk arrays.
At the top of the FlashArray line is FlashArray//X – the first mainstream, 100% NVMe, enterprise-class all-flash array. //X
represents a higher performance tier for mission-critical databases, top-of-rack flash deployments, and Tier 1 application
REFERENCE ARCHITECTURE
5
consolidation. //X, at 1PB in 3U, with hundred-microsecond range latency and GBs of bandwidth, delivers an unprecedented
level of performance density that makes possible previously unattainable levels of consolidation. FlashArray//X is ideal for
cost-effective consolidation of everything on flash. Whether accelerating a single database, scaling virtual desktop
environments, or powering an all-flash cloud, there is an //X model that fits your needs.
Purity for FlashArray (Purity//FA 5)
At the heart of every FlashArray is Purity Operating Environment software. Purity//FA5 implements advanced data
reduction, storage management, and flash management features, enabling your organization to enjoy Tier 1 data services
for all workloads, proven 99.9999% availability over two years (inclusive of maintenance and generational upgrades),
completely non-disruptive operations, 2X better data reduction versus alternative all-flash solutions, and – with
FlashArray//X – the power and efficiency of DirectFlash™. Moreover, Purity includes enterprise-grade data security,
comprehensive data protection options, and complete business continuity via Pure Storage ActiveCluster™ multi-site stretch
cluster. All these features are included with every array.
Technical Specifications*
Capacity** Physical
//X10 Up to 55TB/53.5TIB effective capacity
Up to 20TB/18.6TIB raw capacity
3PU, 490–600 Watts (normal–peak)
95lbs (43.1kg) fully loaded, 5.12”x18.94”x29.72” chassis
//X20 Up to 55TB/53.5TIB effective capacity
Up to 20TB/18.6TIB raw capacity
3PU, 490–600 Watts (normal–peak)
95lbs (43.1kg) fully loaded, 5.12”x18.94”x29.72” chassis
//X50 Up to 55TB/53.5TIB effective capacity
Up to 20TB/18.6TIB raw capacity
3PU, 490–600 Watts (normal–peak)
95lbs (43.1kg) fully loaded, 5.12”x18.94”x29.72” chassis
//X70 Up to 55TB/53.5TIB effective capacity
Up to 20TB/18.6TIB raw capacity
3PU, 490–600 Watts (normal–peak)
95lbs (43.1kg) fully loaded, 5.12”x18.94”x29.72” chassis
//X90 Up to 55TB/53.5TIB effective capacity
Up to 20TB/18.6TIB raw capacity
3PU, 490–600 Watts (normal–peak)
95lbs (43.1kg) fully loaded, 5.12”x18.94”x29.72” chassis
Direct Flash Shelf Up to 55TB/53.5TIB effective capacity
Up to 20TB/18.6TIB raw capacity
3PU, 490–600 Watts (normal–peak)
95lbs (43.1kg) fully loaded, 5.12”x18.94”x29.72” chassis
* Stated //X specifications are applicable to //X R2 versions.
** Effective capacity assumes HA, RAID, and metadata overhead, GB-to-GiB conversion, and includes the benefit of data reduction with always-on inline deduplication, compression, and pattern
removal. Average data reduction is calculated at 5-to-1 and does not include thin provisioning.
REFERENCE ARCHITECTURE
6
//X Connectivity
Onboard Ports (per controller) • 2x1/10/25 GB Ethernet
• 2x1/10/25 GB Ethernet Replication
• 2x1GB Management Ports
Host I/O Cards (3 slots/controller) • 2-port 10 GBase-T Ethernet
• 2-port 1/10/25 GB Ethernet
• 2-port 40GB Ethernet
• 2-port 50GB Ethernet (NVMe-oF Ready)
• 2-port 16/32GB Fibre Channel (NVMe-oF Ready)
• 4-port 16/32GB Fibre Channel (NVMe-oF Ready)
Pure Storage FlashBlade
FlashBlade™ is a new, innovative scale-out storage system designed
to accelerate modern analytics applications while providing best-of-
breed performance in all dimensions of concurrency – including
IOPS, throughput, latency, and capacity. FlashBlade is as simple as it
is powerful, offering elastic scale-out storage services at every layer
alongside DirectFlash technology for global flash management.
Purpose-Built for Modern Analytics
FlashBlade is the industry’s first cloud-era flash purpose-built for modern analytics, delivering unprecedented performance
for big data applications. Its massively distributed architecture enables consistent performance for all analytics applications
using NFS, S3/Object, SMB, and HTTP protocols.
FAST• Elastic performance that
grows with data, up to 17 GB/s
• Always-fast, from small to
large files
• Massively parallel
architecture from software
to flash
BIG• Petabytes of capacity
• Elastic concurrency,
up to 10s of thousands
of clients
• 10s of billions of objects
and files
SIMPLE• Evergreen™ – don’t rebuy
TBs you already own
• “Tuned for Everything”
design, no manual
optimizations required
• Scale-out everything
instantly by simply
adding blades
REFERENCE ARCHITECTURE
7
The FlashBlade Difference
BLADESCALE-OUT DIRECTFLASH + COMPUTEUltra-low latency, 8, 17, and 52TB
capacity options that can be
hot-plugged into the system for
expansion and performance
PURITY//FBSCALE-OUT STORAGE SOFTWAREThe heart of FlashBlade,
implementing its scale-out
storage capabilities, services,
and management
FABRICSOFTWARE- DEFINED NETWORKINGIncludes a built in 40Gb Ethernet
fabric, providing a total network
bandwidth of 320Gb/s for the chassis
POWER, DENSITY, EFFICIENCYFlashBlade delivers industry-leading throughput,
IOPS, latency, and capacity – with up to 20x less
space and 10x less power and cooling.
REFERENCE ARCHITECTURE
8
Purity for FlashBlade (Purity//FB)
FlashBlade is built on the scale-out metadata architecture of Purity for FlashBlade, capable of handling 10s of billions of files
and objects while delivering maximum performance, effortless scale, and global flash management. The distributed
transaction database built into the core of Purity means storage services at every layer are elastic: simply adding blades
grows system capacity and performance, linearly and instantly. Purity//FB supports S3-compliant object store, offering ultra-
fast performance at scale. It also supports File protocol including NFSv3 and SMB, and offers a wave of new enterprise
features, like snapshots, LDAP, network lock management (NLM), and IPv6, to extend FlashBlade into new use cases.
Pure1
Pure1®, our cloud-based management, analytics, and support platform, expands the self-managing, plug-n-play design of
Pure all-flash arrays with the machine learning predictive analytics and continuous scanning of Pure1 Meta™ to enable an
effortless, worry-free data platform.
Pure1 Manage
In the Cloud IT operating model, installing and deploying management software is an oxymoron: you simply log in. Pure1
Manage is SaaS-based, allowing you to manage your array from any browser or the Pure1 Mobile App – with nothing extra
to purchase, deploy, or maintain. From a single dashboard, you can manage all your arrays, with full visibility on the health
and performance of your storage.
Pure1 Analyze
Pure1 Analyze delivers accurate performance forecasting – giving you complete visibility into the performance and capacity
needs of your arrays – now and in the future. Performance forecasting enables intelligent consolidation and
unprecedented workload optimization.
Pure1 Support
Pure combines an ultra-proactive support team with the predictive intelligence of Pure1 Meta to deliver unrivalled support
that’s a key component in our proven FlashArray 99.9999% availability. Customers are often surprised and delighted when
we fix issues they did not even know existed.
Pure1 Meta
The foundation of Pure1 services, Pure1 Meta, is global intelligence built from a massive collection of storage array health
and performance data. By continuously scanning call-home telemetry from Pure’s installed base, Pure1 Meta uses machine
learning predictive analytics to help resolve potential issues and optimize workloads. The result is both a white glove
customer support experience and breakthrough capabilities like accurate performance forecasting. Meta is always
expanding and refining what it knows about array performance and health, moving the Data Platform toward a future of
self-driving storage.
Evergreen™ Storage
Customers can deploy storage once and enjoy a subscription to continuous innovation via Pure’s Evergreen Storage
ownership model: expand and improve performance, capacity, density, and/or features for 10 years or more – all without
REFERENCE ARCHITECTURE
9
downtime, performance impact, or data migrations. Pure has disrupted the industry’s 3-5-year rip-and-replace cycle by
engineering compatibility for future technologies right into its products, notably with the NVMe-Ready Guarantee for //M
and online upgrade from any //M to //X.
Pure Service Orchestrator
Since 2017 Pure Storage has been building seamless integrations with container platforms and orchestration engines using
the plugin model, allowing persistent storage to be leveraged by environments such as Kubernetes.
As adoption of container environments move forward, the device plugin model is no longer sufficient to deliver the cloud
experience developers are expecting. This is amplified by the fluid nature of modern containerized environments, where
stateless containers are spun up and spun down within seconds, and stateful containers have much longer lifespans. Some
applications in these environments require block storage, while others require file storage, and a container environment
can rapidly scale to 1000s of containers. These requirements can quickly push past the boundaries of any single storage
system. We designed Pure Service Orchestrator™ to provide your developers with a similar experience to what they expect
from the public cloud. Pure Service Orchestrator can offer a seamless container-as-a-service environment that is:
• Simple, Automated and Integrated: Provisions storage on-demand automatically via policy, and integrates seamlessly,
enabling DevOps and Developer friendly ways to consume storage
• Elastic: Allows you to start small and scale your storage environment with ease and flexibility, mixing and matching
varied configurations as your Swarm environment grows
• Multi-protocol: Support for both file and block
• Enterprise-grade: Deliver the same Tier1 resilience, reliability and protection that your mission-critical applications
depend upon, for stateful applications in your Kubernetes clusters
• Shared: Makes shared storage a viable and preferred architectural choice for the next generation, containerized data
centers by delivering a vastly superior experience relative to direct-attached storage alternatives.
Pure Service Orchestrator integrates seamlessly with your Kubernetes orchestration environment and functions as a
control-plane virtualization layer that enables containers as a service rather than storage as a service.
Kubernetes
Kubernetes is an Open Source system for managing containerized applications across multiple hosts, providing basic
mechanisms for deployment, maintenance, and scaling of applications. The Open Source project is hosted by the Cloud
Native Computing Foundation.
Kubernetes coordinates a highly available cluster of computers that are connected to work as a single unit. The
abstractions in Kubernetes allow you to deploy containerized applications to a cluster without tying them explicitly to
individual machines. To make use of this new model of deployment, applications need to be packaged in a way that
decouples them from individual hosts: they need to be containerized. Containerized applications are more flexible and
available than in past deployment models, where applications were installed directly onto specific machines as packages
deeply integrated into the host. Kubernetes automates the distribution and scheduling of application containers across a
cluster in a more efficient way. Kubernetes is an open-source platform and is production-ready.
REFERENCE ARCHITECTURE
10
High-level Design
The reference implementation used and described in this document consists of an eight-node cluster, consisting of two
kube-master hosts, with all eight nodes being considered node hosts. The clustered etcd key-value store is run over three
nodes. It is responsible for managing the entire cluster, where the node hosts run the applications within pods and
communications between the kube-master nodes and the etcd system using APIs. This is shown in the figure below:
Figure 1.: High-Level Architectural Design
Software Version Details
This table provides the installed software versions for the different components used in building this
Reference Implementation.
Software Version
CentOS 7 kernel 3.10.0-1062
Kubernetes 1.17.3
Docker 19.03.6
HELM 3.1.0
Pure Storage Kubernetes Plugin (Pure Service Orchestrator)
1.0.8
REFERENCE ARCHITECTURE
11
Compute
The role of this Reference Implementation is not to prescribe specific compute platforms for the OpenShift platform.
Therefore, we refer to the servers being used in white-box terms. The servers used here have the following specifications:
• Intel® Xeon® E5-2630 v3 @ 2.40GHz
• 32 vCPU
• 128 GiB memory
Networking
From a networking perspective, the servers in use have the following connected network interfaces:
• 1 x 10GbE (management)
• 1 x 10GbE (iSCSI data plane)
There is no specific network hardware-defined within this document, as this decision is dependent on the actual
implementation performed by the reader. Within the Kubernetes networking layer, this implementation uses the Calico
network plugin, the default provided by kubespray, although there are other network plugins available.
The networking communication between the Pure Service Orchestrator and backing storage devices requires that all
cluster nodes have management plane access to all FlashArray and FlashBlade devices. FlashBlade data plane
communication is performed using the NFS protocol whereas data plane communication between cluster nodes and
FlashArrays is performed using an iSCSI network that can be either layer 2 or layer 3 depending on your network
architecture. In this implementation, the iSCSI data plane is isolated from the management plane network, and jumbo
frames are used end-to-end for the data plane. Fibre Channel is also a supported data plane protocol for FlashArrays, but
this would require HBA cards to be installed in all cluster nodes and for zones to have been created between FlashArray
FC ports and all cluster nodes prior to installing the Pure Service Orchestrator.
Deployment
While it is not in the scope of this document to go into detail on how to build a Kubernetes cluster, this deployment was
implemented using the Kubernetes Incubator project, kubespray. If you decide to use kubespray as your deployment
toolset, you are then recommended to perform the following tasks to ensure a smooth deployment on all cluster nodes:
• Ensure swap is disabled on all cluster nodes and the swap entry is removed from /etc/fstab
• Disable the firewalld software as this will interrupt the Kubernetes API communications within the cluster
• Install the Fedora Extra Packages for Enterprise Linux (EPEL) package
To ensure that all FlashArray connections are optimal, it is necessary to install the latest device-mapper-multipath package
as well as the latest iscsi-initiator-utils package, and then enable both the multipath and iscsid daemons, to ensure they
persist after any reboots. More details can be found in the Pure Knowledge Base article on Linux Recommendations.
Note: It is also advisable to implement the udev rules defined in the Knowledge Base article mentioned above to
ensure optimal performance of your connected Pure Storage volumes.
REFERENCE ARCHITECTURE
12
At this point, the deployment of the Kubernetes cluster can proceed using kubespray1. By default, kubespray installs the
Kubernetes Dashboard so you will want to grant the Dashboard Service Account Admin privileges2 or create a user to
access the dashboard3.
After completing the deployment of your cluster, it is necessary to install Helm4 as the Pure Storage plugin detailed below
uses Helm Charts for deployment.
Persistent Storage
Within Kubernetes, we can use multiple Pure Storage backends to provide persistent storage in the form of Persistent
Volumes for Persistent Volume Claims issued by developers.
The Pure Storage Kubernetes plugin provides both file- and block-based Storage Classes, provisioned from either
FlashArray or FlashBlade storage devices. To make these Storage Classes available to your Kubernetes cluster, you must
install the Pure Service Orchestrator in the form of the Pure Storage Kubernetes plugin.
Pure Service Orchestrator Installation
Installation and configuration of the Pure Service Orchestrator is simple and requires only a few steps, which are described
in the Docker Store location of the Pure Storage Kubernetes plugin. However, there are a couple of actions that need to be
performed on every k8s worker node in your cluster before performing the installation:
• Ensure the latest multipath software package is installed and enabled.
• Ensure the /etc/multipath.conf file exists and contains the Pure Storage stanza as described in the
Linux Best Practices referenced above.
PSO Installation (using Helm)
The Pure Service Orchestrator manages the installation of all required files across your Kubernetes cluster by using a
DaemonSet to perform this cross-node installation. The DaemonSet runs a pod on each node in the cluster, which copies
the required files in the right path on the host for the kubelet to access. It will keep the config updated and ensure that files
are installed safely.
As previously mentioned, the installation of the Pure Service Orchestrator for Kubernetes requires that you have Helm
installed on your Kubernetes cluster. After you have installed the Helm binaries and completed the installation, you should
perform the following steps:
1. Add the pure repo to Helm:
# helm repo add pure http://purestorage.github.io/helm-charts
# helm repo update
# helm search repo pure-csi
1 https://github.com/kubernetes-incubator/kubespray/blob/master/README.md 2 https://github.com/kubernetes/dashboard/wiki/Access-control#admin-privileges 3 https://github.com/kubernetes/dashboard/wiki/Creating-sample-user 4 https://docs.helm.sh/using_helm/#installing-helm
REFERENCE ARCHITECTURE
13
2. Update the PSO configuration file: Enable the Pure Service Orchestrator for Kubernetes to communicate with your
Pure Storage backend arrays, by updating the PSO configuration file to reflect the access information for the
backend storage solutions. The file is called values.yaml and needs to contain the management IP address of the
backend devices, together with a valid, privileged, API token for each device. Additionally, an NFS Data VIP address
is required for each FlashBlade.
Take a copy of the values.yaml provided by the Helm Chart5 and update the parameters for the arrays in the
configuration file with your site-specific information, as shown in the following example:
arrays:
FlashArrays:
- MgmtEndPoint: "1.2.3.4"
APIToken: "a526a4c6-18b0-a8c9-1afa-3499293574bb"
- MgmtEndPoint: "1.2.3.5"
APIToken: "b526a4c6-18b0-a8c9-1afa-3499293574bb"
FlashBlades:
- MgmtEndPoint: "1.2.3.6"
APIToken: "T-c4925090-c9bf-4033-8537-d24ee5669135"
NfsEndPoint: "1.2.3.7"
- MgmtEndPoint: "1.2.3.8"
APIToken: "T-d4925090-c9bf-4033-8537-d24ee5669135"
NfsEndPoint: "1.2.3.9"
Ensure that the values you enter are correct for your own Pure Storage devices.
If you wish to use Fibre Channel as your data protocol for FlashArrays, then you must also change the following
parameter in the configuration file:
flasharray.sanType: FC
Please note that fibre channel support is only for bare-metal installation.
3. Create a Namespace for PSO: Helm3 requires that namespace to install into, therefore create a namespace with the
following command:
# kubectl create namespace <name>
4. Install the plugin: It is advisable to perform a ‘dry run’ installation to ensure that your YAML file is correctly formatted:
# helm install --name pure-storage-driver pure/pure-k8s-plugin -f
<your_own_dir>/<your_own_values>.yaml –namespace <name> --dry-run –-debug
Perform the actual install.
# helm install --name pure-storage-driver pure/pure-csi –namespace <name> -f
<your_own_dir>/<your_own_values>.yaml
The values set in your own YAML will overwrite any default values, but the --set option can also take precedence
over any value in the YAML, for example:
5 Or download from https://raw.githubusercontent.com/purestorage/helm-charts/master/pure-csi/values.yaml
REFERENCE ARCHITECTURE
14
# helm install --name pure-storage-driver pure/pure-csi –namespace <name> -f
<your_own_dir>/<your own values>.yaml –-set flasharray.sanType=FC,namespace.pure=k8s_xxx
The recommendation is to use the values.yaml file rather than the –-set option for ease of use, especially should
modifications be required to your configuration in the future.
Validate Correct Plugin Installation
To ensure that the Pure Service Orchestrator is correctly installed and running, we need to check for a few simple things in
the Kubernetes cluster:
• StorageClass: Three classes should exist.
# kubectl get sc
NAME PROVISIONER AGE
pure pure-provisioner 2d
pure-block pure-provisioner 2d
pure-file pure-provisioner 2d
• DaemonSet:
# kubectl get ds -n <name>
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
pure-csi 8 8 8 8 8 <none> 2d
• Deployment:
# kubectl get statefulset -n <name>
NAME READY AGE
pure-provisioner 1/1 2d
• Service:
# kubectl get service -n <name>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
pure-provisioner ClusterIP 10.233.34.208 <none> 12345/TCP 2d
pure-storage-driver-pure-csi ClusterIP 10.233.4.137 <none> 12345/TCP 2d
• Pods: One pure-csi pod should be running on each cluster node, plus one pure-provisioner pod
# kubectl get pod --namespace <name>
NAME READY STATUS RESTARTS AGE
pure-csi-d8llx 3/3 Running 0 2d
pure-csi-dhpxf 3/3 Running 0 2d
pure-csi-jb8vn 3/3 Running 0 2d
pure-csi-k6q2l 3/3 Running 0 2d
pure-csi-p9pwh 3/3 Running 0 2d
pure-csi-rndzj 3/3 Running 0 2d
pure-csi-w7fpg 3/3 Running 0 2d
pure-csi-wlc5g 3/3 Running 0 2d
pure-provisioner-0 3/3 Running 0 2d
REFERENCE ARCHITECTURE
15
Scaling Backend Storage
As your CaaS platform scales with increased demand from applications, workflows and users, you’ll inevitably face a
demand for additional backend persistent storage to support these applications and workflows.
You may have a block-only persistent storage environment and have been requested to add a file-based solution as well,
or your current block and file backends may be reaching capacity limits. Additionally, you may want to add or change
existing labels.
With the Pure Service Orchestrator, adding additional storage backends or changing labels is seamless and
straightforward. The process is as simple as updating your values YAML file with new labels or adding new FlashArray or
FlashBlade access information and then running this single command:
# helm upgrade pure-storage-driver pure/pure-csi -f <your_own_dir>/<your_own_values>.yaml
If you used the --set option when initially Installing the plugin, you must use the same option again, unless these have
been incorporated into your latest YAML file.
Simple Pod Deployment with a Pure Storage FlashArray Persistent Volume
To validate that the Pure Storage Kubernetes plugin has been configured and installed correctly, we can create a simple
pod with the running Microsoft® SQL Server® for Linux using a persistent volume from the configured Pure Storage
backend. Provided here are two files that we can use to validate the installation and show a working application
deployment. These are YAML files, firstly defining a Pure Storage-based persistent volume claim, and secondly defining the
Nginx application using the persistent volume.
Before we can do any of this work, the SQL Server is going to require a password, and this is obtained from a kubernetes
secret. To create the secret, we must issue this command:
# kubectl create secret generic mssql --from-literal=SA_PASSWORD=<your password>
Persistent Volume Claim
Create a file called sql-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mssql-data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 8Gi
storageClassName: pure-block
REFERENCE ARCHITECTURE
16
Execute the following command:
# kubectl create -f sql-pvc.yaml
This will create a PVC called mssql-data and the Pure Storage Dynamic Provisioner will automatically create a Persistent
Volume to back this claim and be available to a pod that requests it.
The PV created for the PVC can be seen using the following command:
# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
mssql-data Bound pvc-41c347e9-968f-11e8-9c45-0025b5c0808f 8Gi RWO pure-block 18s
From this, we can cross-reference to the actual volume created on FlashArray.
Figure 2. FlashArray GUI Storage > Volumes pane
We can see that the volume name matches the PV name with a prefix of k8s-. This prefix is technically the Namespace
used by the k8s plugin, but in the current version of the plugin, this is a fixed value. Looking more closely at the volume on
FlashArray, we see that it is also not yet connected to any host, as no pod is using the volume.
Figure 3. FlashArray GUI Storage > Selected Volume Pane
REFERENCE ARCHITECTURE
17
Application Deployment
Create a file called sqldeployment.yaml:
apiVersion: apps/v1beta1
kind: Deployment
metadata:
name: mssql-deployment
spec:
replicas: 1
template:
metadata:
labels:
app: mssql
spec:
terminationGracePeriodSeconds: 10
containers:
- name: mssql
image: microsoft/mssql-server-linux
ports:
- containerPort: 1433
securityContext:
privileged: true
env:
- name: ACCEPT_EULA
value: "Y"
- name: SA_PASSWORD
valueFrom:
secretKeyRef:
name: mssql
key: SA_PASSWORD
volumeMounts:
- name: mssqldb
mountPath: /var/opt/mssql
volumes:
- name: mssqldb
persistentVolumeClaim:
claimName: mssql-data
Execute the following command:
# kubectl create -f sqldeployment.yaml
REFERENCE ARCHITECTURE
18
This will create a pod for the mssql-deployment running SQL Server for Linux and the Pure Service Orchestrator will mount
the PV created earlier to the directory /var/opt/mssql within the pod. To find the exact name of the pod created use the
‘kubectl get pods’ command. A lot of information can be gathered regarding the newly created pod – some useful
information is highlighted below:
# kubectl describe pod mssql-deployment-5f9b58fd9b-2nzfm Name: mssql-deployment-5f9b58fd9b-2nzfm Namespace: default Node: sn1-c08-caas-02/10.21.200.62 Start Time: Thu, 02 Aug 2018 14:25:42 -0700 Labels: app=mssql pod-template-hash=1956149856 Annotations: <none> Status: Running IP: 10.233.80.205 Controlled By: ReplicaSet/mssql-deployment-5f9b58fd9b Containers: mssql: Container ID: docker://f38ea59cb674f83aa1db18508029021d131806da8fe0d423767d8ef3260e514c Image: microsoft/mssql-server-linux Image ID: docker-pullable://microsoft/mssql-server-linux@sha256:8231b746946d12a6a1d5e6c7bcb3d983e97ed1fe5ba3ad04e52305060aced166 Port: 1433/TCP Host Port: 0/TCP State: Running Started: Thu, 02 Aug 2018 14:25:46 -0700 Ready: True Restart Count: 0 Environment: ACCEPT_EULA: Y SA_PASSWORD: <set to the key 'SA_PASSWORD' in secret 'mssql'> Optional: false Mounts: /var/opt/mssql from mssqldb (rw) /var/run/secrets/kubernetes.io/serviceaccount from default-token-7v5xw (ro) Conditions: Type Status Initialized True Ready True PodScheduled True Volumes: mssqldb: Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace) ClaimName: mssql-data ReadOnly: false default-token-7v5xw: Type: Secret (a volume populated by a Secret) SecretName: default-token-7v5xw Optional: false QoS Class: BestEffort Node-Selectors: <none> Tolerations: <none> – – –
REFERENCE ARCHITECTURE
19
We can see which node the pod has been created on, and this can be confirmed from the FlashArray GUI.
We can confirm the SQL Server application is working by running sqlcmd to connect to the pod at its internal IP address,
which we can find in the pod description above, and then a couple of simple SQL commands to prove the database is
there.
# /opt/mssql-tools/bin/sqlcmd -S 10.233.80.205 -U sa -P <password>
1> select @@servername
2> go
------------------------------------------------------------------------------------------------
mssql-deployment
(1 rows affected)
1> select name, database_id, create_date from sys.databases ;
2> go
name database_id create_date
------------------------------------------------ ----------- -----------------------
master 1 2003-04-08 09:13:36.390
tempdb 2 2018-08-02 21:25:51.700
model 3 2003-04-08 09:13:36.390
msdb 4 2018-06-30 00:03:38.280
(4 rows affected)
1>
Simple Multiple Pod Deployment with a Pure Storage FlashBlade Persistent Volume
Here we are going to validate that the Pure Service Orchestrator plugin has been configured and installed correctly to
creates NFS based persistent volumes on a Pure Storage FlashBlade backend, that can be shared by multiple pods.
Figure 4. FlashArray GUI Storage > Selected Volume Pane
REFERENCE ARCHITECTURE
20
Provided here are files that we can use to validate the installation and show an end-to-end example. These are YAML files
firstly defining a Pure Storage based persistent volume claim, secondly defining the Nginx application using the persistent
volume and finally defining an additional pod to connect to the same PVC.
Persistent Volume Claim
Create a file called nginx-nfs-pvc.yaml:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: pure-nfs-claim
spec:
accessModes:
- ReadWriteMany
resources:
requests:
storage: 10Gi
storageClassName: pure-file
Execute the following command:
# kubectl create -f nginx-nfs-pvc.yaml
This will create a PVC called pure-nfs-claim and the Pure Service Orchestrator will automatically create a Persistent Volume
to back this claim and be available to a pod that requests it.
The PV created for the PVC can be seen using the following command:
# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE
pure-nfs-claim Bound pvc-a59d3f5a-997a-11e8-9c45-0025b5c0808f 10Gi RWX pure-file 12s
and from this, we can cross-reference to the actual volume created on the Pure Storage FlashBlade.
REFERENCE ARCHITECTURE
21
Again, we can see that the filesystem name matches the PV name with a prefix of k8s-.
Nginx Application Pod
Create a file called nginx-pod-nfs.yaml:
apiVersion: v1
kind: Pod
metadata:
name: nginx-nfs
namespace: default
spec:
volumes:
- name: pure-nfs
persistentVolumeClaim:
claimName: pure-nfs-claim
containers:
- name: nginx-nfs
image: nginx
command:
- sleep
- "3600"
volumeMounts:
- name: pure-nfs
mountPath: /data
ports:
- name: pure
containerPort: 80
Execute the following command:
# kubectl create -f nginx-pod-nfs.yaml
Figure 5. FlashBlade GUI Storage > Storage File Systems Pane
REFERENCE ARCHITECTURE
22
This will create a pod called nginx-pod-nfs that will run the Nginx image and the CSI driver will mount the PV created earlier
to the directory /data within the pod. A lot of information can be gathered regarding the newly created pod as shown
below, but some useful information is highlighted below:
# kubectl describe pod nginx-nfs
Name: nginx-nfs
Namespace: default
Node: sn1-c08-caas-01/10.21.200.61
Start Time: Mon, 06 Aug 2018 06:22:00 -0700
Labels: <none>
Annotations: <none>
Status: Running
IP: 10.233.90.13
Containers:
nginx-nfs:
Container ID: docker://420ae7b6172425432dfa8faacd0b583a4f43235ba622936b9a235e61abe29837
Image: nginx
Image ID: docker-
pullable://nginx@sha256:d85914d547a6c92faa39ce7058bd7529baacab7e0cd4255442b04577c4d1f424
Port: 80/TCP
Host Port: 0/TCP
State: Running
Started: Mon, 06 Aug 2018 06:22:08 -0700
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/data from pure-nfs (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-7v5xw (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
pure-nfs:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same
namespace)
ClaimName: pure-nfs-claim
ReadOnly: false
default-token-7v5xw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-7v5xw
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
REFERENCE ARCHITECTURE
23
Additional Application Pod
Create a new pod definition file called busybox-nfs.yaml:
apiVersion: v1
kind: Pod
metadata:
name: busybox-nfs
namespace: default
spec:
volumes:
- name: pure-nfs-2
persistentVolumeClaim:
claimName: pure-nfs-claim
containers:
- name: busybox-nfs
image: busybox
volumeMounts:
- name: pure-nfs-2
mountPath: /usr/share/busybox
Execute the following command:
# kubectl create -f busybox-nfs.yaml
This will create a second pod, running in the same namespace as the Nginx pod, however we are using the same backing
store by using the same claim name. A lot of information can be gathered regarding the newly created pod as shown
below, but some useful information is highlighted below:
# kubectl describe pod busybox-nfs
Name: busybox-nfs
Namespace: default
Node: sn1-c08-caas-08/10.21.200.68
Start Time: Mon, 06 Aug 2018 06:27:13 -0700
Labels: <none>
Annotations: <none>
Status: Running
IP: 10.233.121.142
Containers:
busybox-nfs:
Container ID: docker://92bbe9949f7ba6251a9035fc33567a2067e3fa422d01014747402d66f9628655
Image: busybox
Image ID: docker-
pullable://busybox@sha256:cb63aa0641a885f54de20f61d152187419e8f6b159ed11a251a09d115fdff9bd
REFERENCE ARCHITECTURE
24
Port: <none>
Host Port: <none>
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 06 Aug 2018 06:27:40 -0700
Finished: Mon, 06 Aug 2018 06:27:40 -0700
Ready: False
Restart Count: 2
Environment: <none>
Mounts:
/usr/share/busybox from pure-nfs-2 (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-7v5xw (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
pure-nfs-2:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same
namespace)
ClaimName: pure-nfs-claim
ReadOnly: false
default-token-7v5xw:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-7v5xw
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
It can be seen that both the nginx and busybox pods are using the same storage claim that is attached to the same NFS
mount point on the backend, but each pod is actually running on a different node in the Kubernetes cluster.
Adding Nodes to a Kubernetes Cluster
This reference implementation is large enough to provide enough resources to run a few simple applications (see
Appendix). However, as a cluster becomes more utilized, it may be necessary to provide additional application or
infrastructure nodes to support other resource requirements.
REFERENCE ARCHITECTURE
25
The current scalability limits for Kubernetes 1.17 are 100 pods per node, assuming you have enough cores per node. The
maximum number of nodes in a single Kubernetes 1.17 cluster is 5000. However, if you are reaching a cluster of this size
you should be using multiple backend storage arrays to provide your persistent volumes.
There are processes for adding additional nodes to a Kubernetes cluster, and they are well documented within the main
Kubernetes documentation set. Still, it is essential to cover how to ensure that additional nodes have the ability to utilize the
Pure Storage arrays as providers of stateful storage.
When it comes to ensuring your new node can access stateful storage on Pure Storage devices, it is good to note that, as
we are using a DaemonSet to ensure that our plugin is correctly installed on cluster nodes, the addition of a new cluster
node to your Kubernetes cluster will cause the DaemonSet to create a new pure-csi pod on the new node and install the
plugin correctly.
Conclusion
With the growth of applications and deployments that require a CaaS platform that can also provide an underlying stateful
storage solution, the Pure Storage Kubernetes plugin meets these needs.
Additionally, using Pure Storage products to provide stateful storage also enables storage that is enterprise-ready,
redundant, fast, resilient, and scalable.
REFERENCE ARCHITECTURE
26
Appendix: Application Examples
Here we show two simple application deployments that can be used in a Kubernetes environment that would also require
stateful storage.
MongoDB
Running MongoDB in a HA configuration is a good example of how, using pre-existing Helm charts, you can easily deploy a
production ready application seamlessly using Kubernetes StatefulSets with Pure Service Orchestrator providing the
persistent storage from a Pure Storage FlashArray.
Here we are going to us a ‘stable’ Helm chart to create a MongoDB deployment using ReplicaSets. The database will be
deployed with a primary and two secondary pods, each having their own persistent volume. To show that MongoDB
replication is working we will add some data into the database on the primary and then read it from one of the secondaries.
Notice that we only need to supply the name of the storageClass to the Helm configuration because the statefulSet comes
with a template for creating new PVCs as the deployment scales.
# kubectl create namespace caas-mongo
# helm install --set "auth.adminUser=admin, auth.adminPassword=password, persistentVolume.storageClass=pure"
stable/mongodb-replicaset
NAME: caas-mongo
LAST DEPLOYED: Mon Aug 6 07:37:31 2018
NAMESPACE: default
STATUS: DEPLOYED
RESOURCES:
==> v1/ConfigMap
NAME DATA AGE
caas-mongo-mongodb-replicaset-init 1 0s
caas-mongo-mongodb-replicaset-mongodb 1 0s
caas-mongo-mongodb-replicaset-tests 1 0s
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
caas-mongo-mongodb-replicaset ClusterIP None <none> 27017/TCP 0s
==> v1beta2/StatefulSet
NAME DESIRED CURRENT AGE
caas-mongo-mongodb-replicaset 3 1 0s
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
caas-mongo-mongodb-replicaset-0 0/1 Init:0/3 0 0s # kubectl get pvc NAME STATUS VOLUME CAPACITY
ACCESS MODES STORAGECLASS AGE
datadir-caas-mongo-mongodb-replicaset-0 Bound pvc-3ff67917-9986-11e8-9b47-0025b5c0807f 10Gi
RWO pure 47m
datadir-caas-mongo-mongodb-replicaset-1 Bound pvc-5ca21b0f-9986-11e8-9b47-0025b5c0807f 10Gi
RWO pure 46m
datadir-caas-mongo-mongodb-replicaset-2 Bound pvc-770a92f6-9986-11e8-9b47-0025b5c0807f 10Gi
RWO pure 45m
REFERENCE ARCHITECTURE
27
Figure 6. MongoDB Stateful Sets
Figure 7. MongoDB Persistent Volume Claims
Here we’ll access the Primary server for the MongoDB deployment and add some simple entries into the database.
# kubectl exec -it caas-mongo-mongodb-replicaset-0 -- mongo --host caas-mongo-mongodb-replicaset
MongoDB shell version v3.6.6
connecting to: mongodb://caas-mongo-mongodb-replicaset:27017/
MongoDB server version: 3.6.6
Welcome to the MongoDB shell.
rs0:PRIMARY> db.products.insert({manufactirer:'Pure Storage', product:'FlashArray'})
WriteResult({ "nInserted" : 1 })
rs0:PRIMARY> db.products.insert({manufactirer:'Pure Storage', product:'FlashBlade'})
WriteResult({ "nInserted" : 1 })
rs0:PRIMARY> db.products.find()
{ "_id" : ObjectId("5b68602213827b834aa4a62e"), "manufactirer" : "Pure Storage", "product" : "FlashArray" }
{ "_id" : ObjectId("5b68602a13827b834aa4a62f"), "manufactirer" : "Pure Storage", "product" : "FlashBlade" }
rs0:PRIMARY> exit
Now, let's interrogate one of the secondary nodes to ensure the data has been correctly replicated.
# kubectl exec -it caas-mongo-mongodb-replicaset-1 -- mongo --eval="rs.slaveOk();
db.products.find().forEach(printjson)"
MongoDB shell version v3.6.6
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 3.6.6
{
"_id" : ObjectId("5b68602213827b834aa4a62e"),
"manufactirer" : "Pure Storage",
"product" : "FlashArray"
}
{
REFERENCE ARCHITECTURE
28
"_id" : ObjectId("5b68602a13827b834aa4a62f"),
"manufactirer" : "Pure Storage",
"product" : "FlashBlade"
}
WordPress
In this example we are using another ‘stable’ Helm chart to deploy WordPress, the content management system. The chart
will be used to deploy a production-ready configuration with three WordPress pods, as well as a MariaDB deployment for
the database requirements of the WordPress application.
The MariaDB deployment will use a persistent volume from a FlashArray and the WordPress pods will all use the same
ReadWriteMany persistent volume made available from a FlashBlade.
The production-values.yaml was copied from the Helm chart github and the following simple modifications were made to
ensure the required storage comes from the FlashArray and FlashBlade:
# kubectl create namespace caas-wordpress
# helm install -f ./wordpress-production.yaml stable/wordpress
NAME: caas-wordpress
LAST DEPLOYED: Mon Aug 6 08:50:07 2018
NAMESPACE: default
STATUS: DEPLOYED
RESOURCES:
==> v1beta1/Deployment
NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE
caas-wordpress-wordpress 3 3 3 0 1s
==> v1beta1/StatefulSet
NAME DESIRED CURRENT AGE
caas-wordpress-mariadb 1 1 1s
==> v1beta1/Ingress
NAME HOSTS ADDRESS PORTS AGE
wordpress.local-caas-wordpress wordpress.local 80, 443 1s
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
caas-wordpress-wordpress-5b45fc89c5-dtz8q 0/1 ContainerCreating 0 1s
caas-wordpress-wordpress-5b45fc89c5-f52xq 0/1 ContainerCreating 0 1s
caas-wordpress-wordpress-5b45fc89c5-ntds9 0/1 ContainerCreating 0 1s
caas-wordpress-mariadb-0 0/1 ContainerCreating 0 1s
==> v1/Secret
NAME TYPE DATA AGE
caas-wordpress-mariadb Opaque 2 1s
caas-wordpress-wordpress Opaque 2 1s
REFERENCE ARCHITECTURE
29
==> v1/ConfigMap
NAME DATA AGE
caas-wordpress-mariadb 1 1s
caas-wordpress-mariadb-tests 1 1s
==> v1/PersistentVolumeClaim
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
caas-wordpress-wordpress Pending pure-file 1s
==> v1/Service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
caas-wordpress-mariadb ClusterIP 10.233.9.205 <none> 3306/TCP 1s
caas-wordpress-wordpress ClusterIP 10.233.35.83 <none> 80/TCP,443/TCP 1s
Looking at the Kubernetes dashboard we can see the persistent volumes associated with the deployment
And also the four pods that have been created:
If we examine the MariaDB pod, we will see it is using the persistent volume from the FlashBlade:
And looking at one of the WordPress pods we can see it is using the ReadWriteMany volume from the FlashBlade:
REFERENCE ARCHITECTURE
30
purestorage.com 800.379.PURE
©2020 Pure Storage, the Pure P Logo, and the marks on the Pure Trademark List at https://www.purestorage.com/legal/productenduserinfo.html are trademarks of Pure Storage, Inc. Other names are trademarks of their respective owners. Use of Pure Storage Products and Programs are covered by End User Agreements, IP, and other terms, available at: https://www.purestorage.com/legal/productenduserinfo.html and https://www.purestorage.com/patents
The Pure Storage products and programs described in this documentation are distributed under a license agreement restricting the use, copying, distribution, and decompilation/reverse engineering of the products. No part of this documentation may be reproduced in any form by any means without prior written authorization from Pure Storage, Inc. and its licensors, if any. Pure Storage may make improvements and/or changes in the Pure Storage products and/or the programs described in this documentation at any time without notice.
THIS DOCUMENTATION IS PROVIDED “AS IS” AND ALL EXPRESS OR IMPLIED CONDITIONS, REPRESENTATIONS AND WARRANTIES, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT, ARE DISCLAIMED, EXCEPT TO THE EXTENT THAT SUCH DISCLAIMERS ARE HELD TO BE LEGALLY INVALID. PURE STORAGE SHALL NOT BE LIABLE FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH THE FURNISHING, PERFORMANCE, OR USE OF THIS DOCUMENTATION. THE INFORMATION CONTAINED IN THIS DOCUMENTATION IS SUBJECT TO CHANGE WITHOUT NOTICE.
Pure Storage, Inc. 650 Castro Street, #400 Mountain View, CA 94041
PS1117-02 04/2020
About the Author
As Technical Director of New Stack, Simon is responsible for managing the direction of Pure Storage pertaining to Open
Source technologies including OpenStack, containers, and associated orchestration and automation toolsets. His
responsibilities also include developing best practices, reference architectures, and configuration guides.
With over 30 years of storage experience across all aspects of the discipline, from administration to architectural design,
Simon has worked with all major storage vendors’ technologies and organisations, large and small, across Europe and the
USA as both customer and service provider.
Read Simon’s blog.