


SOLUTION WHITEPAPER

FUNGIBLE STORAGE CLUSTER DELIVERS BETTER THAN DAS PERFORMANCE FOR CASSANDRA

Thinking of disaggregating storage but worried about impacting performance? The Fungible Storage Cluster serves as a quintessential hyperdisaggregated shared storage platform for Apache Cassandra, delivering all the benefits of disaggregated storage at better than DAS performance.

CASSANDRA DATABASE OVERVIEW

Cassandra is a free and open-source, distributed, wide-column NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple datacenters, with asynchronous masterless replication allowing low-latency operations for all clients. In Cassandra, all nodes are equal; there is no concept of a master node, and all nodes communicate with each other via a gossip protocol.

One of the key design features of Cassandra is the ability to scale incrementally, which requires the ability to dynamically partition data over the set of nodes in the cluster. Cassandra partitions data across the cluster using consistent hashing.
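
As a quick illustration of this distribution, once a cluster is up, each node's share of the token ring can be inspected from any host with nodetool (a minimal sketch; the keyspace name is whatever your application uses, for instance the ycsb keyspace created later in this paper):

    # Show each node's state and its effective ownership of the token ring
    nodetool status ycsb

    # List individual tokens and the node that owns each range
    nodetool ring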

Cassandra provides automatic data distribution across all nodes that participate in a “ring”, or database cluster. There is nothing a developer or administrator needs to code or configure to distribute data across a cluster. Cassandra uses replication to achieve high availability and durability: each data item is replicated at N hosts, where N is the replication factor configured per keyspace. Cassandra gives the client various options for how data is replicated, offering replication policies such as “Rack Unaware”, “Rack Aware” (within a datacenter) and “Datacenter Aware”; replicas are chosen based on the policy the application selects.
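
For example, the datacenter-aware policy corresponds to Cassandra's NetworkTopologyStrategy replication class. A generic sketch (the keyspace and datacenter names are placeholders, not from this benchmark) looks like this:

    # Keep 3 replicas in dc1 and 2 replicas in dc2 (names are placeholders)
    cqlsh -e "CREATE KEYSPACE demo WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 2};"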

Cassandra is in use at some of the world’s largest enterprises, such as Apple, eBay, Comcast, Microsoft, Netflix, Walmart and Uber, and is used by 40% of the Fortune 100. (Source: https://cassandra.apache.org/)

BENCHMARK TOPOLOGY

To demonstrate the performance of a 4-node Cassandra cluster on the Fungible Storage Cluster (FSC), we set up a 4-node Cassandra cluster attached to the FSC via one 100GbE Mellanox ConnectX-5 network card on each node running NVMe/TCP. A separate server acts as the client running the Yahoo! Cloud Serving Benchmark (YCSB) tool. The same type and model of NVMe SSD is used in the FSC and as locally attached storage to ensure a fair comparison.

Figure 1 below shows the test topology for the two performance tests.

[Diagram: four Cassandra nodes (Node1-Node4) attached over the NVMe-oF data network to the FSC (6 x 100GbE), with the Fungible Composer on the LAN management network, shown alongside the same four nodes configured with DAS]

Figure 1: YCSB Benchmark Test Topology

Cassandra Node Components   Quantity / Description
Server Type                 4 x Supermicro
Memory                      256GB per server
CPU                         2 x 20-core Intel(R) Xeon(R) Gold 6138 @ 2.00GHz per server
Network Card                1 x Mellanox ConnectX-5 100GbE per server
                            1 x 10GbE per server
Direct Attach Storage       1 x SATA SSD (boot disk) per server
                            1 x NVMe SSD (Cassandra DB) per server
Disaggregated Storage       1 x FSC NVMe volume (Cassandra DB) per server

Table 1: Cassandra Node Components

BENCHMARK RESOURCES

The tables below list all the hardware and software used for the benchmark.

YCSB Client Components      Quantity / Description
Server Type                 1 x Supermicro
Memory                      256GB
CPU                         2 x 20-core Intel(R) Xeon(R) Gold 6138 @ 2.00GHz
Network Card                1 x 1GbE (management network)
                            1 x 10GbE (test client network)
Direct Attach Storage       1 x SATA SSD (boot disk)

Table 2: YCSB Client Components

Software                    Description
Operating System            CentOS 8.2
Linux Kernel                4.18.0-193.6.3.el8_2.x86_64
Cassandra Version           3.11
YCSB Version                0.17.0

Table 3: Software Resources

Network Component           Purpose / Model
Network Switches            NVMe-oF data network / Juniper
                            Management network (LAN) / Cisco

Table 4: Network Components


FSC Component               Description
Fungible FSC                2 x FS1600 nodes
FS1600 Network Ports        6 x 100GbE
Fungible Composer           Control plane

Table 5: Fungible FSC Components


BENCHMARK METHODOLOGY

For the FSC test, we created a total of 4 durable volumes and attached one volume to each of the Cassandra nodes over NVMe-oF with TCP via the 100GbE Mellanox ConnectX-5 network card. We formatted each volume with the XFS filesystem and mounted it at /var/lib/cassandra. We created a keyspace named “ycsb” with a replication factor of 3 (RF3) using the SimpleStrategy replication method. We then created a table named “usertable” without any Cassandra compression method because we wanted to leverage the compression on the FSC.
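
On each Cassandra node, the host-side attach, format and mount steps might look like the sketch below. The transport address, port and subsystem NQN are placeholders, and the volumes themselves are provisioned through the Fungible Composer, whose commands are not shown here:

    # Load the NVMe/TCP host driver
    modprobe nvme-tcp

    # Attach the durable volume exported by the FSC (address and NQN are placeholders)
    nvme connect -t tcp -a 203.0.113.10 -s 4420 -n nqn.2020-01.com.example:cassandra-vol1

    # Format the new namespace with XFS and mount it at Cassandra's data directory
    mkfs.xfs /dev/nvme0n1
    mount /dev/nvme0n1 /var/lib/cassandra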

The two tables below show the commands used to create the keyspace and the table for the FSC test.

CREATE KEYSPACE ycsb WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;

Table 6: Create Cassandra Keyspace
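
One way to apply this statement, and the table definitions that follow, is through cqlsh from any node (a sketch; the host name and file name are placeholders):

    # Execute a single statement directly
    cqlsh node1 -e "CREATE KEYSPACE ycsb WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;"

    # Or execute statements kept in a file
    cqlsh node1 -f create_ycsb.cql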

CREATE TABLE ycsb.usertable (
    y_id text PRIMARY KEY,
    field0 text, field1 text, field2 text, field3 text, field4 text,
    field5 text, field6 text, field7 text, field8 text, field9 text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'enabled': 'false'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

Table 7: Create Cassandra Table without Compression for the FSC test

After creating the keyspace and the non-compressed table on the FSC storage, we used the client machine to load the database with 128,000,000 records of 4KB each. After the data load completed, we cleared the filesystem cache on all nodes to avoid any caching effects and started the YCSB workload B (95% read / 5% update) test. We ran the test with 64, 96, 128 and 256 threads and recorded the transactions per second (TPS) as well as the average and tail (99th percentile) latencies for reads and updates. (An illustrative YCSB invocation is sketched after Table 9 below.)

For the direct attached storage (DAS) test, we formatted the local NVMe SSD with XFS and mounted it at /var/lib/cassandra. We used the same command to create the keyspace with RF3 and the SimpleStrategy replication method as we did with the FSC storage. For the table creation, however, we used Cassandra's Snappy and LZ4 compression methods, in order to compare Cassandra's software table compression against the FSC's hardware compression.

The following two tables show the commands that were used to create the Cassandra table with Snappy and LZ4 compression for the DAS test case.

CREATE TABLE ycsb.usertable (
    y_id text PRIMARY KEY,
    field0 text, field1 text, field2 text, field3 text, field4 text,
    field5 text, field6 text, field7 text, field8 text, field9 text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.SnappyCompressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

Table 8: Create Cassandra Table with Snappy Compression for the DAS test

CREATE TABLE ycsb.usertable (
    y_id text PRIMARY KEY,
    field0 text, field1 text, field2 text, field3 text, field4 text,
    field5 text, field6 text, field7 text, field8 text, field9 text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

Table 9: Create Cassandra Table with LZ4 Compression for the DAS test
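
For reference, the load and run phases described above might be invoked as sketched below. The host list, field sizing (YCSB's default 10 fields stretched to roughly 4KB per record) and operation count are assumptions rather than the exact parameters used for this benchmark:

    # Load phase: insert 128,000,000 records of roughly 4KB each
    # (fieldcount/fieldlength are assumptions chosen to reach ~4KB records)
    bin/ycsb load cassandra-cql -P workloads/workloadb -s \
        -p hosts=node1,node2,node3,node4 \
        -p recordcount=128000000 \
        -p fieldcount=10 -p fieldlength=400 \
        -threads 64

    # Clear the filesystem cache on every node before the measured run
    sync; echo 3 > /proc/sys/vm/drop_caches

    # Run phase: workload B (95% read / 5% update); repeat with 64, 96, 128 and 256 threads
    bin/ycsb run cassandra-cql -P workloads/workloadb -s \
        -p hosts=node1,node2,node3,node4 \
        -p recordcount=128000000 -p operationcount=10000000 \
        -threads 64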


BENCHMARK RESULTS

The graphs below show the results for a 4-node Cassandra cluster running on the Fungible Storage Cluster and on DAS. Figure 2 shows the transactions per second (TPS), while Figures 3-6 show the read and update latency results.

[Chart: CASSANDRA 95/5 WORKLOAD – TRANSACTIONS PER SECOND (measured test results); TPS (0 to 70,000) vs. number of YCSB threads (64, 96, 128, 256) for DAS-Snappy, DAS-LZ4 and FSC-RF2]

Figure 2: 4-node Cassandra Cluster – Transactions Per Second (TPS)

[Chart: CASSANDRA 95/5 WORKLOAD – READ AVERAGE LATENCY (us) (measured test results); latency (0 to 6,000 us) vs. number of YCSB threads for the same three configurations]

Figure 3: 4-node Cassandra Cluster – Read Average Latency (us)

[Chart: CASSANDRA 95/5 WORKLOAD – READ 99% LATENCY (us) (measured test results); latency (0 to 25,000 us) vs. number of YCSB threads for the same three configurations]

Figure 4: 4-node Cassandra Cluster – Read Tail Latency (us)

[Chart: CASSANDRA 95/5 WORKLOAD – UPDATE AVERAGE LATENCY (us) (measured test results); latency (0 to 2,000 us) vs. number of YCSB threads for the same three configurations]

Figure 5: 4-node Cassandra Cluster – Update Average Latency (us)

[Chart: CASSANDRA 95/5 WORKLOAD – UPDATE 99% LATENCY (us) (measured test results); latency (0 to 18,000 us) vs. number of YCSB threads for the same three configurations]

Figure 6: 4-node Cassandra Cluster – Update Tail Latency (us)


KEY TAKEAWAYS

The Fungible FSC provides a high-performance, simple-to-use NVMe-oF storage solution for any enterprise running many NoSQL databases or RDBMSs. This solution eliminates the management of locally attached storage, which can be a nightmare when you must scale storage or recover from a disk failure.

It is also a perfect storage solution for a private cloud. Imagine that you want to set up multiple Cassandra clusters for various business use cases by leveraging virtual machines (VMs), with the ability to live-migrate the VMs between KVM hosts. Such an architecture is impossible to deploy with DAS, but not with the FSC. The FSC provides the flexibility of scaling out compute and storage independently, and also offers features such as snapshots and clones that can be utilized in your database environment.

Figure 7 shows a sample of multiple Cassandra nodes within a single Cassandra cluster or multiple Cassandra clusters.

CONCLUSION

We have demonstrated that running the Cassandra database on the Fungible Storage Cluster gives better overall performance in terms of transactions per second (TPS), average latency and tail latency compared to direct attached storage (DAS). These results are achieved because the Fungible Storage Cluster software optimizes data placement on the SSDs, and because software compression is replaced by hardware compression. Though not shown in this whitepaper, the FSC hardware compression is also superior to the table compression done by Cassandra and leads to additional space savings. In addition, all SSDs wear out as they are used, especially in a database environment where updates are frequent; running on DAS will increase SSD wear compared to pooled SSDs on the FSC.

Customers no longer have to compromise when deploying disaggregated storage for Cassandra: with the Fungible Storage Cluster they get all the benefits of storage hyperdisaggregation combined with higher performance and lower latency!

[Diagram: multiple Cassandra servers attached to the Fungible Storage Cluster over the NVMe-oF data network, with the Fungible Composer on the LAN management network]

Figure 7: Sample Configuration with Multiple Servers

ABOUT FUNGIBLE

Silicon Valley-based Fungible is reimagining the performance, economics, reliability, security and agility of today’s data centers.

CONTACT US
[email protected]

Fungible, Inc.
3201 Scott Blvd., Santa Clara, CA 95054, USA
669-292-5522

www.fungible.com

Fungible, the Fungible logo, and all other Fungible products and services names mentioned herein are trademarks or registered trademarks of Fungible, Inc. in the United States and other countries. Other trademarks mentioned herein are the property of their respective owners. © 2020 Fungible, Inc. All rights reserved.

WP0039.00.02021026