Cassandra: No Moving Parts
Cassandra on Flash Memory
Matt Kennedy (@mattmorefaster)
October 17, 2013
#CassandraEU
Copyright © 2013 Fusion-io, Inc. All rights reserved.

C* Summit EU 2013: Cassandra on Flash: Performance & Efficiency Lessons Learned


DESCRIPTION

Speaker: Matt Kennedy, Solution Architect: Big Data at Fusion-io

YouTube: http://www.youtube.com/watch?v=xu_4TAQlY2U&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=21

Flash memory technology, deployed as server-side PCIe or solid state disks (SSDs), is emerging as a critical tool for performance and efficiency in data centers of all scales. This presentation will discuss how the use of Flash impacts Cassandra deployments in terms of configuration, DRAM requirements and performance expectations. Ideas on leveraging C*'s cutting-edge data-center awareness to blend flash and disk storage nodes for cost and workload efficiency will also be shared. Flash media itself will be examined from a physical perspective to understand endurance issues. Data on write amplification under bulk-load and operational workload conditions will be presented to explain the impact on Flash of C*'s Log-Structured Merge-Tree architecture and the associated compactions. Finally, we will examine strategies to make Cassandra more Flash-aware using both conventional techniques and emerging non-volatile memory (NVM) programming capabilities. Lessons learned from real-world customer deployments will be shared to complete this presentation.


What is this talk about?

▸ Efficiency
• Definition: noun 1. The state or quality of being efficient.

▸ Efficient
• Definition: adjective 1. (especially of a system or machine) achieving maximum productivity with minimum wasted effort or expense.


Flash vs Disk Cost Efficiency

                  Disk      Flash
▸ Capacity        4TB       3TB
▸ IOPS            150       200,000
▸ Cost per IOP    $$$$      ¢¢¢¢
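To make the comparison concrete, here is a small sketch of the arithmetic behind it. It reads the slide's figures as a 4TB disk at 150 IOPS and a 3TB flash device at 200,000 IOPS; that assignment of values to devices is an assumption, and the function names are illustrative only. No prices appear on the slide, so the sketch stays with IOPS density rather than dollars.

```python
import math

# Figures from the slide; which value belongs to which device is an assumption.
DISK = {"capacity_tb": 4, "iops": 150}
FLASH = {"capacity_tb": 3, "iops": 200_000}


def iops_per_tb(device):
    """Return the IOPS delivered per terabyte of capacity."""
    return device["iops"] / device["capacity_tb"]


def disks_to_match_iops(disk, flash):
    """Return how many disks are needed to reach one flash device's IOPS."""
    return math.ceil(flash["iops"] / disk["iops"])


if __name__ == "__main__":
    print(f"disk:  {iops_per_tb(DISK):,.1f} IOPS/TB")
    print(f"flash: {iops_per_tb(FLASH):,.1f} IOPS/TB")
    print(f"disks to match one flash device: {disks_to_match_iops(DISK, FLASH)}")
```

Under these assumptions, matching one flash device on IOPS alone takes well over a thousand spindles, which is why cost per IOP diverges so sharply even when cost per terabyte favors disk.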


What is flash?


NAND Flash Memory


Flash is a persistent memory technology invented by Dr. Fujio Masuoka at Toshiba in 1980.

[Figure: NAND flash cell diagram. Labels: Bit Line, Source Line, Word Line, Control Gate, Float Gate, N/P/N regions]


Consumer Volume Drives Economics


Flash in Servers


Direct Cut Through Architecture

[Figure: legacy approach vs. Fusion direct approach. Diagram labels: App, OS, Host CPU, DRAM, PCIe, SAS, RAID Controller, Data path Controller, Super Capacitors (SC), NAND]

Goal of every I/O operation: to move data to/from DRAM and flash.


Cassandra I/O - Writes

http://www.datastax.com/docs/1.2/dml/about_writes


Cassandra I/O - Reads

http://www.datastax.com/docs/1.2/dml/about_reads


DRAM Dictates Cassandra Scaling

▸Key Design Principle:

▸Working Set < DRAM


Cost of DRAM Modules

[Chart: x-axis: module size (4GB, 8GB, 16GB, 32GB); y-axis: dollars, 0 to 1,600]


When do we scale out?

▸A typical server…

CPU Cores: 32 with HT
Memory: 128 GB

…is your working set > 128GB?


Is there a better way?

▸ With NoSQL Databases, we tend to scale out for DRAM

Combined Resources
CPU Cores: 192
Memory: 768 GB

• Low CPU utilization

• High Utility cost
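The scale-out-for-DRAM effect can be sketched as simple arithmetic. The per-server figures (32 cores, 128 GB) and the combined totals (192 cores, 768 GB) come from the slides; the 768 GB working set used below is a hypothetical input chosen to reproduce those totals, and the function names are illustrative.

```python
import math


def nodes_for_dram(working_set_gb, dram_per_node_gb):
    """Return the node count needed so that total DRAM >= working set."""
    return math.ceil(working_set_gb / dram_per_node_gb)


def combined_resources(nodes, cores_per_node, dram_per_node_gb):
    """Return the cluster-wide CPU and memory that come with that node count."""
    return {
        "cpu_cores": nodes * cores_per_node,
        "memory_gb": nodes * dram_per_node_gb,
    }


if __name__ == "__main__":
    # Hypothetical working set; server specs are from the "typical server" slide.
    nodes = nodes_for_dram(working_set_gb=768, dram_per_node_gb=128)
    print(nodes, combined_resources(nodes, cores_per_node=32, dram_per_node_gb=128))
```

Sizing by memory alone brings 192 cores along whether or not the workload can use them, which is the low CPU utilization the slide points at.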


Flash Offers A New Architectural Choice

[Figure: access latency scale]
Nanoseconds (10^-9): CPU Cache, DRAM
Microseconds (10^-6): Server-based Flash
Milliseconds (10^-3): Disk Drives


How can we use flash in Cassandra?


Four Deployment Options

1. All Flash

2. Data Placement (CASSANDRA-2749)

3. Use Logical Data Centers

4. Cache Layer


Cassandra with All-Flash Storage

Step 1: Mount ioMemory at /var/lib/cassandra
Step 2:


Data Placement

▸ https://issues.apache.org/jira/browse/CASSANDRA-2749
• Thanks Marcus!

▸ Takes advantage of filesystem hierarchy

▸ Use mount points to pin Keyspaces or Column Families to flash:
• /var/lib/cassandra/data/{Keyspace}/{CF}

▸Use flash for high performance needs, disk for capacity needs
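A minimal sketch of how the directory layout above could be used from a provisioning script: build the per-Column-Family path and check whether a filesystem is mounted there. The keyspace and Column Family names are hypothetical, and the helper names are illustrative rather than part of Cassandra.

```python
import os
from pathlib import PurePosixPath

# Data directory layout from the slide: /var/lib/cassandra/data/{Keyspace}/{CF}
DATA_ROOT = PurePosixPath("/var/lib/cassandra/data")


def table_data_dir(keyspace, column_family, data_root=DATA_ROOT):
    """Return the directory where one Column Family's SSTables live."""
    return data_root / keyspace / column_family


def is_pinned(path):
    """Return True if a filesystem (for example, a flash device) is mounted at path."""
    return os.path.ismount(str(path))


if __name__ == "__main__":
    # Hypothetical keyspace and Column Family names.
    hot = table_data_dir("metrics", "events")
    print(hot, "pinned to its own mount:", is_pinned(hot))
```

A Column Family whose directory is its own mount point sits on whatever device backs that mount; everything else falls through to the filesystem holding the parent data directory.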


Data Centers for Storage Control

[Figure: one Cassandra cluster split into three logical data centers, arranged by PERFORMANCE and CAPACITY/NODE (scale labels: HIGH, MEDIUM, LOW, HIGH)]

DC1 (Interactive requests)
DC2 (Hadoop MR Jobs)
DC3 (High density replicas)
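One way to steer replicas onto the three logical data centers is per-data-center replication with NetworkTopologyStrategy. The sketch below only builds the CQL statement as a string. The keyspace name and replica counts are hypothetical, and the data center names must match whatever the cluster's snitch reports.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement with per-data-center replica counts."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    parts += [f"'{dc}': {count}" for dc, count in replicas_per_dc.items()]
    return f"CREATE KEYSPACE {keyspace} WITH replication = {{{', '.join(parts)}}};"


if __name__ == "__main__":
    # Hypothetical layout: replicas on flash-backed DC1, one copy each
    # for the analytics DC2 and the high-density DC3.
    print(create_keyspace_cql("app", {"DC1": 3, "DC2": 1, "DC3": 1}))
```

With a layout like this, interactive clients would connect to DC1 and use a local consistency level, so the slower nodes in DC2 and DC3 stay out of the request path.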


Flash Caching

▸ Use Flash to cache blocks from spinning disk
• Larger, cheaper caches than DRAM
• Helps stabilize performance during compaction

▸ Open-Source & Commercial options:
• Flashcache: Facebook-developed write-through/back/around cache
  ▸ Kernel patch
  ▸ https://github.com/facebook/flashcache/
• bcache: write-through/back/around cache
  ▸ Kernel patch
  ▸ http://bcache.evilpiepirate.org/
• Fusion ioTurbine: write-through, commercially supported
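The difference between the write-through and write-back modes named above can be shown with a toy model. This is an illustration of the two policies only, not how Flashcache, bcache, or ioTurbine are implemented: both tiers are plain dictionaries, there is no eviction, and write-around is left out.

```python
class BlockCache:
    """Toy flash cache in front of a slower backing store (both are dicts here)."""

    def __init__(self, backing, write_back=False):
        self.backing = backing        # stands in for spinning disk
        self.write_back = write_back  # False means write-through
        self.cache = {}               # stands in for the flash tier
        self.dirty = set()            # blocks not yet written to disk

    def read(self, block):
        if block not in self.cache:
            self.cache[block] = self.backing[block]  # miss: fetch from disk
        return self.cache[block]

    def write(self, block, data):
        self.cache[block] = data
        if self.write_back:
            self.dirty.add(block)       # defer the disk write
        else:
            self.backing[block] = data  # write-through: disk updated immediately

    def flush(self):
        """Write all dirty blocks back to the backing store."""
        for block in sorted(self.dirty):
            self.backing[block] = self.cache[block]
        self.dirty.clear()


if __name__ == "__main__":
    disk = {}
    cache = BlockCache(disk, write_back=True)
    cache.write(7, b"row")
    print("on disk before flush:", 7 in disk)
    cache.flush()
    print("on disk after flush:", 7 in disk)
```

Write-back acknowledges a write once it reaches flash and so hides disk latency, but the cache then holds the only current copy until the flush. Write-through keeps disk authoritative at the cost of disk-speed writes.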


The Numbers


YCSB Testing Setup

[Figure: test cluster diagram with node groups labeled x4 and x1 and a YCSB Load Generator]

10GB, 16-cores, 24GB DRAM

Workloads use uniform random key selection instead of Zipfian.

150 million 1KB records, RF=3: ~120GB SSTables/node
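As a rough cross-check of the per-node figure, the raw payload can be estimated from the record count, record size, and replication factor. The sketch assumes four Cassandra nodes (reading the "x4" label as the node count) and 1KB = 1,000 bytes; both are assumptions, and the function name is illustrative.

```python
def raw_data_per_node_gb(records, record_bytes, replication_factor, nodes):
    """Estimate raw payload stored per node, in decimal gigabytes."""
    total_bytes = records * record_bytes * replication_factor
    return total_bytes / nodes / 1e9


if __name__ == "__main__":
    # 150 million 1KB records at RF=3, spread over an assumed 4 nodes.
    print(raw_data_per_node_gb(150_000_000, 1_000, 3, 4), "GB/node")
```

The estimate lands a little under the ~120GB of SSTables per node quoted on the slide. The difference would be consistent with per-row and per-column storage overhead, which this sketch does not model.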


50/50 R/W Uniform distribution 10hrs

[Chart: YCSB mixed ops/sec over the run; y-axis: 0 to 70,000 mixed ops/sec]

Update Latency
Average: 511 µs
95th Pctl: 1 ms
99th Pctl: 2 ms

Read Latency
Average: 7.0 ms
95th Pctl: 18 ms
99th Pctl: 42 ms


95/5 R/W Uniform distribution

[Chart: YCSB mixed ops/sec over the run for 75, 200, and 300 threads; y-axis: 0 to 80,000]

# threads   Avg Lat.       95th pctl   99th pctl
75          1.4/0.22 ms    2/0 ms      5/0 ms
200         3.1/0.19 ms    7/0 ms      13/0 ms
300         4.4/2.2 ms     11/0 ms     19/0 ms
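The thread counts and average latencies can be related to throughput with Little's law (concurrency = throughput × latency). The sketch below assumes each latency pair is read/update, that the operation mix is exactly 95% reads and 5% updates, and that every client thread issues its next operation immediately. These are assumptions for illustration, not statements from the slides.

```python
# Average latencies from the table, read as (read_ms, update_ms) per thread count.
RUNS = {75: (1.4, 0.22), 200: (3.1, 0.19), 300: (4.4, 2.2)}


def mean_latency_ms(read_ms, update_ms, read_fraction=0.95):
    """Return the mix-weighted average latency of one operation."""
    return read_fraction * read_ms + (1 - read_fraction) * update_ms


def closed_loop_throughput(threads, latency_ms):
    """Little's law for a closed loop: ops/sec = threads / latency in seconds."""
    return threads / (latency_ms / 1000.0)


if __name__ == "__main__":
    for threads, (read_ms, update_ms) in RUNS.items():
        latency = mean_latency_ms(read_ms, update_ms)
        print(f"{threads} threads: ~{closed_loop_throughput(threads, latency):,.0f} ops/sec")
```

Under these assumptions the estimates come out at roughly 56,000, 68,000, and 70,000 ops/sec, which falls inside the chart's 0 to 80,000 axis. The diminishing return from 200 to 300 threads suggests that added concurrency is mostly turning into latency rather than throughput.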


Consolidation


Real-World Cassandra on Fusion

• 3-4x consolidation factor
• 3-6x reduction in latency
• 2.2x ROI

Efficiency: Performance or Consolidation?

[Figure: Memory/Disk cluster vs. ioMemory cluster, drawn as groups of nodes labeled x4]

Cassandra @ ~100,000 ops/sec (mixed workload)

http://www.fusionio.com/white-papers/accelerate-cassandra-without-the-cluster-crawl/


Thank You

fusionio.com | SAME PLANET. DIFFERENT WORLD.

@mattmorefaster


Cassandra: ioDrive2 vs 10 disk RAID-0


50/50 R/W Uniform distribution

[Chart: YCSB mixed ops/sec over the run; y-axis: 0 to 120,000 mixed ops/sec]

Update Latency
Average: 311 µs
95th Pctl: 0 ms
99th Pctl: 1 ms

Read Latency
Average: 8.2 ms
95th Pctl: 20 ms
99th Pctl: 62 ms


YCSB: Bulk Load (CL=ALL)

[Chart: YCSB inserts/sec over the load; y-axis: 0 to 70,000 inserts/sec]

Avg Latency: 0.9 ms
95th Percentile: 1 ms
99th Percentile: 4 ms