Supercharging Cassandra... Tom Wilkie Founder & VP Engineering @tom_wilkie

Supercharging Cassandra - GOTO Amsterdam

Page 1: Supercharging Cassandra - GOTO Amsterdam

Supercharging Cassandra...

Tom Wilkie, Founder & VP Engineering

@tom_wilkie

Page 2: Supercharging Cassandra - GOTO Amsterdam

Before the Flood

Old hardware

1990

BTree File systems

RAID

Small databases

BTree indexes

Page 3: Supercharging Cassandra - GOTO Amsterdam

Two Revolutions

2010

Distributed, shared-nothing databases

On each node: write-optimised indexes, BTree file systems, RAID, new hardware

...

Page 4: Supercharging Cassandra - GOTO Amsterdam

Bridging the Gap

2011

Distributed, shared-nothing databases

On each node: Castle, new hardware

...

Page 5: Supercharging Cassandra - GOTO Amsterdam

Big Data Applications

Memcached

...

Acunu Storage Core

Open API

Management

Deployment

Monitoring

Cross-Cluster Management UI

Page 6: Supercharging Cassandra - GOTO Amsterdam

1. Predictability

Page 7: Supercharging Cassandra - GOTO Amsterdam
Page 8: Supercharging Cassandra - GOTO Amsterdam

Small random inserts: inserting 3 billion rows

[Plot legend: Acunu-powered Cassandra vs ‘standard’ Cassandra]

Page 9: Supercharging Cassandra - GOTO Amsterdam

Insert latency: while inserting 3 billion rows

[Plot legend: Acunu-powered Cassandra (x) vs ‘standard’ Cassandra (+)]

Page 10: Supercharging Cassandra - GOTO Amsterdam

Small random range queries: performed immediately after inserts

[Plot legend: Acunu-powered Cassandra vs ‘standard’ Cassandra]

Page 11: Supercharging Cassandra - GOTO Amsterdam

Performance summary

                              Standard   Acunu     Benefits
inserts        rate           ~32k/s     ~45k/s    >1.4x
               95% latency    ~32s       ~0.3s     >100x
gets           rate           ~100/s     ~350/s    >3.5x
               95% latency    ~2s        ~0.5s     >4x
range queries  rate           ~0.4/s     ~40/s     >100x
               95% latency    ~15s       ~2s       >7.5x
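The Benefits figures in the performance summary follow directly from the Standard and Acunu figures. A small sketch that recomputes them; the dictionary layout and the `speedup` helper are illustrative names, and the inputs are the slide's approximate values, so the results are only as precise as those:

```python
# (standard, acunu) pairs copied from the slide.
# Rates are in operations per second, latencies (95th percentile) in seconds.
rates = {"inserts": (32_000, 45_000), "gets": (100, 350), "range queries": (0.4, 40)}
latencies = {"inserts": (32, 0.3), "gets": (2, 0.5), "range queries": (15, 2)}


def speedup(standard, acunu, higher_is_better):
    """Improvement factor: acunu/standard for rates, standard/acunu for latencies."""
    return acunu / standard if higher_is_better else standard / acunu


for op in rates:
    rate_gain = speedup(*rates[op], higher_is_better=True)
    latency_gain = speedup(*latencies[op], higher_is_better=False)
    print(f"{op}: rate x{rate_gain:.1f}, 95% latency x{latency_gain:.1f}")
```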

Page 12: Supercharging Cassandra - GOTO Amsterdam

Doubling Array

Inserts: 2, then 9

[2] and [9] merge into [2 9]

Buffer arrays in memory until we have > B of them

Page 13: Supercharging Cassandra - GOTO Amsterdam

Doubling Array

Inserts: 11, then 8

[11] and [8] merge into [8 11]

[2 9] and [8 11] merge into [2 8 9 11]

etc...

Similar to log-structured merge trees (LSM), cache-oblivious lookahead arrays (COLA), ...
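The insert sequence on these two slides (2, 9, 11, 8) can be replayed with a toy doubling array. This is a minimal in-memory sketch, not Castle's implementation: the class and method names are made up for illustration, there is no disk layout, no Bloom filter, and duplicate keys are simply kept.

```python
import bisect
from heapq import merge


class DoublingArray:
    """Toy doubling array: level i is either empty or a sorted list of 2**i keys."""

    def __init__(self):
        self.levels = []  # levels[i] is None or a sorted list of length 2**i

    def insert(self, key):
        carry = [key]
        for i, level in enumerate(self.levels):
            if level is None:
                # Free slot: the carried array has exactly 2**i keys and fits here.
                self.levels[i] = carry
                return
            # Slot taken: merge the two equal-sized arrays and carry the result down.
            carry = list(merge(level, carry))
            self.levels[i] = None
        self.levels.append(carry)

    def contains(self, key):
        # One binary search per non-empty level.
        for level in self.levels:
            if level is not None:
                pos = bisect.bisect_left(level, key)
                if pos < len(level) and level[pos] == key:
                    return True
        return False

    def range_query(self, lo, hi):
        """Return every key k with lo <= k <= hi, in sorted order."""
        parts = []
        for level in self.levels:
            if level is not None:
                start = bisect.bisect_left(level, lo)
                end = bisect.bisect_right(level, hi)
                parts.append(level[start:end])
        return list(merge(*parts))


da = DoublingArray()
for k in (2, 9, 11, 8):
    da.insert(k)
print(da.levels)  # [None, None, [2, 8, 9, 11]]
```

Every merge reads and writes whole sorted arrays, which is why the update cost on disk is counted in sequential rather than random IOs.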

Page 15: Supercharging Cassandra - GOTO Amsterdam

B = “block size”, say 8KB at 100 bytes/entry ~= 100 entries

                  Update                         Range Query (Size Z)
B-Tree            O(log_B N) random IOs          O(Z/B) random IOs
Doubling Array    O((log N)/B) sequential IOs    O(Z/B) sequential IOs

Worked example for N = 2^30 entries:

B-Tree: ~ log(2^30) / log 100 = 5 IOs/update; 8KB @ 100MB/s, w/ 8ms seek = 100 IOs/s; 100 / 5 = 20 updates/s

Doubling Array: ~ log(2^30) / 100 = 0.2 IOs/update; 8KB @ 100MB/s = 13k IOs/s; 13k / 0.2 = 65k updates/s
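The back-of-envelope estimate on this slide can be reproduced in a few lines. This is a sketch under stated assumptions: the constants are the slide's rounded figures, the function names are illustrative, and the natural logarithm in the doubling-array formula is an assumption chosen because it matches the slide's ~0.2 (base 2 would give 0.3).

```python
import math

ENTRIES_PER_BLOCK = 100    # B: an 8KB block at ~100 bytes/entry, rounded as on the slide
N = 2 ** 30                # number of entries in the index
SEQ_IOS_PER_SEC = 13_000   # slide figure: 8KB blocks streamed at 100MB/s
RANDOM_IOS_PER_SEC = 100   # slide figure: the same disk paying an 8ms seek per IO


def btree_ios_per_update(n, b):
    # O(log_B N): roughly one random IO per level of the tree.
    return math.log(n) / math.log(b)


def doubling_array_ios_per_update(n, b):
    # O((log N)/B): amortised sequential IOs per update.
    # Natural log is an assumption; it is the base that reproduces the slide's ~0.2.
    return math.log(n) / b


btree_ios = btree_ios_per_update(N, ENTRIES_PER_BLOCK)
da_ios = doubling_array_ios_per_update(N, ENTRIES_PER_BLOCK)
print(f"B-Tree: {btree_ios:.1f} IOs/update, {RANDOM_IOS_PER_SEC / btree_ios:.0f} updates/s")
print(f"Doubling array: {da_ios:.2f} IOs/update, {SEQ_IOS_PER_SEC / da_ios:.0f} updates/s")
```

The unrounded results are about 4.5 and 0.21 IOs per update, which the slide rounds to 5 and 0.2. Either way the gap between the two structures is more than three orders of magnitude in updates per second.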

Page 16: Supercharging Cassandra - GOTO Amsterdam

Acunu Kernel

[Architecture diagram spanning Userspace and the Linux Kernel]

Layers:

• userspace interface
• kernel space interface
• doubling array mapping layer
• modlist btree mapping layer
• block mapping & caching layer
• linux's block & MM layers

Components:

• Shared memory interface: keys, values; shared buffers; async, shared memory ring
• Streaming interface: key insert, key get, buffered value get, buffered value insert, range queries
• In-kernel workloads
• Doubling Arrays: arrays, range queries, key insert, insert queues, Bloom filters
• Arrays: value arrays, btree, key get; arrays management; merges
• btree: range queries, key get, key insert
• Version tree
• Cache: flusher, extent block cache, page cache, prefetcher
• "Extent" layer: extent allocator & mapper, freespace manager
• Block layer
• Memory manager

• Open source (GPLv2, MIT for user libraries)

• http://bitbucket.org/acunu

• Loadable Kernel Module, targeting CentOS’s 2.6.18

• http://www.acunu.com/blogs/andy-twigg/why-acunu-kernel/

Castle

http://goo.gl/gzihe

http://goo.gl/wXNDQ

More

Page 17: Supercharging Cassandra - GOTO Amsterdam

2. Monitoring

Page 18: Supercharging Cassandra - GOTO Amsterdam

jQuery

VisualVM

Page 19: Supercharging Cassandra - GOTO Amsterdam

Munin, Nagios etc.

mx4j: REST-JMX adapter

Page 20: Supercharging Cassandra - GOTO Amsterdam
Page 21: Supercharging Cassandra - GOTO Amsterdam
Page 22: Supercharging Cassandra - GOTO Amsterdam

3. Operations

Page 23: Supercharging Cassandra - GOTO Amsterdam
Page 24: Supercharging Cassandra - GOTO Amsterdam

-bash-3.2$ nodetool
...
Available commands:
  ring - Print informations on the token ring
  join - Join the ring
  info - Print node informations (uptime, load, ...)
  cfstats - Print statistics on column families
  version - Print cassandra version
  tpstats - Print usage statistics of thread pools
  drain - Drain the node (stop accepting writes and flush all column families)
  decommission - Decommission the node
  compactionstats - Print statistics on compactions
  disablegossip - Disable gossip (effectively marking the node dead)
  enablegossip - Reenable gossip
  disablethrift - Disable thrift server
  enablethrift - Reenable thrift server
  netstats [host] - Print network information on provided host (connecting node by default)
  move <new token> - Move node on the token ring to a new token
  removetoken status|force|<token> - Show status of current token removal, force completion of pending removal or remove providen token
  setcompactionthroughput <value_in_mb> - Set the MB/s throughput cap for compaction in the system, or 0 to disable throttling.
  snapshot [keyspaces...] -t [snapshotName] - Take a snapshot of the specified keyspaces using optional name snapshotName
  clearsnapshot [keyspaces...] -t [snapshotName] - Remove snapshots for the specified keyspaces. Either remove all snapshots or remove the snapshots with the given name.
  flush [keyspace] [cfnames] - Flush one or more column family
  repair [keyspace] [cfnames] - Repair one or more column family
  cleanup [keyspace] [cfnames] - Run cleanup on one or more column family
  compact [keyspace] [cfnames] - Force a (major) compaction on one or more column family
  scrub [keyspace] [cfnames] - Scrub (rebuild sstables for) one or more column family
  invalidatekeycache [keyspace] [cfnames] - Invalidate the key cache of one or more column family
  invalidaterowcache [keyspace] [cfnames] - Invalidate the key cache of one or more column family
  getcompactionthreshold <keyspace> <cfname> - Print min and max compaction thresholds for a given column family
  cfhistograms <keyspace> <cfname> - Print statistic histograms for a given column family
  setcachecapacity <keyspace> <cfname> <keycachecapacity> <rowcachecapacity> - Set the key and row cache capacities of a given column family
  setcompactionthreshold <keyspace> <cfname> <minthreshold> <maxthreshold> - Set the min and max compaction thresholds for a given column family
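Of the commands listed, `snapshot` and `clearsnapshot` are the ones the next slides build on. A small sketch of how an operations script might assemble those two command lines from the usage strings in the help text; the helper names, the keyspace `Keyspace1` and the snapshot name `before-upgrade` are hypothetical, and the commands are only built here, not executed:

```python
import shlex


def snapshot_command(keyspaces=(), name=None):
    """Build argv for: nodetool snapshot [keyspaces...] -t [snapshotName]."""
    argv = ["nodetool", "snapshot", *keyspaces]
    if name is not None:
        argv += ["-t", name]
    return argv


def clearsnapshot_command(keyspaces=(), name=None):
    """Build argv for: nodetool clearsnapshot [keyspaces...] -t [snapshotName]."""
    argv = ["nodetool", "clearsnapshot", *keyspaces]
    if name is not None:
        argv += ["-t", name]
    return argv


take = snapshot_command(["Keyspace1"], name="before-upgrade")
drop = clearsnapshot_command(["Keyspace1"], name="before-upgrade")
print(shlex.join(take))
print(shlex.join(drop))
```

On a machine with `nodetool` on the PATH, either list could be passed to `subprocess.run(argv, check=True)`.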

Page 25: Supercharging Cassandra - GOTO Amsterdam

SNAPSHOTS*

* And clones!

Page 26: Supercharging Cassandra - GOTO Amsterdam

[Diagram: version tree with versions v0 to v6]
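The slides do not spell out how the version tree supports snapshots and clones. One common way to get both from a tree of versions, assumed here purely for illustration and not taken from Castle's code, is to let each version store only its own writes and fall back to its ancestors on reads. All names below are hypothetical.

```python
class VersionTree:
    """Toy version tree: a version holds only its own writes and points at its parent."""

    def __init__(self):
        self.parent = {0: None}  # version id -> parent version id
        self.writes = {0: {}}    # version id -> {key: value} written in that version
        self.frozen = set()      # read-only versions (snapshots)
        self._next_id = 1

    def _new_child(self, version):
        child = self._next_id
        self._next_id += 1
        self.parent[child] = version
        self.writes[child] = {}
        return child

    def snapshot(self, version):
        """Freeze `version` and return a writable child that continues from it."""
        self.frozen.add(version)
        return self._new_child(version)

    def clone(self, version):
        """Return a further writable child of an already frozen version."""
        if version not in self.frozen:
            raise ValueError("can only clone a snapshot")
        return self._new_child(version)

    def put(self, version, key, value):
        if version in self.frozen:
            raise ValueError("version is read-only")
        self.writes[version][key] = value

    def get(self, version, key):
        # Walk towards the root until some ancestor has written the key.
        v = version
        while v is not None:
            if key in self.writes[v]:
                return self.writes[v][key]
            v = self.parent[v]
        return None


tree = VersionTree()
tree.put(0, "k", "a")
v1 = tree.snapshot(0)   # v0 is now a read-only snapshot, v1 carries on
tree.put(v1, "k", "b")
v2 = tree.clone(0)      # a second writable branch off the same snapshot
print(tree.get(0, "k"), tree.get(v1, "k"), tree.get(v2, "k"))  # a b a
```

Taking a snapshot or a clone in this scheme copies no data, which is what makes both cheap regardless of how much is stored.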

Page 27: Supercharging Cassandra - GOTO Amsterdam

Rebuild

Page 28: Supercharging Cassandra - GOTO Amsterdam

[Diagram: chunks numbered 1 to 16 placed across the disks]

Disk Layout: RDA (random duplicate allocation)
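The slide gives only the name of the layout. One plausible reading, assumed here for illustration, is that every chunk is written to two distinct disks chosen at random, so that after a disk failure the surviving copies are spread over the other disks and the rebuild can read from them in parallel. The function names and parameters below are hypothetical.

```python
import random
from collections import Counter


def rda_place(num_chunks, num_disks, copies=2, seed=0):
    """Place each chunk on `copies` distinct disks chosen uniformly at random."""
    rng = random.Random(seed)
    return {
        chunk: rng.sample(range(num_disks), copies)
        for chunk in range(1, num_chunks + 1)
    }


def rebuild_sources(placement, failed_disk):
    """Count, per surviving disk, the chunk copies it holds that the failed disk lost."""
    sources = Counter()
    for disks in placement.values():
        if failed_disk in disks:
            for disk in disks:
                if disk != failed_disk:
                    sources[disk] += 1
    return sources


placement = rda_place(num_chunks=16, num_disks=4)
print(placement)
print(rebuild_sources(placement, failed_disk=0))
```

With a mirrored pair, all rebuild reads would come from a single partner disk; with random placement the counter typically shows them shared among the survivors.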

Page 29: Supercharging Cassandra - GOTO Amsterdam

Future

Page 30: Supercharging Cassandra - GOTO Amsterdam

Memcache + Cassandra

[Diagram: on each node, Cassandra and memcache run on top of Castle and the hardware (H/W)]

Cass client: get/insert

memcached: get/put

100k random inserts/sec!

Page 31: Supercharging Cassandra - GOTO Amsterdam

[Diagram: version tree with versions v1, v12, v13, v15, v16 and v24, shown four times]

Page 32: Supercharging Cassandra - GOTO Amsterdam

Beware the “write cliff”...

~device capacity

Page 33: Supercharging Cassandra - GOTO Amsterdam

• Castle: Predictable Performance for Big Data

• Monitoring: distributed, multi-master tools give you an aggregated and summarised view of your cluster

• Snapshots & Clones: addressing real problems with new workloads

• RDA: lightning-fast rebuilds for massive disks

Page 35: Supercharging Cassandra - GOTO Amsterdam