Practical Cassandra Vitalii Tymchyshyn [email protected] @tivv00 NoSQL key-value vs RDBMS – why and when Cassandra architecture Cassandra data model Life without joins or HDD space is cheap today Hardware requirements & deployment hints

Practical Cassandra

DESCRIPTION

When and why to move from an RDBMS to Cassandra, and its quirks and limitations

Page 1: Practical Cassandra

Practical Cassandra

Vitalii Tymchyshyn [email protected]

@tivv00

NoSQL key-value vs RDBMS – why and when

Cassandra architecture

Cassandra data model

Life without joins or HDD space is cheap today

Hardware requirements & deployment hints

Page 2: Practical Cassandra

RDBMS problems

Sometimes you reach the point where a single server can't cope

Relational Replication

Not write scalable

Data is not instantly visible

Sharding

No foreign keys or joins

No transactions

Reduced reliability (multiple servers)

Schema update is a pain

Page 3: Practical Cassandra

Cassandra NoSQL

Master-Master Replication + Sharding in one bottle

Peer-to-peer architecture (no SPOF)

Easy cluster reconfiguration

Eventual consistency as a standard

All data in one record – no need to join

Flexible schema

Page 4: Practical Cassandra

Our data

We have an intelligent Internet cache

Intelligent means we don't cache everything or we would need Google's DC

It's still hundreds of millions of sites

And 10s of TB of packed data

Randomly updated

Analysis must be able to process all of this within hours

Page 5: Practical Cassandra

Cassandra ring

- server

- client

Page 6: Practical Cassandra

Ring partitioner types

Order Preserving

Each server serves key range

Range queries possible

Read/Write/Disk space hot spots possible

Complex to fix key range

Random

Data is smoothly distributed on servers

No range queries

No hot spots

Fixed key range
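The difference between the two partitioner types can be sketched in a few lines. This is a toy model, not Cassandra's code: the four-node rings, the single-letter tokens and the helper names `random_token` and `owner` are assumptions made for illustration, and MD5 is used only as a convenient hash.

```python
import hashlib
from bisect import bisect_left

def random_token(key):
    """Hash the key so ring position has nothing to do with key order."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

def owner(token, ring):
    """ring is a sorted list of (token, node); a node owns tokens up to its own."""
    tokens = [t for t, _ in ring]
    return ring[bisect_left(tokens, token) % len(ring)][1]

# Order preserving: the key itself is the token, so neighbouring keys stay together.
ordered_ring = [("g", "node1"), ("n", "node2"), ("t", "node3"), ("z", "node4")]

# Random: tokens are evenly spaced over the 128-bit hash space (hypothetical layout).
hashed_ring = [((i + 1) * 2**128 // 4 - 1, "node%d" % (i + 1)) for i in range(4)]

keys = ["apple", "apricot", "avocado"]
print([owner(k, ordered_ring) for k in keys])               # all on node1: a hot spot
print([owner(random_token(k), hashed_ring) for k in keys])  # placement ignores key order
```

With the ordered ring a range query for keys starting with "a" touches one node, but so does every write to those keys. With the hashed ring the load spreads out, and the key range is lost.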

Page 7: Practical Cassandra

Runtime CAP-solving

The whole thing is about replication

CAP: Consistency, Availability, Partition tolerance – choose two.

With Cassandra you can choose at runtime.

Page 8: Practical Cassandra

Runtime CAP-solving

Quorum read/write

Fast writes

Fast reads

Fast, less consistency
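One way to read the four options: with N replicas, a read is guaranteed to see the latest acknowledged write only if the replicas written (W) and the replicas read (R) must overlap, that is W + R > N. The sketch below assumes N = 3 and maps the four options to W/R pairs; that mapping is an interpretation of the slide, and the function names are hypothetical.

```python
def quorum(replicas):
    """Smallest majority of the replica set."""
    return replicas // 2 + 1

def read_sees_latest_write(replicas, written, read):
    """True when every read set must share at least one node with every write set."""
    return written + read > replicas

N = 3
options = {
    "Quorum read/write":      (quorum(N), quorum(N)),  # W=2, R=2
    "Fast writes":            (1, N),                  # W=1, R=3
    "Fast reads":             (N, 1),                  # W=3, R=1
    "Fast, less consistency": (1, 1),                  # W=1, R=1
}
for name, (w, r) in options.items():
    print(name, "W=%d R=%d" % (w, r), "overlap:", read_sees_latest_write(N, w, r))
```

Only the last option gives up the overlap, which is the price for making both reads and writes wait for a single replica.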

Page 9: Practical Cassandra

Data model

Keyspaces – much like database in RDBMS

Column Families – storage element, like tables in RDBMS

Columns – you can have millions per row, names are flexible, still like columns in RDBMS

Super Column – A column that has structured content, superseded by composite columns

Page 10: Practical Cassandra

Twitter DB

Example

Users table: ID, Name, Birthday

Tweets table: UserID, TweetID, TweetContent

Twitter Keyspace

Users CF. Key: User ID. Columns: Name(Str), Birthday(Str)

Timeline CF. Key: User ID. Columns: <TweetID>(TweetContent)

Page 11: Practical Cassandra

Twitter DB

Example (alternative)

Users table: ID, Name, Birthday

Tweets table: UserID, TweetID, TweetContent

Twitter Keyspace

Data CF. Key: User ID. Columns: Name(Str), Birthday(Str), <TweetID>(TweetContent)

Page 12: Practical Cassandra

Example (data)

Users table

ID  Name
1   Tom
2   John

Tweets table

User  ID  Text
1     1   Hello
1     2   See me?
2     3   See you!

Data CF

Key  Data
1    Name = Tom, T_1 = Hello, T_2 = See me?
2    Name = John, T_3 = See you!

Page 13: Practical Cassandra

Data model

You can have the same key in multiple column families

You can have a different set of columns for different keys in the same column family

You can query a range of columns for a key (columns are sorted) with pagination

You can have columns without values (and it's useful)
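A row can be pictured as a map of column names kept in sorted order, which is what makes column ranges and pagination cheap. The sketch below uses a plain dict and a hypothetical `slice_columns` helper; it illustrates the idea and is not a client API.

```python
from bisect import bisect_right

def slice_columns(row, after="", limit=2):
    """Return up to `limit` (name, value) pairs whose names sort after `after`."""
    names = sorted(row)                      # columns are kept sorted by name
    start = bisect_right(names, after)       # `after` itself is excluded
    return [(n, row[n]) for n in names[start:start + limit]]

# Row for user 1 from the example data.
row = {"Name": "Tom", "T_1": "Hello", "T_2": "See me?"}

page1 = slice_columns(row, after="T_", limit=1)           # first tweet column
page2 = slice_columns(row, after=page1[-1][0], limit=1)   # continue after the last name seen
print(page1, page2)

# Columns without values: the column name alone carries the information.
followers = {"alice": "", "bob": ""}
print(sorted(followers))
```

Pagination here means remembering the last column name of the previous page and asking for the columns after it.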

Page 14: Practical Cassandra

ACID vs BASE

Super Heroes are good, but not scalable. So, what do we lose?

Page 15: Practical Cassandra

No Atomicity

You've got no transactions – no rollback

The maximum you have is atomic update to single row

A failed operation MAY still be applied (that's why counters are not reliable)

Page 16: Practical Cassandra

Eventual Consistency

Cassandra has no central governor

This means no bottleneck

This also means no one knows whether the database as a whole is consistent

Regular repair is your friend!

Page 17: Practical Cassandra

No Isolation

All mutations are timestamped to restore order from chaotic arrival

You MUST have your clock synchronized

That's how operations are applied on the server :)
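Ordering by timestamp can be sketched as "the newest timestamp wins", so that replicas end up with the same value whatever order the mutations arrive in. The `apply_mutation` helper and the tuple layout are assumptions for illustration, and ties between equal timestamps are ignored here.

```python
def apply_mutation(row, column, value, timestamp):
    """Keep the value with the newest timestamp, whatever the arrival order."""
    current = row.get(column)
    if current is None or timestamp > current[1]:
        row[column] = (value, timestamp)

in_order, reversed_order = {}, {}

# Replica that sees the writes in the order they were issued.
apply_mutation(in_order, "Name", "Tom", 100)
apply_mutation(in_order, "Name", "Thomas", 200)

# Replica that receives the same writes the other way round.
apply_mutation(reversed_order, "Name", "Thomas", 200)
apply_mutation(reversed_order, "Name", "Tom", 100)

print(in_order == reversed_order)
```

The same rule explains the clock requirement: a writer whose clock runs behind produces timestamps that lose to older data.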

Page 18: Practical Cassandra

Controlled Durability

Cassandra uses a transaction log to ensure durability on a single server

Durability of the whole database depends on both the total number of replicas and how many replicas each write operation must reach

Remember: 99% uptime for a single server means only 36.6% (0.99^100) "full cluster working" uptime for 100 servers. Most of the time you've got at least one server down!
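The uptime figure is plain arithmetic, assuming server failures are independent of each other (an assumption of this sketch; correlated failures make things worse or better depending on the cause).

```python
def all_up_probability(single_uptime, servers):
    """Probability that every one of `servers` independent servers is up at once."""
    return single_uptime ** servers

p = all_up_probability(0.99, 100)
print("all 100 up: %.1f%%" % (p * 100))              # about 36.6%
print("at least one down: %.1f%%" % ((1 - p) * 100))
```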

Page 19: Practical Cassandra

Data querying

With SQL you simply ask.

You can easily scan the whole DB

Indexes may help

Any calculation is repeated each time

This can be slow on read

Page 20: Practical Cassandra

Data querying

With NoSQL you can't efficiently scan the whole db

No “group by” or “order by”

You must prepare your data beforehand

You have multiple copies of data

You must recalculate on application logic change

The precalculated reads are fast

Page 21: Practical Cassandra

Think about your queries in advance!

There is no “I'll simply add an index, some hints and my query will become fast”

Any index is created and maintained from application code

Cassandra now has secondary indexes, but they are much inferior to custom ones

Page 22: Practical Cassandra

What's wrong with secondary indexes

They work on fixed column names

They are consistent with data

This means they live near the data they index

This means they are distributed between nodes by row key, not by indexed column value

This means you need to ask every node to get a single value

Page 23: Practical Cassandra

What's wrong with secondary indexes

Node 1

A: phone=1

B: phone=3

Phone index:

1=A,3=B

Node 2

C: phone=3

D: phone=5

Phone index:

3=C,5=D

Node 4

G: phone=3

H: phone=7

Phone index:

3=G,7=H

Node 3

E: phone=1

F: phone=5

Phone index:

1=E,5=F
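The four-node picture above can be replayed in code: each node indexes only its own rows, so a lookup by phone has to visit all of them. The dict layout and the helper names are assumptions made for this sketch.

```python
# Rows as placed on each node by row key (data from the diagram above).
nodes = {
    "Node 1": {"A": 1, "B": 3},
    "Node 2": {"C": 3, "D": 5},
    "Node 3": {"E": 1, "F": 5},
    "Node 4": {"G": 3, "H": 7},
}

def local_index(rows):
    """Per-node index: phone value -> row keys stored on that node only."""
    index = {}
    for key, phone in rows.items():
        index.setdefault(phone, []).append(key)
    return index

def find_by_phone(nodes, phone):
    """Ask every node, because any of them may hold a matching row."""
    contacted, hits = 0, []
    for rows in nodes.values():
        contacted += 1
        hits.extend(local_index(rows).get(phone, []))
    return sorted(hits), contacted

print(find_by_phone(nodes, 3))   # (['B', 'C', 'G'], 4)
print(find_by_phone(nodes, 7))   # (['H'], 4): still four nodes for one row
```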

Page 24: Practical Cassandra

“Index” example

Column family people

Key: Fred [phone=2223355, phone2=4445566, fax=9998877]

Key: John [phone=4445566, mobile=099123456]

Column family phone_directory

Key: 2223355 [Fred]

Key: 4445566 [Fred, John]

Key: 9998877 [Fred]

Key: 099123456 [John]
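Maintaining such an index from application code means every write goes to both column families. A minimal sketch, with in-memory dicts standing in for the two column families and a hypothetical `add_number` helper:

```python
people = {}            # stands in for column family "people"
phone_directory = {}   # stands in for column family "phone_directory"

def add_number(person, column, number):
    """Write the data row and the index row together; the application owns both."""
    people.setdefault(person, {})[column] = number
    # The person's name is the column name; the value stays empty.
    phone_directory.setdefault(number, {})[person] = ""

add_number("Fred", "phone", "2223355")
add_number("Fred", "phone2", "4445566")
add_number("Fred", "fax", "9998877")
add_number("John", "phone", "4445566")
add_number("John", "mobile", "099123456")

# One lookup by row key, served by whichever node owns that key.
print(sorted(phone_directory["4445566"]))   # ['Fred', 'John']
```

The two writes are not atomic together, so a failure between them leaves the index out of step with the data until the application fixes it.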

Page 25: Practical Cassandra

“Join” example

Column family customer

Key: Boeing [email: [email protected]]

Key: Oracle [skype: java]

Column family orders

Key: 1 [customer: Boeing, total: 200m]

Key: 2 [customer: Oracle, total: 300m]

Key: 3 [customer: Boeing, total: 500m]

Column family customer_order_totals

Key: Boeing [1: 200m, 3: 500m]

Key: Oracle [2: 300m]
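The "join" is precomputed at write time: recording an order also updates the per-customer row, so reading a customer's orders is a single-key lookup. The dicts and the `record_order` helper below are illustrative assumptions, not an API.

```python
orders = {}                  # stands in for column family "orders"
customer_order_totals = {}   # stands in for column family "customer_order_totals"

def record_order(order_id, customer, total):
    """Store the order and, in the same step, the denormalized per-customer copy."""
    orders[order_id] = {"customer": customer, "total": total}
    customer_order_totals.setdefault(customer, {})[order_id] = total

record_order(1, "Boeing", "200m")
record_order(2, "Oracle", "300m")
record_order(3, "Boeing", "500m")

print(customer_order_totals["Boeing"])   # {1: '200m', 3: '500m'}
```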

Page 26: Practical Cassandra

Peer-to-peer replication

Your operation can return OK even if it was not written to every replica

Hinted handoff will try to repair later

Even if your operation has failed, it may have been written to some replicas

This inconsistency won't be repaired automatically

These are the drawbacks of the “no master” architecture

You need to repair regularly!

Page 27: Practical Cassandra

Tombstones and Repair

Delete events are recorded as Tombstones to ensure that “before delete” data arriving later won't be used

Regular repair makes sure not only that your data is replicated, but also that your deletes are replicated. If you don't repair, beware of ghosts!
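A ghost can be sketched with two replicas of one column. The cell layout, the `reconcile` and `read` helpers and the marker value are assumptions for illustration; the sketch also assumes that a tombstone is eventually discarded by the node that holds it.

```python
TOMBSTONE = "<deleted>"   # hypothetical marker for a delete event

def reconcile(a, b):
    """Each argument is a (value, timestamp) cell or None; the newest cell wins."""
    if a is None or b is None:
        return a or b
    return a if a[1] >= b[1] else b

def read(cell):
    """What the client sees: tombstoned or missing cells read as absent."""
    return None if cell is None or cell[0] is TOMBSTONE else cell[0]

stale = ("Hello", 1)          # replica that never received the delete
deleted = (TOMBSTONE, 2)      # replica that recorded the delete

# While the tombstone exists, it beats the older value.
print(read(reconcile(deleted, stale)))   # None

# If the tombstone is discarded before repair spread it, the old value comes back.
print(read(reconcile(None, stale)))      # Hello
```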

Page 28: Practical Cassandra

Resources & Environment

Disk space requirements

Memory requirements

Native plugins & configuration

Page 29: Practical Cassandra

Disk estimations

Say, we've got 1TB of data

Replication factor 3 makes it 3TB

Data duplication makes it 12TB

Tombstones/repair space makes it 24TB

Backups make it 36TB
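The chain above is a product of multipliers. In the sketch below the factors 4, 2 and 1.5 are simply read off the slide's numbers (3TB to 12TB, 12TB to 24TB, 24TB to 36TB); they are this example's figures, not general constants, and the function name is hypothetical.

```python
def disk_estimate_tb(raw_tb, replication_factor=3, duplication=4,
                     repair_headroom=2, backup=1.5):
    """Multiply raw data size by each overhead factor in turn."""
    return raw_tb * replication_factor * duplication * repair_headroom * backup

print(disk_estimate_tb(1))   # 36.0
```

Plugging in your own duplication factor matters most: it depends on how many precalculated copies of the data your queries need.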

Page 30: Practical Cassandra

Memory estimations

Cassandra has certain in-memory structures that grow linearly with the amount of data

Key and Row caches – configured at column family level. Change defaults if you've got a lot of CFs

Bloom filters and key samples cache are configured globally in latest versions

As a minimum, estimate RAM at ~0.5% of your data amount

Page 31: Practical Cassandra

Native specifics

Cassandra (like many other large things) likes JNA. Please install it.

Cassandra maps files to memory, so the Cassandra process's virtual and resident memory size will grow because of mmap.

Default heap sizes are large. Tame them if Cassandra is not the only task on the host

Page 32: Practical Cassandra

Q&A

Author: Vitalii [email protected]

@tivv00