Apache Cassandra at Talkbits Max Alexejev Moscow Cassandra Users Group 25 April 2013

DESCRIPTION

Apache Cassandra at TalkBits at the Moscow Cassandra Users Group (http://www.meetup.com/Moscow-Cassandra-Users/).

Page 1: Apache Cassandra at TalkBits

Apache Cassandra at Talkbits

Max Alexejev

Moscow Cassandra Users Group

25 April 2013

Page 2: Apache Cassandra at TalkBits

What is Talkbits?

Maxim Alexejev
Create "JavaOne Moscow 2013" geo-channel?
Page 3: Apache Cassandra at TalkBits

Talkbits backend

Recursive call

Page 4: Apache Cassandra at TalkBits

Talkbits backend deployment diagram

Page 5: Apache Cassandra at TalkBits

Cassandra in EC2 at Talkbits

NetworkTopologyStrategy + Ec2MultiRegionSnitch

1 DC, 3 racks (availability zones in one EC2 region), N nodes per rack. 3N nodes total.

Data stored in 3 local copies, 1 per zone.

Write with LOCAL_QUORUM, read with ONE or TWO.

m1.large nodes (2 cores, 4 ECU, 7.5 GB RAM).

The transaction log and data files both sit on a RAID 0 array of two ephemeral drives. This works for SSD or EC2 disks only!

Other typical setup options for EC2:

m1.xlarge (15 GB) / m2.4xlarge (68.4 GB) / hi1.4xlarge (SSD) nodes

EBS-backed data volumes (not recommended; use for development only).
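
The replication layout on this slide (one data center, three copies) is set when the keyspace is created. The sketch below only builds the CQL text for such a keyspace; it does not connect to a cluster. The keyspace name `talkbits_demo` and the data center name `us-east` are hypothetical placeholders, since the slides do not name the keyspace or the region in use.

```python
def create_keyspace_cql(keyspace, replicas_per_dc):
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy.

    replicas_per_dc maps a data center name to its replication factor.
    """
    if not replicas_per_dc:
        raise ValueError("at least one data center is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    # Sort the data centers so the generated statement is deterministic.
    for dc_name, replication_factor in sorted(replicas_per_dc.items()):
        options.append("'%s': %d" % (dc_name, replication_factor))
    return "CREATE KEYSPACE %s WITH replication = {%s};" % (
        keyspace,
        ", ".join(options),
    )


# Hypothetical names: one data center holding 3 copies, one per rack.
statement = create_keyspace_cql("talkbits_demo", {"us-east": 3})
print(statement)
```

With three racks and a replication factor of 3, NetworkTopologyStrategy tries to place each copy on a different rack, which matches the "1 per zone" layout described above.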

Page 6: Apache Cassandra at TalkBits

Cassandra consistency options

Definitions

N, R, W settings from Amazon Dynamo.

N – replication factor. Set per keyspace on keyspace creation.

Quorum: N / 2 + 1 (rounded down)

RW consistency options:

ANY, ONE, TWO, THREE, QUORUM, LOCAL_QUORUM & EACH_QUORUM (multi-dc), ALL.

Set per query.
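
The quorum formula above can be checked with a few lines of Python. This is a minimal sketch; the function name `quorum` is chosen here for illustration only.

```python
def quorum(replication_factor):
    """Return the quorum size for a replication factor N: N / 2 + 1, rounded down."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    # Integer division performs the rounding down.
    return replication_factor // 2 + 1


for n in (1, 2, 3, 5, 6):
    print("N=%d -> quorum=%d" % (n, quorum(n)))
```

For the replication factor of 3 used at Talkbits, a quorum is 2 replicas, so one replica can be unavailable without failing quorum reads or writes.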

Page 7: Apache Cassandra at TalkBits

Cassandra consistency semantics

W + R > N

Ensures strong consistency. Read will always reflect the most recent write.

R = W = [LOCAL_]QUORUM

Strong consistency. See quorum definition and formula above.

W + R <= N

Eventual consistency.

W = 1

Good for fire-and-forget writes: logs, traces, metrics, page views, etc.
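
The W + R > N rule can be applied to concrete consistency levels. The sketch below assumes a single data center, so LOCAL_QUORUM and QUORUM count the same replicas; ANY and EACH_QUORUM are left out because they do not map to a simple replica count in this model. The function names are illustrative, not part of any Cassandra API.

```python
def replicas_required(level, n):
    """Number of replicas that must acknowledge a request at a given level.

    Assumes one data center with replication factor n.
    """
    fixed = {"ONE": 1, "TWO": 2, "THREE": 3}
    if level in fixed:
        return fixed[level]
    if level in ("QUORUM", "LOCAL_QUORUM"):
        return n // 2 + 1
    if level == "ALL":
        return n
    raise ValueError("unsupported consistency level: %s" % level)


def is_strongly_consistent(write_level, read_level, n):
    """True when W + R > N, so every read overlaps the latest write."""
    w = replicas_required(write_level, n)
    r = replicas_required(read_level, n)
    return w + r > n


n = 3
for read_level in ("ONE", "TWO"):
    strong = is_strongly_consistent("LOCAL_QUORUM", read_level, n)
    print("write LOCAL_QUORUM, read %s: %s" % (
        read_level, "strong" if strong else "eventual"))
```

With N = 3 and LOCAL_QUORUM writes (W = 2), reading with ONE gives W + R = 3, which is eventual consistency, while reading with TWO gives W + R = 4, which is strong consistency.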

Page 8: Apache Cassandra at TalkBits

Cassandra backups to S3

Full backups

• Periodic snapshots (daily, weekly)

• Remove from local disk after upload to S3 to prevent disk overflow

Incremental backups

• SSTables are compressed and copied to S3

• Happens on IN_MOVED_TO, IN_CLOSE_WRITE events

• Don't turn on with leveled compaction (huge network traffic to S3)

Continuous backups

• Compress and copy the transaction log to S3 at short time intervals (for example 5, 30, or 60 minutes)
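
The compress-then-copy step shared by the incremental and continuous backups can be sketched with the Python standard library. This is an illustration under stated assumptions, not how any particular tool does it: `upload` is a hypothetical callback standing in for the real S3 client call, and the staging directory and function names are made up for the example.

```python
import gzip
import os
import shutil


def compress_for_upload(src_path, staging_dir):
    """Gzip src_path into staging_dir and return the compressed file's path."""
    os.makedirs(staging_dir, exist_ok=True)
    dst_path = os.path.join(staging_dir, os.path.basename(src_path) + ".gz")
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return dst_path


def backup_file(src_path, staging_dir, upload):
    """Compress a file, hand it to upload(), then delete the staged copy.

    upload is a hypothetical callable that takes a local path and copies
    the file to remote storage. If it raises, the staged copy is kept so
    the upload can be retried.
    """
    compressed_path = compress_for_upload(src_path, staging_dir)
    upload(compressed_path)
    # Remove the staged copy only after a successful upload,
    # so backups do not fill the local disk.
    os.remove(compressed_path)
    return os.path.basename(compressed_path)
```

The original file is never modified or removed here, because a live SSTable or log segment still belongs to Cassandra; only the compressed staging copy is cleaned up.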

Page 9: Apache Cassandra at TalkBits

Cassandra backups to S3 - tools

TableSnap from SimpleGeo

https://github.com/Instagram/tablesnap (most up-to-date fork)

The whole tool is three simple Python scripts (tablesnap, tableslurp, tablechop). It uploads SSTables in real time, restores them, and removes old backup uploads from S3.

Priam from Netflix

https://github.com/Netflix/Priam

Full-blown web application. Requires a servlet container to run and depends on the Amazon SimpleDB service for distributed token management.

Page 10: Apache Cassandra at TalkBits

Contacts

Max Alexejev

http://ru.linkedin.com/pub/max-alexejev/51/820/ab9

http://www.slideshare.net/MaxAlexejev/

[email protected]