Cassandra Core Concepts
and why Netflix runs Cassandra on the cloud
Erick Ramirez @flightc, DataStax Engineering
Erick Ramirez© 2015 DataStax, All Rights Reserved.
@flightc
Welcome
• Introducing Cassandra
• Why Netflix runs Cassandra on the cloud
• Feel free to ask questions
Relational data model
• Normalised schema, table joins, ACID
• Joins are very expensive on billions of rows
• Sharding tables across systems is complex
• Performance preferred over “always on”
• Requires massive high-end systems
Big data requirements
• Distribute data across multiple nodes
• Relaxed consistency
• Relaxed schema
• Scale, scale, scale!
NoSQL landscape
• Graph, Key-value, Document, Column family
• Consistency - same result regardless of node
• Availability - high read/write volumes
• Partition tolerance - survive network isolation
CAP theorem
What is Cassandra?
• Massively scalable NoSQL database
• Fully distributed, no single point of failure
• Open sourced by Facebook
• Linear horizontal scaling
Modelling Cassandra
• Use Cassandra Query Language (CQL)
• SQL-like syntax
  • CREATE, ALTER, DROP
  • SELECT, INSERT, UPDATE, DELETE
Modelling Cassandra
CREATE TABLE users (
    userid text,
    name text,
    email text,
    PRIMARY KEY (userid)
);
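As a minimal sketch of the data-manipulation statements listed earlier, the following CQL works against the users table defined above. The row values ('jsmith', the names and email addresses) and the added country column are made up for illustration and do not come from the original slides.

```cql
-- Hypothetical sample row; the values are illustrative only.
INSERT INTO users (userid, name, email)
VALUES ('jsmith', 'Jane Smith', 'jane@example.com');

-- Read the row back by its primary key (userid).
SELECT name, email FROM users WHERE userid = 'jsmith';

-- Change a single column; the row is identified by its primary key.
UPDATE users SET email = 'jane.smith@example.com' WHERE userid = 'jsmith';

-- Add a column to the existing table (hypothetical column name).
ALTER TABLE users ADD country text;

-- Remove the row.
DELETE FROM users WHERE userid = 'jsmith';
```

Every read and write here names the primary key in the WHERE clause, which is how Cassandra locates the node that owns the row.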
Why Cassandra
• All nodes are the same - no SPOF
• Real-time, durable writes
• Linear scaling on commodity servers
• Real-time replication across data centres
• Always on - no offline operation
• Because you have a scale problem
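Replication across data centres is configured per keyspace. The sketch below is one way it might look, assuming a hypothetical keyspace called demo and two data centres named dc1 and dc2; none of these names appear in the original slides, and the data centre names must match whatever the cluster's snitch reports.

```cql
-- Hypothetical keyspace and data centre names, for illustration only.
-- Keeps 3 replicas of every row in each of the two data centres.
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc1': 3,
    'dc2': 3
  };
```

Tables created in this keyspace inherit its replication settings, so the application needs no replication logic of its own.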
Why not Cassandra
• RDBMS excels in ACID transactions
• You need to justify your purchase of massive high-end servers
Common use cases
• Personalisation/recommendations (Netflix, eBay)
• Messaging (Instagram)
• IoT (Riptide IO)
• Fraud detection (Barracuda)
• Playlists and collections (Spotify)
• Graph (SpotRight)
A Cassandra cluster
Cassandra Summit 2015
academy.datastax.com
Thank you
• Erick Ramirez
• @flightc