October 18, 2013 #CassandraEU
Playlists at Spotify
Using Cassandra to store version controlled objects at large scale
Jimmy Mårdell <[email protected]>
#CassandraEU Intro
About me
• Jimmy Mårdell
• Software Engineer
• 3 years at Spotify
About Spotify
• 24 million active users
  – 6 million paying subscribers
• 4 000 servers in 4 data centers
• Over 1 billion playlists created
Contents
• Why version control?
• Playlists at Spotify
• Cassandra data model
• Lessons learned
#CassandraEU Why version control?
What is version control?
• “Version control is the management of changes to documents” (Wikipedia)
• Stand-alone (most common)
  – Git, Subversion etc.
• Embedded
  – Google Docs
Embedded usage
• Collaborative editing
• Undo functionality
• Performance
• Business logic depends on document history
#CassandraEU Playlists at Spotify
Playlists
Playlist challenges
• More than 1 billion playlists
• >40 000 requests/second at peak
• Offline mode
• Concurrent changes
Playlist client-server
• Every playlist is a version controlled object
• All playlists are synced on login
  – Fetch all new changes
• Local queue of playlist modifications
  – Clients optimistically accept changes - fast UI
• Queue flushed to server when possible
  – Offline changes
  – Fault tolerant
Playlist version control
Representation of a playlist in the backend

Revision     Change                          Contents
0,ROOT                                       (empty)
1,4ed2...    ADD(ix=0, track=A,B,C)          A B C
2,19ca...    MOV(from=2, to=1, len=1)        A C B
3,038f...    REM(from=2, len=1)              A C
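
The change log above can be replayed to rebuild the playlist contents. The following is a minimal sketch of that replay, not Spotify's implementation: the dictionary layout and the function names are made up for illustration, and it assumes that the `to` index of a MOV refers to the list after the moved tracks have been cut out (which matches the A B C → A C B step on the slide).

```python
def apply_op(tracks, op):
    """Return a new track list with one change applied."""
    result = list(tracks)
    kind = op["type"]
    if kind == "ADD":
        # Insert the new tracks starting at index ix.
        result[op["ix"]:op["ix"]] = op["tracks"]
    elif kind == "MOV":
        # Cut `len` tracks starting at `from`, then re-insert them at `to`
        # (assumption: `to` is an index into the list after the cut).
        start, length = op["from"], op["len"]
        moved = result[start:start + length]
        del result[start:start + length]
        result[op["to"]:op["to"]] = moved
    elif kind == "REM":
        # Delete `len` tracks starting at `from`.
        start, length = op["from"], op["len"]
        del result[start:start + length]
    else:
        raise ValueError(f"unknown operation: {kind}")
    return result


def replay(history):
    """Rebuild the playlist contents by applying every change from ROOT."""
    tracks = []
    for op in history:
        tracks = apply_op(tracks, op)
    return tracks


history = [
    {"type": "ADD", "ix": 0, "tracks": ["A", "B", "C"]},
    {"type": "MOV", "from": 2, "to": 1, "len": 1},
    {"type": "REM", "from": 2, "len": 1},
]
print(replay(history))  # ['A', 'C']
```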
Playlist branching
• Concurrent changes
  – Offline
• Conflict resolution
  – Operational Transformation
• Clients oblivious of branches
[Diagram: concurrent changes A and B on separate branches are transformed into A’ and B’ and merged]
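
To make the idea of Operational Transformation concrete, here is a small sketch for the simplest case: two concurrent ADD operations against the same base revision. It is an illustration under assumptions, not the transformation rules used by the playlist service; the `(index, tracks)` tuple format, the function names and the tie-breaking flag are all hypothetical.

```python
def apply_add(tracks, op):
    """Apply an ADD given as (index, new_tracks)."""
    ix, new_tracks = op
    return tracks[:ix] + new_tracks + tracks[ix:]


def transform_add(op, other, op_wins_ties):
    """Rewrite `op` so that it can be applied after the concurrent `other`."""
    ix, new_tracks = op
    other_ix, other_tracks = other
    if ix < other_ix or (ix == other_ix and op_wins_ties):
        # `other` inserted at or after our position: nothing shifts.
        return op
    # `other` inserted before our position: shift right by its length.
    return (ix + len(other_tracks), new_tracks)


base = ["X", "Y"]
a = (1, ["A"])  # change A, made on one branch
b = (1, ["B"])  # change B, made concurrently on another branch

a_prime = transform_add(a, b, op_wins_ties=True)
b_prime = transform_add(b, a, op_wins_ties=False)

# Both orders converge on the same merged playlist.
via_a = apply_add(apply_add(base, a), b_prime)
via_b = apply_add(apply_add(base, b), a_prime)
print(via_a, via_b)
```

The tie-break has to be decided consistently for both operations (here A always goes first); otherwise the two orders would not converge.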
#CassandraEU Cassandra data model
Cassandra data model
Cassandra at Spotify
• Playlist first system to use Cassandra
  – Now we use it a lot...
• Started with Cassandra 0.7
• Using limited set of Cassandra features
  – No super columns
  – No CQL
Planning a data model
• Start with the queries!
• Three common playlist queries
  – SYNC: Get all changes since a particular revision
  – GET: Get the most recent snapshot
  – APPEND: Add/move/delete tracks
Playlist data model
CF playlist_change

Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
  Column       Value
  1,4ed2...    parent=0,ROOT       op=ADD(ix=0, track=A,B,C)
  2,19ca...    parent=1,4ed2...    op=MOV(from=2, to=1, len=1)
  3,038f...    parent=2,19ca...    op=REM(from=2, len=1)

Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
  Column       Value
  1,8a20...    parent=0,ROOT       op=...
  2,b783...    parent=1,8a20...    op=...
  2,dd07...    parent=1,8a20...    op=...
  3,39ef...    parent=2,dd07...    op=...
  3,5a9c...    parent=2,b783...    op=...
  4,03fc...    parent=3,39ef...    parent=3,5a9c...
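
Because the column names start with the revision's sequence number, the SYNC query ("all changes since a particular revision") can be read as a slice over one row. The sketch below models a playlist_change row as a sorted Python list to show the idea; it is not Cassandra client code, the `(sequence, hash)` tuple encoding of column names is an assumption, and it ignores the question of which branch a change belongs to.

```python
from bisect import bisect_right

# One playlist_change row: columns kept sorted by (sequence number, hash).
row = [
    ((1, "4ed2"), {"parent": (0, "ROOT"), "op": "ADD(ix=0, track=A,B,C)"}),
    ((2, "19ca"), {"parent": (1, "4ed2"), "op": "MOV(from=2, to=1, len=1)"}),
    ((3, "038f"), {"parent": (2, "19ca"), "op": "REM(from=2, len=1)"}),
]


def changes_since(row, revision):
    """Return every column that sorts strictly after `revision`."""
    names = [name for name, _ in row]
    return row[bisect_right(names, revision):]


for name, value in changes_since(row, (1, "4ed2")):
    print(name, value["op"])
```

A client that is at revision 1,4ed2... only receives the MOV and REM changes; a client that is already at 3,038f... receives nothing.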
Playlists in Cassandra
• Which revision is the latest?
  – Changes with no children
• Multiple heads possible!
  – Heads may appear anywhere within the row
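
"Changes with no children" can be computed from the parent pointers alone. The following sketch does that for a version graph shaped like the second example row; the dictionary representation and the function name are illustrative assumptions, and the revision ids are shortened.

```python
def find_heads(changes):
    """Return the revisions that no other revision names as a parent."""
    referenced = {parent for parents in changes.values() for parent in parents}
    return sorted(rev for rev in changes if rev not in referenced)


# Two clients appended to 1,8a20 concurrently: the history has branched.
branched = {
    "1,8a20": ["0,ROOT"],
    "2,b783": ["1,8a20"],
    "2,dd07": ["1,8a20"],
}

# Later both branches are extended and then merged into 4,03fc.
merged = dict(branched)
merged.update({
    "3,39ef": ["2,dd07"],
    "3,5a9c": ["2,b783"],
    "4,03fc": ["3,39ef", "3,5a9c"],
})

print(find_heads(branched))  # ['2,b783', '2,dd07']
print(find_heads(merged))    # ['4,03fc']
```

Note that this approach has to look at every column in the row before it can name a head.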
Playlist data model

CF playlist_change

Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
  Column       Value
  1,4ed2...    parent=0,ROOT       op=...
  2,19ca...    parent=1,4ed2...    op=...
  3,038f...    parent=2,19ca...    op=...

CF playlist_head

Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
  Column
  3,038f...
Playlist data model

CF playlist_change

Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
  Column       Value
  1,4ed2...    parent=0,ROOT       op=...
  2,19ca...    parent=1,4ed2...    op=...
  3,038f...    parent=2,19ca...    op=...

Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
  Column       Value
  1,8a20...    parent=0,ROOT       op=...
  2,b783...    parent=1,8a20...    op=...
  2,dd07...    parent=1,8a20...    op=...

CF playlist_head

Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
  Column
  3,038f...

Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
  Columns
  2,b783...    2,dd07...
Playlist data model

CF playlist_change

Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
  Column       Value
  1,4ed2...    parent=0,ROOT       op=...
  2,19ca...    parent=1,4ed2...    op=...
  3,038f...    parent=2,19ca...    op=...

Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
  Column       Value
  1,8a20...    parent=0,ROOT       op=...
  2,b783...    parent=1,8a20...    op=...
  2,dd07...    parent=1,8a20...    op=...
  3,39ef...    parent=2,dd07...    op=...
  3,5a9c...    parent=2,b783...    op=...
  4,03fc...    parent=3,39ef...    parent=3,5a9c...

CF playlist_head

Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
  Column
  3,038f...

Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
  Column
  4,03fc...
Playlist heads
• playlist_head is a small CF
  – Fits in RAM
• 95% of playlist requests only read from playlist_head
  – Most playlists are already up-to-date
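
The fast path described above can be sketched as follows: if the client's revision is already the only head, the request is answered from playlist_head alone and playlist_change is never read. The function names and the callback for loading changes are hypothetical; the slides do not show how the service structures this check.

```python
def sync(client_revision, heads, load_changes):
    """Return the changes the client is missing, reading as little as possible.

    heads: set of head revisions read from playlist_head.
    load_changes: callable that reads playlist_change (the expensive path).
    """
    if heads == {client_revision}:
        # Already up to date: playlist_head was the only read needed.
        return []
    return load_changes(client_revision)


reads_from_playlist_change = []


def load_changes(since):
    # Stand-in for a column slice on playlist_change.
    reads_from_playlist_change.append(since)
    return ["REM(from=2, len=1)"]


print(sync("3,038f", {"3,038f"}, load_changes))  # []
print(sync("2,19ca", {"3,038f"}, load_changes))  # ['REM(from=2, len=1)']
print(reads_from_playlist_change)                # ['2,19ca']
```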
Playlist snapshots
• playlist_change works well when syncing playlists
• Not so well for fetching new playlists
  – Snapshot cache
Playlist data model

CF playlist_change

Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
  Column       Value
  1,4ed2...    parent=0,ROOT       op=...
  2,19ca...    parent=1,4ed2...    op=...
  3,038f...    parent=2,19ca...    op=...

Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
  Column       Value
  1,8a20...    parent=0,ROOT       op=...
  2,b783...    parent=1,8a20...    op=...
  2,dd07...    parent=1,8a20...    op=...

CF playlist_snapshot

Row key: spotify:user:spotify:playlist:3ZgmfR6lsnCwdffZUan8EA
  Column       Value
  cache        version=3,038f...   contents=A,C

Row key: spotify:user:yarin:playlist:4Pj4dCOEEYWDixfYyJwxEf
  Column       Value
  cache        version=2,b783...   contents=...
Updating playlists
• Validate change
  – Locate snapshot
  – Client may append to old version
• Update all tables
  – playlist_head last
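
One way to read "playlist_head last" is that the head should never point at a change that has not been stored yet. The sketch below models the three column families as in-memory dictionaries to show that ordering; the class, its method names and the validation rule (the parent must already exist) are assumptions made for illustration, not the service's actual code.

```python
class PlaylistStore:
    """In-memory stand-in for the three column families."""

    def __init__(self):
        self.playlist_change = {}    # (playlist, revision) -> (parent, op)
        self.playlist_snapshot = {}  # playlist -> (revision, contents)
        self.playlist_head = {}      # playlist -> set of head revisions

    def write_snapshot(self, playlist, revision, contents):
        self.playlist_snapshot[playlist] = (revision, contents)

    def append(self, playlist, parent, revision, op, contents):
        # Validate: the change must build on a revision we know about.
        known = parent == "0,ROOT" or (playlist, parent) in self.playlist_change
        if not known:
            raise ValueError("unknown parent revision")
        # Store the change and the snapshot first ...
        self.playlist_change[(playlist, revision)] = (parent, op)
        self.write_snapshot(playlist, revision, contents)
        # ... and move the head last, so readers that start from
        # playlist_head only ever see revisions that are fully written.
        heads = self.playlist_head.setdefault(playlist, set())
        heads.discard(parent)
        heads.add(revision)


store = PlaylistStore()
store.append("p", "0,ROOT", "1,4ed2", "ADD(ix=0, track=A,B,C)", ["A", "B", "C"])
store.append("p", "1,4ed2", "2,19ca", "MOV(from=2, to=1, len=1)", ["A", "C", "B"])
print(store.playlist_head["p"])  # {'2,19ca'}
```

If a write fails before the head is updated, the head still names the previous revision and the half-written change is simply not reachable from it.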
Cassandra consistency levels
• Replication factor 3
• All writes using CL_QUORUM
• Reads from playlist_head
  – CL_QUORUM
• Reads from playlist_change and playlist_snapshot
  – CL_ONE but may fall back to CL_QUORUM
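
The arithmetic behind these choices: with replication factor 3 a quorum is 2 replicas, and a read is guaranteed to overlap the latest write only when the number of replicas written plus the number read exceeds the replication factor. The sketch below works that out and adds a hypothetical fallback helper. The slides do not say what triggers the fallback, so the rule used here (the wanted revision is missing from the CL_ONE result) is an assumption.

```python
def quorum(replication_factor):
    """Number of replicas that form a majority."""
    return replication_factor // 2 + 1


def read_sees_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must overlap every write set."""
    return write_replicas + read_replicas > replication_factor


RF = 3
QUORUM = quorum(RF)
print(QUORUM)                                       # 2
print(read_sees_latest_write(RF, QUORUM, QUORUM))   # True
print(read_sees_latest_write(RF, QUORUM, 1))        # False


def read_with_fallback(read, wanted_revision):
    """Try a cheap CL_ONE read first; retry at CL_QUORUM if it looks stale.

    read: callable taking a consistency level name and returning the
    revisions found (hypothetical stand-in for a Cassandra read).
    """
    columns = read("ONE")
    if wanted_revision not in columns:
        columns = read("QUORUM")
    return columns
```

Since playlist_head is read at CL_QUORUM, the service knows which revision to expect, which is what makes a stale CL_ONE read detectable in this sketch.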
#CassandraEU Lessons learned
Lessons learned
Optimizations
• Leveled compaction
  – Improved performance a lot
• Compression
  – Not as impressive
  – CRC checks
• Trusted Linux page cache to ensure playlist_head kept in RAM
  – Didn’t work
• Tried Cassandra row cache
  – NO!
• mlock to the rescue
An enterprise ready solution
bash# while true; do
        vmtouch -m 10000000000 -l *head* &
        sleep 10m
        kill %vmtouch
      done
No moving parts
• Flash disks are awesome
• Reduced size of cluster from 60 to 30 nodes
  – Thanks FusionIO!
• IOPS no longer the bottleneck
Tombstone hell
• Noticed requests to playlist_head took several seconds
  – Huh?
• Every change causes a value to be deleted in playlist_head
• playlist_head is essentially a queue
  – Well-known anti-pattern
• We had rows with >500,000 tombstones
• Solution: major compaction
  – Relatively fast since playlist_head is in RAM
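
Why a queue-like row gets slow can be shown with a toy model: every head update deletes the old column, the delete leaves a tombstone, and a read has to step over all tombstones in the row to find the one live column. This is a deliberately simplified simulation with made-up names. In particular, real Cassandra only purges tombstones during compaction once the configured grace period has passed, which the `compact` method here ignores.

```python
class HeadRow:
    """Toy model of one playlist_head row with tombstones."""

    def __init__(self):
        self.cells = {}  # column name -> True (live) or False (tombstone)

    def set_head(self, old_revision, new_revision):
        if old_revision is not None:
            self.cells[old_revision] = False  # delete leaves a tombstone
        self.cells[new_revision] = True

    def read(self):
        """Return (live columns, number of cells scanned to find them)."""
        scanned = len(self.cells)
        live = [name for name, alive in self.cells.items() if alive]
        return live, scanned

    def compact(self):
        """Drop tombstones (simplified: no grace period)."""
        self.cells = {name: alive for name, alive in self.cells.items() if alive}


row = HeadRow()
previous = None
for seq in range(1, 1001):
    revision = f"{seq},rev"
    row.set_head(previous, revision)
    previous = revision

print(row.read())  # (['1000,rev'], 1000)
row.compact()
print(row.read())  # (['1000,rev'], 1)
```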
And more...
• Large rows in playlist_change
  – Modify version graph
• Reduce number of requests
  – Group playlists by owner
Sounds interesting? We’re hiring!
Questions?