Lessons learned: billions of contact records from SQL to Cassandra
Billion Records from SQL to Cassandra, lessons learned
DuyHai Doan, Brice Dutheil
#CassandraSummit @doanduyhai @BriceDutheil
Who are we?
Brice Dutheil
• Mockito
• Java Track Lead @ Devoxx France
• Independent contractor @ Libon (Orange-Vallée)
DuyHai Doan
• Achilles
• Cassandra Technical Advocate
• Former Java Developer @ Libon
Agenda
• Libon context
• Migration strategy
• Business code migration
• Data Modeling
• Take Away
Libon Context
What is Libon?
• Messaging app
• VOIP (out)
• Custom voicemail & greetings
• SMS/chat/file transfer
• Contacts matching
Contact Matching
[Diagram, built in four steps: Libon User; Friend; Contact matching; Accept link]
Project Context
• Application grew over the years
• Already using Cassandra to handle events
• messaging / file sharing / SMS / notifications
• Cassandra R/W latencies ≈ 0.4 ms
• server response time under 10 ms
Project Context
• About contacts …
• stored as relational model in RDBMS (Oracle)
• 1 user ≈ 300 contacts
• with millions of users ☞ billions of contacts to handle
• query latency unpredictable
Fixing the problem
• Tune the RDBMS
• indices
• partitioning
• fewer joins, simplified relational model
• hardware capacity increased
That worked, but …
[Diagram: the back-end application sits on top of both the RDBMS and Cassandra]
Next Challenges
• High Availability (DB failure, site failure …)
• Predictable performance at scale
• Going to multiple data centers
Going for Cassandra
• Denormalize (if possible …)
• Know your business ☞ know your queries
• Linear scaling out
• Consistent performance
Data Migration Strategy
Objectives
• No downtime
• No concurrency corner-cases
• Safe rollback possible
• Replay-ability & resume-ability
Strategy
• 3 phases
• Write contacts to both data stores
• Old contacts migration
• Switch to Cassandra …
• … and deprecate SQL
Migration Phase 1
[Diagram, write path: the back-end server writes contacts to both the SQL database and the Cassandra (C*) cluster. SQL rows carry contactId (long) + contactUUID, e.g. contactId 129363 with contactUUID 123e4567-e89b-12d3…; Cassandra receives the contactUUID.]
[Diagram, read path: back-end server, SQL database and C* cluster.]
Migration Phase 2
• On live production, migrate old contacts (old contacts created before phase 1)
For each batch of users:
SELECT * FROM contacts WHERE user_id = … AND contact_uuid IS NULL
Logged batches of:
INSERT INTO contacts(..) VALUES(…) USING TIMESTAMP now() - 1 week
Migration Phase 2
USING TIMESTAMP now() - 1 week 😳
Migration Phase 2
• During data migration …
• … concurrent writes from the migration batch …
• … and updates from production for the same contact
Migration Phase 2
Two versions of the name cell for the same contact_uuid:
• name (now - 1 week) = Johny ☞ insert from batch (to the past)
• name (now) = Johnny ☞ update from production
Future reads pick the most up-to-date value
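The timestamp trick above relies on per-cell "last write wins" reconciliation. The following is a minimal Java sketch of that rule, not Cassandra code: `LastWriteWinsCell` and `WriteToThePastDemo` are hypothetical names, timestamps are assumed to be microseconds since the epoch (the unit `USING TIMESTAMP` takes in CQL; `now() - 1 week` on the slide is shorthand), and tie-breaking for equal timestamps is left out.

```java
/** Toy model of per-cell "last write wins" reconciliation (illustrative only). */
final class LastWriteWinsCell {
    private String value;
    private long writeTimestampMicros = Long.MIN_VALUE;

    /** Keeps the incoming value only if it carries a newer timestamp. */
    void write(String newValue, long timestampMicros) {
        if (timestampMicros > writeTimestampMicros) {
            value = newValue;
            writeTimestampMicros = timestampMicros;
        }
    }

    String read() {
        return value;
    }
}

final class WriteToThePastDemo {
    static final long ONE_WEEK_MICROS = 7L * 24 * 60 * 60 * 1_000_000L;

    /** The production update arrives first, the migration insert arrives later. */
    static String productionThenBatch(long nowMicros) {
        LastWriteWinsCell name = new LastWriteWinsCell();
        name.write("Johnny", nowMicros);                  // update from production
        name.write("Johny", nowMicros - ONE_WEEK_MICROS); // insert from batch, written to the past
        return name.read();
    }

    /** The migration insert arrives first, the production update arrives later. */
    static String batchThenProduction(long nowMicros) {
        LastWriteWinsCell name = new LastWriteWinsCell();
        name.write("Johny", nowMicros - ONE_WEEK_MICROS); // insert from batch, written to the past
        name.write("Johnny", nowMicros);                  // update from production
        return name.read();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() * 1000L;
        System.out.println(productionThenBatch(now));
        System.out.println(batchThenProduction(now));
    }
}
```

Whatever the arrival order, the migrated value carries the older timestamp, so it cannot overwrite a concurrent production update.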
Migration Phase 2
"Write to the Past… to save the Future"
Libon – 2014/10/08
Migration Phase 3
[Diagram: back-end server, SQL database and C* cluster; one write path is crossed out (❌)]
Business Code Refactoring
Code Inventory
• Written for RDBMS
• Lots of joins (no surprise)
• Designed around transactions
• Spring @Transactional everywhere
Code Inventory cont.
• Entities go through Services & Repositories
[Diagram: ContactEntity passing through the Services and Repositories layers]

Code Inventory cont.
• Hibernate is auto-magic
• lazy loading
• 1st level cache
• N+1 select
Which options?
• Throw away existing code …
• … and re-design from scratch for Cassandra
No way!
Code Quality
• Existing business code has…
• … ≈ 3500 unit tests
• and ≈ 600+ integration tests

Code Quality
• We are TDD aficionados …
• … and we love our code coverage
Code Quality
"The code coverage is one of your most valuable technical assets"
Libon – since beginning
Refactoring Strategy
[Diagram, built in three steps. Step 1: the Services layer (ContactMatchingService, ContactService, ContactSync), the Repositories layer and ContactEntity, with n / 1 / n / n cardinalities. Step 2: a Proxy and ContactNoSQLEntity are added. Step 3: the denormalized tables Denorm1, Denorm2 … DenormN are added.]
Refactoring Strategy
• Use CQRS
• ContactReadRepository
• ContactWriteRepository
• ContactUpdateRepository
• ContactDeleteRepository

Refactoring Strategy
• ContactReadRepository
• direct sequential read
• no joins
• 1 read ≈ 1 SELECT

Refactoring Strategy
• ContactWriteRepository
• write to all denormalized tables
• using CQL logged batches
• use TTLs

Refactoring Strategy
• ContactUpdateRepository
• read-before-write most of the time 😟
• rare updates ☞ acceptable perf penalty

Refactoring Strategy
• ContactDeleteRepository
• delete
• update contact modification date
Outcome
• 5 months of work for 2 people
• Many iterations to fix bugs (thanks to IT)
• Lots of performance benchmarks using Gatling
[Figure: Gatling output]
☞ data model & code validation
• … we are almost there for production
Data Model
Denormalization, the good
• Supports fast reads
• 1 read ≈ 1 SELECT
• Worth it because mostly reads, few updates

Denormalization, the bad
• Updating mutable data can be a nightmare
• Data model bound by existing client-facing API
• Update paths very error-prone without tests
Data model in detail
Contacts_by_id
Contacts_by_identifiers
Contacts_in_profiles
Contacts_by_modification_date
Contacts_by_firstname_lastname
Contacts_linked_user
☞ user_id is always a component of the partition key
Scalable design
[Diagram, two builds: a ring of 8 nodes (n1 to n8) owning token ranges A to H, with partitions user_id1 to user_id5 placed around the ring]
Bloom filters in action
• For some tables, partition key = (user_id, contact_id)
☞ fast look-up, leverages Bloom filters
☞ touches 1 SSTable most of the time
Data model in detail
Contacts_by_id
Contacts_by_identifiers
Contacts_in_profiles
Contacts_by_modification_date
Contacts_by_firstname_lastname
Contacts_linked_user
[Annotations on the table list: "Wide partition", "Bucketed"]
A "queue" story
• contacts_by_modification_date
• queue-like pattern 😭
☞ buckets to the rescue
Partition user_id:2014-12 ☞ columns date35, date12, … …, date47
Partition user_id:2014-11 ☞ columns date11, date12, … …, date34
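The monthly buckets above can be illustrated with a short Java sketch. It assumes a bucket is identified by the user id plus a `yyyy-MM` month, mirroring the `user_id:2014-12` notation on the slide; `MonthBuckets`, `partitionKey` and `bucketsBetween` are hypothetical names, not code from the project.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.ArrayList;
import java.util.List;

/** Sketch of month-based bucketing for a per-user, time-ordered table. */
final class MonthBuckets {

    /** Label of the partition for one user and one month, e.g. "222130151:2014-12". */
    static String partitionKey(long userId, YearMonth month) {
        return userId + ":" + month; // YearMonth.toString() yields yyyy-MM
    }

    /** Buckets to query, oldest first, to read modifications from 'since' to 'until'. */
    static List<String> bucketsBetween(long userId, LocalDate since, LocalDate until) {
        List<String> keys = new ArrayList<>();
        YearMonth last = YearMonth.from(until);
        for (YearMonth m = YearMonth.from(since); !m.isAfter(last); m = m.plusMonths(1)) {
            keys.add(partitionKey(userId, m));
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(bucketsBetween(222130151L,
                LocalDate.of(2014, 10, 15), LocalDate.of(2014, 12, 1)));
    }
}
```

Each bucket bounds the size of one partition, and a reader that wants changes since a given date walks the buckets from that month forward, one SELECT per bucket.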
Data model summary
• 7 tables for denormalization
• Normalize some tables because rare access
• Read-before-write in most update scenarios 😟
Notes on contact_id
• In SQL, auto-generated long using sequence
• In Cassandra, auto-generated timeuuid

Notes on contact_id
• How to store both types?
• As text? ☞ easy solution …
• … but waste of space!
• because encoded as UTF-8 or ASCII in Cassandra
Notes on contact_id
• Long ☞ 8 bytes
• Long as text (UTF-8: 1 byte per digit) ☞ "digits count" bytes

Notes on contact_id
• UUID ☞ 16 bytes
• 32 hex chars + 4 hyphens = 36 chars
• UUID as text (UTF-8: 1 byte per char) ☞ 36 bytes
• Bytes overhead = 36 - 16 = 20 bytes

Notes on contact_id
• 20 bytes wasted per contact uuid
• × 7 denormalizations = 140 bytes per contact uuid
• × 10⁹ contacts = 140 GB wasted 😠
not even counting replication factor …
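The arithmetic above can be checked with a few lines of Java. This is only a sketch of the calculation: the class and method names are hypothetical, and the figures (7 denormalized tables, 10⁹ contacts) are the ones quoted on the slides.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

/** Reproduces the "UUID as text vs. UUID as raw bytes" size comparison. */
final class ContactIdSizes {

    /** A UUID printed as text is 36 ASCII characters, 1 byte each in UTF-8. */
    static int uuidAsTextBytes(UUID id) {
        return id.toString().getBytes(StandardCharsets.UTF_8).length;
    }

    /** A UUID stored in binary form is two 64-bit halves. */
    static int uuidAsBinaryBytes() {
        return 2 * Long.BYTES;
    }

    /** Total overhead of the text form across all denormalized copies of all contacts. */
    static long wastedBytes(UUID sample, int denormalizedTables, long contacts) {
        long perCopy = uuidAsTextBytes(sample) - uuidAsBinaryBytes();
        return perCopy * denormalizedTables * contacts;
    }

    public static void main(String[] args) {
        UUID sample = UUID.randomUUID(); // any UUID prints as 36 characters
        System.out.println(uuidAsTextBytes(sample) + " bytes as text");
        System.out.println(uuidAsBinaryBytes() + " bytes as binary");
        System.out.println(wastedBytes(sample, 7, 1_000_000_000L) + " bytes wasted");
    }
}
```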
Notes on contact_id
• ☞ just save contact id as byte[]
• Achilles @TypeTransformer for automatic conversion (see later)
• Use blobAsBigint() or blobAsUuid() to view data
Achilles
• Advanced "object mapper"
• Fluent API
• Tons of features
• TDD friendly
Achilles
• Dirty checking, why is it important?
• 1 contact ≈ 8 mutable fields
• × 7 denormalizations = 56 update combinations …
• and not even counting multiple-field updates …

Achilles
• Are you going to manually generate 56+ prepared statements for all possible updates?
• Or just use dynamic plain string statements and take a performance penalty?
Achilles
• Dirty check in action
//No read-before-write
ContactEntity proxy = manager.forUpdate(ContactEntity.class, contactId);
proxy.setFirstName(…);
proxy.setLastName(…); //type-safe updates
proxy.setAddress(…);
manager.update(proxy);

Achilles
[Diagram: a Proxy wraps an empty entity that holds only the PrimaryKey; setters are intercepted and recorded in a DirtyMap]

Achilles
• Dynamic statements generation
UPDATE contacts SET firstname=?, lastname=?,address=? WHERE contact_id=?
prepared statements are cached, of course
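To make the dirty-checking idea concrete, here is a small Java sketch of the principle: record which setters were called, then generate an UPDATE that touches only those columns. It is a hand-written illustration, not the Achilles implementation; `DirtyContact` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy dirty-tracking wrapper: remembers modified columns, never reads the row. */
final class DirtyContact {
    private final Object contactId;
    private final Map<String, Object> dirty = new LinkedHashMap<>();

    DirtyContact(Object contactId) {
        this.contactId = contactId;
    }

    DirtyContact setFirstName(String v) { dirty.put("firstname", v); return this; }
    DirtyContact setLastName(String v)  { dirty.put("lastname", v);  return this; }
    DirtyContact setAddress(String v)   { dirty.put("address", v);   return this; }

    /** UPDATE statement covering only the columns whose setters were called. */
    String updateStatement() {
        if (dirty.isEmpty()) {
            throw new IllegalStateException("nothing to update");
        }
        List<String> assignments = new ArrayList<>();
        for (String column : dirty.keySet()) {
            assignments.add(column + "=?");
        }
        return "UPDATE contacts SET " + String.join(", ", assignments)
                + " WHERE contact_id=?";
    }

    /** Values in bind order: dirty columns first, then the primary key. */
    List<Object> boundValues() {
        List<Object> values = new ArrayList<>(dirty.values());
        values.add(contactId);
        return values;
    }
}
```

A caller would write `new DirtyContact(id).setFirstName("John").setAddress("Paris")` and get a two-column UPDATE. The generated string is also a natural key for a prepared-statement cache, which is presumably what "prepared statements are cached" refers to.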
Achilles
• Insert strategy, what is it?

Achilles
• Simple INSERT prepared statement
INSERT INTO contacts(contact_id,name,age,address,gender,avatar,…) VALUES(?, ?, ?, ? … ?);

Achilles
• Runtime values binding
• some columns are optional
preparedStatement.bind(49374, "John DOE", 33, null, null, …, null);
Achilles
Wait … are you saying inserting null in CQL??? 😳

Achilles
Inserting null ≡ creating tombstones
× 7 denormalizations
× billions of contacts created 😱
not even counting replication factor …
Achilles
• Simple annotation
@Entity(table = "contacts_by_id")
@Strategy(insert = InsertStrategy.NOT_NULL_FIELDS)
public class ContactById {
}
Achilles
• Runtime dynamic INSERT statement
INSERT INTO contacts(contact_id, name, age, address) VALUES(:contact_id, :name, :age, :address);
prepared statements are cached, of course
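The effect of a "not null fields" insert strategy can be sketched in plain Java: build the INSERT from the columns that actually have a value, so no null is ever bound and no tombstone is written for a missing field. This is an illustration of the idea only; `NotNullInsert` is a hypothetical helper and says nothing about how Achilles generates its statements internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Builds an INSERT that names only the columns carrying a non-null value. */
final class NotNullInsert {

    static String insertStatement(String table, Map<String, Object> columns) {
        List<String> names = new ArrayList<>();
        List<String> markers = new ArrayList<>();
        for (Map.Entry<String, Object> e : columns.entrySet()) {
            if (e.getValue() != null) {      // skip absent fields instead of binding null
                names.add(e.getKey());
                markers.add(":" + e.getKey());
            }
        }
        if (names.isEmpty()) {
            throw new IllegalArgumentException("no non-null column to insert");
        }
        return "INSERT INTO " + table + "(" + String.join(", ", names) + ") VALUES("
                + String.join(", ", markers) + ");";
    }

    public static void main(String[] args) {
        Map<String, Object> contact = new LinkedHashMap<>();
        contact.put("contact_id", 49374);
        contact.put("name", "John DOE");
        contact.put("age", 33);
        contact.put("address", null);        // optional field left empty
        System.out.println(insertStatement("contacts", contact));
    }
}
```

The trade-off is that each distinct combination of present columns yields a distinct statement, which is why caching the prepared statements matters.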
Achilles
• Remember the contactId ⇄ byte[] conversion?
@PartitionKey
@Column(name = "contact_id")
@TypeTransformer(valueCodecClass = ContactIdToBytes.class)
private ContactId contactId;
BYOC ☞ Bring Your Own Codec

Achilles
public interface Codec<FROM, TO> {
    Class<FROM> sourceType();
    Class<TO> targetType();
    TO encode(FROM fromJava);
    FROM decode(TO fromCassandra);
}
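As an illustration of what such a codec might look like, the sketch below implements the interface for an id that is either a legacy SQL long or a UUID. Everything here is an assumption made for the example: the `Codec` interface is re-declared locally so the code compiles on its own, the `ContactId` class is hypothetical, and the 8-byte / 16-byte big-endian layout is one plausible choice that matches what `blobAsBigint()` and `blobAsUuid()` expect. The real `ContactIdToBytes` class is not shown in the slides.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

/** Local copy of the interface shape shown on the slide. */
interface Codec<FROM, TO> {
    Class<FROM> sourceType();
    Class<TO> targetType();
    TO encode(FROM fromJava);
    FROM decode(TO fromCassandra);
}

/** Hypothetical id type: either a legacy long or a UUID, never both. */
final class ContactId {
    final Long legacyId;
    final UUID uuid;

    private ContactId(Long legacyId, UUID uuid) {
        this.legacyId = legacyId;
        this.uuid = uuid;
    }

    static ContactId ofLong(long id) { return new ContactId(id, null); }
    static ContactId ofUuid(UUID id) { return new ContactId(null, id); }
}

/** Encodes a long as 8 bytes and a UUID as 16 bytes; decodes by length. */
final class ContactIdToBytes implements Codec<ContactId, byte[]> {
    public Class<ContactId> sourceType() { return ContactId.class; }
    public Class<byte[]> targetType() { return byte[].class; }

    public byte[] encode(ContactId id) {
        if (id.uuid == null) {
            return ByteBuffer.allocate(8).putLong(id.legacyId).array();
        }
        return ByteBuffer.allocate(16)
                .putLong(id.uuid.getMostSignificantBits())
                .putLong(id.uuid.getLeastSignificantBits())
                .array();
    }

    public ContactId decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        if (bytes.length == 8) {
            return ContactId.ofLong(buf.getLong());
        }
        if (bytes.length == 16) {
            // arguments are evaluated left to right: most significant half first
            return ContactId.ofUuid(new UUID(buf.getLong(), buf.getLong()));
        }
        throw new IllegalArgumentException("unexpected contact id length: " + bytes.length);
    }
}
```

Because the two encodings have different lengths, the decoder can tell them apart without a type marker, which keeps the stored value at exactly 8 or 16 bytes.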
Achilles
• Dynamic logging in action
2014-12-01 14:25:20,554 Bound statement : [INSERT INTO contacts.contacts_by_modification_date(user_id,month_bucket,modification_date,...) VALUES (:user_id,:month_bucket,:modification_date,...) USING TTL :ttl;] with CONSISTENCY LEVEL [LOCAL_QUORUM]
2014-12-01 14:25:20,554 bound values : [222130151, 2014-12, e13d0d50-7965-11e4-af38-90b11c2549e0, ...]
2014-12-01 14:25:20,701 Bound statement : [SELECT birthday,middlename,avatar_size,... FROM contacts.contacts_by_modification_date WHERE user_id=:user_id AND month_bucket=:month_bucket AND (modification_date)>=(:modification_date) ORDER BY modification_date ASC;] with CONSISTENCY LEVEL [LOCAL_QUORUM]
2014-12-01 14:25:20,701 bound values : [222130151, 2014-10, be6bc010-6109-11e4-b385-000038377ead]
Achilles
• Dynamic logging
• runtime activation
• no need to recompile/re-deploy
• saved us hours of debugging
• TRACE log level ☞ query tracing
Take Away
Conditions for success
• Data modeling is crucial
• Double-run strategy & timestamp trick FTW
• Data type conversion can be tricky
• Benchmark!
• Mindset shifts for the team
Thank You