DESCRIPTION
Data modeling, cluster sizing, and planning can be difficult when transitioning an existing product to Cassandra. Especially when the new Cassandra deployment needs to handle millions of operations per second on day one! In this talk I'll discuss our strategy for data modeling, cluster sizing, and our novel approach to data replication across data centers.
Who We Are
• Holistic video advertising platform for publishers
• Most transparent global marketplace for sellers
• Founded in 2007, 180+ employees globally
• First to market with video RTB in 2010
• Integrated with over half of comScore top 100 publishers
• Integrated with 100,000+ publishers; connected to 35+ DSPs
• Partnerships with industry-leading trading desks; 10,000+ brand-name advertisers
• 2+ billion ad decisions per day
• Serving impressions in 100+ countries
• Reaching 335+ million uniques every month
How Big is Our Data?
● Over 2 billion ad auctions per day
● Each auction generates an average of 20-30 "records"
  ● Audience data
  ● Bid data
  ● Event tracking
● A "record everything" approach would result in approximately 50 billion records per day
  ● Normalized: ~1.5 TB / day uncompressed
  ● Denormalized: ~5 TB / day uncompressed
● Possibly up to 150 TB of data per month
● We are not currently using a "record everything" approach, but we want to get there
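A quick back-of-the-envelope check of those figures (a sketch; the per-record sizes are implied by the totals on the slide, not stated outright):

records_per_day = 50e9           # "record everything" volume
normalized_tb   = 1.5            # TB/day uncompressed
denormalized_tb = 5.0            # TB/day uncompressed

print(normalized_tb * 1e12 / records_per_day)    # ~30 bytes per normalized record
print(denormalized_tb * 1e12 / records_per_day)  # ~100 bytes per denormalized record
print(denormalized_tb * 30)                      # ~150 TB per month, matching the slide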
How Fast Does Our Data Grow?
[Chart: daily auctions from 10/2013 through 9/2014, growing from roughly 1 billion to about 2.5 billion per day]
● Typically our numbers double every 6 months
● We expect more rapid growth over the next year or two
[Chart: the same daily-auctions curve projected forward, on a scale of up to 12 billion per day]
How Big Might Our Data Get in a Year?
● Over 10 billion ad auctions per day
● Each auction generates an average of 30-40 "records"
  ● Audience data
  ● Bid data
  ● Event tracking
● A "record everything" approach would result in approximately 350 billion records per day
  ● Normalized: ~10.5 TB / day uncompressed
  ● Denormalized: ~35 TB / day uncompressed
● Possibly up to 1 PB of data per month
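The same per-record arithmetic scales to the year-out projection (a sketch, reusing the ~30 / ~100 bytes-per-record figures implied earlier):

records_per_day = 350e9                          # projected "record everything" volume
norm_tb   = records_per_day * 30 / 1e12          # ~10.5 TB/day normalized
denorm_tb = records_per_day * 100 / 1e12         # ~35 TB/day denormalized
print(norm_tb, denorm_tb, denorm_tb * 30 / 1000) # ~1.05 PB per month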
Excited to see how we’re using Cassandra for all this?
Too bad, we aren’t (yet)!
Where Do We Start?
Audience Data
● Information about the people that are viewing ads
  ● Segment data (demographics, browsing history, etc.)
  ● Ads viewed
  ● ID syncing
● Used by advertisers to reach their target audience
  ● "My product is relevant only to bald, left-handed, highly educated immigrants from Uzbekistan."
● Historically stored in cookies
● Technology advancement necessitates abandoning the cookie strategy
● Track users on multiple devices
  ● Mobile devices and connected TVs don't typically support cookies
● Offline availability of data provides analytics opportunities
  ● Discover trends
  ● Look-alike segments
Cookie-based Workflow (Browser ↔ SpotXchange ↔ Data Partner)
1. Browser requests an ad via HTTP
2. Server responds with an ad; the ad payload includes data partner URLs
3. Browser requests the partner URL; the request payload includes the partner's cookies
4. Data provider replies with a redirect containing segment information; the browser redirects to us
5. We respond with our own cookies containing their segment data
6. Browser requests an ad via HTTP, now including our cookies
7. Server responds with an ad targeted at audience segments
Moving Away from Cookies
● Cookies are overly constraining, and getting worse
  ● Limited to desktop traffic
  ● Payload is expensive
    ● Bandwidth
    ● Processing (encryption and encoding)
  ● Impossible to run deep analytics
  ● Impossible to perform server-to-server synchronization
● Newer identification standards are emerging
  ● Apple IDFA, Android ID, UIDH
  ● Facebook/Google ID
  ● Device fingerprinting
● Moving audience data onto the server allows data to be associated with any identifier, and even allows tying multiple identifiers together
Server-side Storage Workflow (Browser ↔ SpotXchange ↔ Data Partner)
1. Browser requests an ad via HTTP
2. Server responds with an ad; the ad payload includes data partner URLs
3. Browser requests the partner URL with the SpotX audience ID attached
4. Data provider replies with a redirect containing segment information and the partner audience ID; the browser redirects to us
5. We store the segment information on the server
6. Browser requests an ad via HTTP
7. Server responds with an ad targeted at audience segments
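A minimal sketch of the storage step (not SpotXchange's actual code; the URL shape, parameter names, and in-memory store are hypothetical stand-ins):

from urllib.parse import urlparse, parse_qs

AUDIENCE_STORE = {}  # hypothetical stand-in for the server-side (Cassandra-backed) store

def handle_partner_redirect(redirect_url, spotx_audience_id):
    # The partner's redirect carries segment info and their audience ID,
    # e.g. https://sync.spotx.example/cb?segments=123,456&partner_uid=abc
    params = parse_qs(urlparse(redirect_url).query)
    record = AUDIENCE_STORE.setdefault(
        spotx_audience_id, {"segments": set(), "foreign_ids": {}})
    if "segments" in params:
        record["segments"].update(params["segments"][0].split(","))
    if "partner_uid" in params:
        record["foreign_ids"]["partner"] = params["partner_uid"][0]
    return record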
Additional Capabilities (Browser ↔ SpotXchange ↔ Data Partner)
1. User visits a site that provides the partner new data about that user
2. Provider recognizes that they have synced this user with us in the past
3. Partner calls us server-to-server with the user information, including our ID and the new data
4. We store the new information
5. Browser requests an ad via HTTP
6. Server responds with an ad targeted at the new audience segments
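A similarly hedged sketch of the server-to-server path (the payload shape is an assumption): once the sync exists, no browser round-trip is needed.

AUDIENCE_STORE = {}  # hypothetical stand-in for the server-side store

def handle_partner_s2s_update(payload):
    # payload is assumed to look like:
    # {"spotx_audience_id": "1234...", "segments": ["123", "456"]}
    record = AUDIENCE_STORE.setdefault(
        payload["spotx_audience_id"], {"segments": set(), "foreign_ids": {}})
    record["segments"].update(payload["segments"])
    return record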
Storing Audience Data in Cassandra
● Data Modeling
● Cluster Sizing
● Replication Strategy
Data Modeling
● Solution must minimize latency
● Attempt to constrain to one read or one write per event whenever possible

{
    "audience_id" : "12345678-1234-1234-1234-123456789012",
    "segments" : {"123": 1, "456": 3, "789": 1},
    "foreign_ids" : {
        "7180" : "967992447104804725",
        "7347" : "bWv2-HOyJD8y6D",
        "6960" : "404_53e3bfa26d377"
    },
    "pacing" : {
        "2235" : 1412892591
    }
}
● Ad auctioning requires reading nearly all the data at once
● Most events write to one and only one data type (segments, ids, etc.)
● Store an entire user record in one row so it can be read all at once
● All data can be represented as a tuple with a unique identifier

CREATE TABLE audience_data (
    audience_id uuid,
    type int,
    key text,
    value text,
    PRIMARY KEY (audience_id, type, key)
);

SELECT * FROM audience_data WHERE
    audience_id = 12345678-1234-1234-1234-123456789012;

SELECT * FROM audience_data WHERE
    audience_id = 12345678-1234-1234-1234-123456789012 AND
    type = 1;

INSERT INTO audience_data (audience_id, type, key, value) VALUES
    (12345678-1234-1234-1234-123456789012, 1, '123', '1');
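A hedged sketch of how this schema might be used from the DataStax Python driver (the contact point and keyspace name are placeholders; this is not the code from the talk). Grouping all of a user's tuples under one partition key is what makes the single-read fetch possible.

from uuid import UUID
from cassandra.cluster import Cluster  # pip install cassandra-driver

cluster = Cluster(["127.0.0.1"])       # placeholder contact point
session = cluster.connect("audience")  # placeholder keyspace

read_all = session.prepare(
    "SELECT type, key, value FROM audience_data WHERE audience_id = ?")
write_one = session.prepare(
    "INSERT INTO audience_data (audience_id, type, key, value) "
    "VALUES (?, ?, ?, ?)")

aid = UUID("12345678-1234-1234-1234-123456789012")
session.execute(write_one, (aid, 1, "123", "1"))  # one write per event
record = {}
for row in session.execute(read_all, (aid,)):     # one read per ad auction
    record.setdefault(row.type, {})[row.key] = row.value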
Cluster Sizing
● Distributed a modified version of our implementation to production
  ● Replaced Cassandra calls with writes to a log file
● Created a spreadsheet detailing each operation and how much load to expect during peak times
● Used peak load to size the cluster for each data center
● Used the formula provided by Aaron Morton at The Last Pickle:

ops/sec = (system_constant * #cores * #nodes) / replication_factor

  ● ops = 1 read or write to one row (cluster in a partition)
  ● system_constant = 3000 for AWS
                      4000 for spinning disk
                      7-12K for SSD
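Solving that formula for the node count gives a small sizing helper (a sketch; the replication factor of 2 in the example is an assumption chosen because it reproduces the node counts in the sizing table later in this section, not a figure stated in the talk):

import math

def nodes_required(peak_ops, system_constant, cores, replication_factor):
    # ops/sec = system_constant * cores * nodes / replication_factor,
    # solved for nodes and rounded up
    return math.ceil(peak_ops * replication_factor / (system_constant * cores))

# den01 at peak migration load, 8-core nodes:
print(nodes_required(263_889, 4000, 8, 2))  # spinning disk -> 17
print(nodes_required(263_889, 7000, 8, 2))  # SSD -> 10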
Our Backwards Scenario
● Typically clusters start small and grow as product adoption grows
● Our cluster will be working hardest when we first turn it on
● Existing cookie data needs to migrate to Cassandra
● As data migrates, the load will decrease, normalize, and then increase slowly over the next few months
● Don't expect to match original load for nearly a year
[Chart: peak operations per second over time, on a scale of 0 to 140,000]
Sizing by data center (ops/sec):

                                     den01     iad02     lon01     hkg01
% of total traffic                     40%       40%       13%        7%
Normal tag rate                        0.1       0.1       0.1       0.1
Migration tag rate                    0.75      0.75      0.75      0.75
SELECT             DC Avg           46,296    46,296    15,046     8,102
                   DC Peak         138,889   138,889    45,139    24,306
                   FE Avg              126       263       684       675
                   FE Peak             377       789     2,052     2,025
UPDATE tag         DC Avg            4,630     4,630     1,505       810
(typical load)     DC Peak          13,889    13,889     4,514     2,431
                   FE Avg               13        26        68        68
                   FE Peak              38        79       205       203
UPDATE tag         DC Avg           30,093    30,093     9,780     5,266
(migration)        DC Peak          90,278    90,278    29,340    15,799
                   FE Avg               82       171       445       439
                   FE Peak             245       513     1,334     1,317
Total DC ops       Avg              51,389    51,389    16,701     8,993
(normal load)      Peak            154,167   154,167    50,104    26,979
Total DC ops       Avg              87,963    87,963    28,588    15,394
(migration)        Peak            263,889   263,889    85,764    46,181

Nodes required (8-core)  Constant
Spinning disk                4000       17        17         6         3
SSD                          7000       10        10         4         3
Storage by data center (GB):

                         den01    iad02    lon01    hkg01
% of total traffic         40%      40%      13%       7%
Tag       Daily GB         0.9      0.9      0.3      0.2
          Total GB          84       84       27       15
Frqcap    Daily GB         0.6      0.6      0.2      0.1
          Total GB         3.9      3.9      1.3      0.7
Partner   Daily GB           8        8        3        1
          Total GB       1,509    1,509      490      264
Total GB                 3,193    3,193    1,038      559
Per Node GB                456      456      346      186
Replication Strategy
● Typical Cassandra replication is expensive
  ● Each write is replicated to all data centers
  ● Each cluster must be approximately the same size
  ● Need a large pipe between data centers
● 3.7 million columns updated per second at peak load
● Amount of replication needed increases with each new data center
● Alternate strategies suggested:
  ● Offline copying of SSTables
  ● Maintain a log of changed records and run a process to copy those periodically
● We realized that this data doesn't need to be available in all places at all times
  ● People don't often move far enough to switch data centers
  ● Data integrity is of fairly low importance
  ● If our data isn't replicated, the user will appear to be new when they switch data centers, but that only has a minor short-term impact on application performance
● Other replication strategies we considered:
  ● None
  ● Just-in-time
  ● Queued
Replication Strategy: None
● Don't replicate at all
● Each data center has its own completely self-contained cluster
● Advantage: Simplicity
● Disadvantage: Limits our ability to target users when they move or we reassign regions to a different data center
Replication Strategy: Just-In-Time
● Each data center has its own completely self-contained cluster
● The user's identifier cookie contains a data center identifier
● When an incoming request's cookie says it's from a different data center, read from that data center in real time and replicate on the fly to the local data center
● Reassign the cookie using the new data center
● Advantage
  ● Audience data is (almost) always available (99.99%)
● Disadvantages
  ● Additional latency while waiting for user data
  ● In cookie-less situations we'd need to query all data centers if the local data center has no data
if (cookie != null) {
    audience_id = cookie[id]
    audience_dc_id = cookie[dc_id]
} else {
    audience_id = some other identifier
}

if (audience_dc_id == local_dc) {
    audience_data = local_dc->cassandra->fetch(audience_id)
} else {
    // Try the user's home data center if we know it; otherwise try them all
    other_dcs = audience_dc_id != null ? {audience_dc_id} : {dc1, dc2, dc3}
    for dc in other_dcs {
        audience_data = dc->cassandra->fetch(audience_id)
        if (audience_data != null) {
            // Replicate on the fly to the local cluster
            local_dc->cassandra->write(audience_id, audience_data)
            break
        }
    }
}
Replication Strategy: Queued
● Each data center has its own completely self-contained cluster
● When a fetch attempt misses, the user ID is added to a queue for reconciliation
● Treat the user as a new user and store their data locally
● Background process consumes IDs from the queue and attempts to fetch data from other data centers for reconciliation
● Advantages
  ● Audience data is mostly available (98%)
  ● Minimal additional latency introduced
● Disadvantages
  ● Additional operational complexity
  ● Occasional data misses
if (cookie != null) {
    audience_id = cookie[id]
    audience_dc_id = cookie[dc_id]
} else {
    audience_id = some other identifier
}

audience_data = local_dc->cassandra->fetch(audience_id)
if (audience_data == null) {
    // Miss: treat as a new user and queue the ID for background reconciliation
    local_dc->cassandra->queue_for_migration(audience_id, audience_dc_id)
}
audience_migrations = local_dc->fetch_from_queue()
for {audience_id, audience_dc_id} in audience_migrations {
    // Try the user's home data center if we know it; otherwise try them all
    other_dcs = audience_dc_id != null ? {audience_dc_id} : {dc1, dc2, dc3}
    for dc in other_dcs {
        audience_data = dc->cassandra->fetch(audience_id)
        if (audience_data != null) {
            local_dc->cassandra->write(audience_id, audience_data)
            break
        }
    }
}
THANK YOU
Andrew Kuttig
Send questions, deck requests, complaints, cat videos, and resumes to: akuttig@spotxchange.com