© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
November 30, 2016
How DataXu Scaled Its Attribution System to Handle Billions of Events per Day with Amazon DynamoDB
Padma Malligarjunan, AWS
Yekesa Kosuru, DataXu
Rohit Dialani, DataXu
Agenda
• Benefits of NoSQL
• Fully managed features of Amazon DynamoDB
• DynamoDB integration with AWS services
• DataXu’s DynamoDB use case
[Figure: Traditional SQL scales up a single DB (primary/secondary); NoSQL scales out across many DB nodes]
SQL (Relational) vs. NoSQL (Non-relational)
SQL (Relational)
[Figure, built up over four slides: a Products table with columns Product ID, Type, Price, and Desc. (rows 1 Book, 2 Album, 3 Movie) and a primary key index, linked to one table per product type:
• Books: Book ID, Title, Author, Date (1, Harry Potter…, JK Ro.., 2010)
• Albums: Album ID, Title, Artist (2, The Fox, Ylvis)
• Movies: Movie ID, Title, Genre, Director (3, Batman vs Super, Action, Zack Snyder)]
SQL (Relational) vs. NoSQL (Non-relational)
[Figure: the relational Products, Books, Albums, and Movies tables shown next to a single NoSQL Products table. In the NoSQL table each item has a primary key made of a partition key (Product ID) and a sort key (Type), followed by its own attributes; the schema is defined per item. Sample items:
• 1, Book ID: Harry Potter.., JK Rowling
• 2, Album ID: The Fox, Ylvis
• 3, Movie ID: Batman vs Super, Action, 2010, Zack Snyder
• 3, Movie ID:Actor ID: Ben Affleck]
NoSQL design optimizes for compute instead of storage
Why NoSQL?
SQL | NoSQL
Optimized for storage | Optimized for compute
Normalized/relational | Denormalized/hierarchical
Ad hoc queries | Instantiated views
Scale vertically | Scale horizontally
Good for OLAP | Built for OLTP at scale
DynamoDB Benefits
• Fully managed
• Fast, consistent performance
• Highly scalable
• Flexible
• Event-driven programming
• Fine-grained access control
Scaling High-Velocity Use Cases with DynamoDB
• Ad Tech
• Gaming
• Mobile
• IoT
• Web
Table and Item API
• Admin: Create Table, Update Table, Delete Table, Describe Table
• CRUD: Put/Get Item, Batch Put/Get Item, Update Item, Delete Item, Query, Scan
• DynamoDB Streams: ListStreams, DescribeStream, GetShardIterator, GetRecords
DynamoDB Streams
• Stream of updates to a table
• Asynchronous
• Exactly once
• Strictly ordered
  • Per item
• Highly durable
  • Scale with table
• 24-hour lifetime
• Sub-second latency
DynamoDB Streams and Amazon Kinesis Client Library
[Figure: a DynamoDB client application sends updates to a table (partitions 1 to 5); the table's stream (shards 1 to 4) is read by an Amazon Kinesis Client Library application running four KCL workers]
Triggers
• Lambda function
• Notify change
• Derivative tables
• Amazon Elasticsearch Service
• Amazon ElastiCache
Analytics with DynamoDB Streams
• Collect and de-dupe data in DynamoDB
• Aggregate data in memory and flush periodically
Performing real-time aggregation and analytics
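The aggregate-and-flush pattern on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: `StreamAggregator`, `record_id`, and the count-based `flush_every` threshold are hypothetical names, an in-memory set stands in for the de-duplication that the slide places in DynamoDB, and a real consumer would flush on a timer and write to a derivative table rather than a Python list.

```python
from collections import Counter


class StreamAggregator:
    """Hypothetical in-memory aggregator: drops duplicate records, counts
    events per key, and flushes after a fixed number of accepted records."""

    def __init__(self, flush_every, sink):
        self.flush_every = flush_every  # assumed threshold (records, not seconds)
        self.sink = sink                # callable that receives a dict of counts
        self.seen = set()               # record IDs already processed
        self.counts = Counter()
        self.pending = 0

    def add(self, record_id, key):
        if record_id in self.seen:      # duplicate delivery: ignore it
            return
        self.seen.add(record_id)
        self.counts[key] += 1
        self.pending += 1
        if self.pending >= self.flush_every:
            self.flush()

    def flush(self):
        if self.counts:
            self.sink(dict(self.counts))
        self.counts.clear()
        self.pending = 0


flushed = []
agg = StreamAggregator(flush_every=3, sink=flushed.append)
for record_id, key in [("r1", "click"), ("r1", "click"),
                       ("r2", "view"), ("r3", "click")]:
    agg.add(record_id, key)
```

The `seen` set grows without bound here; a longer-running consumer would need to expire old IDs.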
DataXu’s DynamoDB Use Case
• Who is DataXu
• Attribution Use Case
• Why DynamoDB
• Deployment Architecture
• Capacity & Performance
• Tips & Lessons Learned
DataXu
• Who
  • Spun out of MIT Labs
  • A petabyte-scale digital marketing platform
  • One of the fastest-growing companies in the Inc. 5000
• What
  • Help the world’s most valuable brands understand and engage with their consumers
  • Maximize ROI
Quick Statistics
• 2M+ bid requests per second
• Billions of impressions per month, petabytes of data
• ~10 ms round-trip response time
• 180+ TB of logs per day
• 2 PB of data analyzed
• 3000+ servers powering the platform
• 13 regions, 24x7
DataXu Reads and Writes on DynamoDB
[Charts: read/write capacity used per day, and read/write capacity used over time in 6-hour intervals]
Attribution
Attribution is the science of allocating credit from an activity/sale to the marketing touchpoints that a customer was exposed to prior to the purchase/activity.
Customer Journey
[Figure: Impression, Click, Impression, Online Purchase]
Generalized Event Chains
[Figure: a sequence such as A, I, E, I, A laid out along a time axis. Legend: I = Impression, E = Event, A = Activity]
• Billions of events and activities are organized into sequences.
• Events are correlated based on time and user to construct paths leading to an activity.
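The correlation step described above can be sketched as a group-by-user, sort-by-time pass. This is a minimal illustration, not DataXu's algorithm: the tuple layout, the `build_paths` name, and the rule that a path resets after each activity are assumptions made for the example.

```python
from collections import defaultdict


def build_paths(events):
    """Group (user, timestamp, kind) tuples by user, order them by time, and
    emit one path per activity: the events that led up to it."""
    by_user = defaultdict(list)
    for user, ts, kind in events:
        by_user[user].append((ts, kind))

    paths = []
    for user, items in by_user.items():
        current = []
        for ts, kind in sorted(items):
            current.append(kind)
            if kind == "A":          # an activity closes the current path
                paths.append((user, current))
                current = []         # assumed: later events start a new path
    return paths


# I = impression, E = event, A = activity (legend from the slide)
events = [
    ("u1", 3, "A"), ("u1", 1, "I"), ("u1", 2, "E"),
    ("u2", 5, "I"), ("u1", 4, "I"),
]
paths = build_paths(events)
```

Events that never reach an activity (the trailing impression for `u1`, everything for `u2`) produce no path in this sketch.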
Why DynamoDB
• Managed Service
  • Easy to use
  • Elastic scaling, no need to overprovision
  • API driven
• Fast & Predictable Performance (milliseconds)
  • Fast lookup/scan of user events
  • Consistent & predictable read/write performance
• TCO
  • Reasonable capex and no opex
DataXu Flows
[Figure: components shown: CDN; Real-Time Bidding; Retargeting; Platform; Streams (Amazon Kinesis); All Data (Amazon S3); ETL (Spark SQL); Attribution (MR); Machine Learning (Spark); Advanced Analytics (Third-Party); Reporting Tools (Third-Party)]
Ecosystem of tools and services
Attribution Engine
[Figure: components shown: Meta; Amazon EMR job; Amazon CloudWatch; DynamoDB; AWS Data Pipeline; 3rd Party; S3 Buckets; 1st Party; AWS Direct Connect; Amazon VPC; Amazon EC2; Amazon RDS; Amazon SNS; AWS IAM]
Inside DynamoDB: Events Table
User Events Table
• Put Item: Events
• Users Events_<month_1>: hash=userid, range=timestamp, <payload>
• Users Events_<month_2>: hash=userid, range=timestamp, <payload>
Property | Value
Storage | 25 TB
Avg. Record Size | 4 KB
1:N Relationship
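Per-month tables make it possible to retire old data by dropping a whole table, which the deck returns to under Lessons Learned. A minimal sketch of that bookkeeping follows; the `Events_YYYY_MM` naming scheme, the function names, and the window expressed in whole months are assumptions for illustration, and no DynamoDB calls are made.

```python
from datetime import date


def table_name(day, prefix="Events_"):
    """Hypothetical naming scheme: one table per calendar month."""
    return f"{prefix}{day.year:04d}_{day.month:02d}"


def tables_to_keep(today, window_months, prefix="Events_"):
    """Tables covering the current month and the previous
    window_months - 1 months."""
    names = []
    year, month = today.year, today.month
    for _ in range(window_months):
        names.append(table_name(date(year, month, 1), prefix))
        month -= 1
        if month == 0:               # wrap from January to December
            year, month = year - 1, 12
    return names


def tables_to_drop(existing, today, window_months, prefix="Events_"):
    """Monthly tables that have aged out of the attribution window."""
    keep = set(tables_to_keep(today, window_months, prefix))
    return sorted(n for n in existing if n.startswith(prefix) and n not in keep)


existing = ["Events_2016_09", "Events_2016_10", "Events_2016_11", "Users"]
expired = tables_to_drop(existing, date(2016, 11, 30), window_months=2)
```

Dropping a table removes every item in it at once, so no per-item delete traffic is needed for expired events.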
Events Table Schema
userid (partition key) | timestamp (sort key) | payload (attributes)
Userid-1 | epoch1 | ..
Userid-1 | epoch2 | ..
rLsWAQZU1C00TU5 | 1475624579321 | <Binary compressed>
rLsWAQZU1C00TU5 | 1477762942692 | <Binary compressed>
rLsWAQZU1C00TU5 | 1475624579695 | <Binary compressed>
rLsWAQZU1C00TU5 | 1475624579703 | <Binary compressed>
SS2U6KnX1BWziP5 | 1476829764673 | <Binary compressed>
[Row annotations in the figure: I, I, E, A]
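With userid as the partition key and timestamp as the sort key, "all events for one user in a time window" becomes a single keyed range lookup. The toy model below illustrates that access pattern in memory; it is not the DynamoDB API, and `EventsTable`, `put`, and `query` are hypothetical names. The user IDs and timestamps are taken from the sample rows, with placeholder payloads.

```python
import bisect
from collections import defaultdict


class EventsTable:
    """Toy in-memory stand-in for a table keyed by (userid, timestamp)."""

    def __init__(self):
        # userid -> list of (timestamp, payload), kept sorted by timestamp
        self._items = defaultdict(list)

    def put(self, userid, timestamp, payload):
        bisect.insort(self._items[userid], (timestamp, payload))

    def query(self, userid, start, end):
        """Items for one user with start <= timestamp <= end, in time order."""
        rows = self._items.get(userid, [])
        stamps = [ts for ts, _ in rows]
        lo = bisect.bisect_left(stamps, start)
        hi = bisect.bisect_right(stamps, end)
        return rows[lo:hi]


table = EventsTable()
for userid, ts in [
    ("rLsWAQZU1C00TU5", 1475624579321),
    ("rLsWAQZU1C00TU5", 1477762942692),
    ("rLsWAQZU1C00TU5", 1475624579695),
    ("rLsWAQZU1C00TU5", 1475624579703),
    ("SS2U6KnX1BWziP5", 1476829764673),
]:
    table.put(userid, ts, b"<payload>")   # placeholder payload bytes

window = table.query("rLsWAQZU1C00TU5", 1475624579000, 1475624580000)
```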
R/W Operations vs. R/W Capacity Units
What influences capacity units for your table?
• Item size: capacity is consumed in fixed-size units
  • 4 KB per read or 1 KB per write
• Read/write request rate: item Gets and Puts by your application
• Consistency: a strongly consistent read counts double an eventually consistent read
• Local Secondary Index: synchronized with the table
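The unit sizes above translate into a simple per-request cost. The sketch below assumes item sizes are rounded up to the next 4 KB (reads) or 1 KB (writes) and that an eventually consistent read costs half a strongly consistent one; the function names are illustrative and index overhead is ignored.

```python
import math


def read_capacity_units(item_size_bytes, strongly_consistent=True):
    """RCUs consumed by one read: size rounded up to 4 KB blocks, halved
    for an eventually consistent read."""
    units = max(1, math.ceil(item_size_bytes / 4096))
    return units if strongly_consistent else units / 2


def write_capacity_units(item_size_bytes):
    """WCUs consumed by one write: size rounded up to 1 KB blocks."""
    return max(1, math.ceil(item_size_bytes / 1024))


# A 5 KB item costs 2 RCUs to read (strongly consistent) and 5 WCUs to write.
rcu_5kb = read_capacity_units(5 * 1024)
wcu_5kb = write_capacity_units(5 * 1024)
```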
Capacity Planning: Unit of Scaling
• Partition:
  • Storage: 10 GB per partition
  • Compute: 3000 RCU or 1000 WCU per partition
• Partitions (for throughput) = (RCU/3000) + (WCU/1000)
• Partitions (for size) = Storage used in GB/10
• Total Number of Partitions = Ceiling(Max(Partitions (for throughput), Partitions (for size)))
  • e.g., Ceiling(Max(100/10, 9000/3000 + 3000/1000)) = 10
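The partition estimate can be written directly as a function. A minimal sketch, assuming the per-partition limits quoted on this slide (10 GB, 3000 RCU, 1000 WCU); `estimate_partitions` is an illustrative name.

```python
import math


def estimate_partitions(storage_gb, rcu, wcu):
    """Partition count per the slide: the larger of the throughput-driven
    and size-driven estimates, rounded up."""
    by_throughput = rcu / 3000 + wcu / 1000
    by_size = storage_gb / 10
    return math.ceil(max(by_throughput, by_size))


# Slide example: 100 GB, 9000 RCU, 3000 WCU
example = estimate_partitions(100, 9000, 3000)
```

In the slide's example the size term (100/10 = 10) exceeds the throughput term (3 + 3 = 6), so storage determines the partition count.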
Capacity Examples
Storage | Provisioned RCU | Provisioned WCU | Partitions | Reads/Sec/Partition | Writes/Sec/Partition
35 GB* | 1000 | 500 | 4 | 250 | 125
1000 GB* | 1000 | 500 | 100 | 10 | 5
100 GB* | 9000 | 3000 | 10 | 900 | 300
100 GB* | 90K | 30K | 60 | 1500 | 500
100 GB** | 9000 | 3000 | 10 | 450 | 60
* Item size of 1 KB or less
** Item size 5 KB
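The per-partition columns follow from dividing provisioned throughput evenly across partitions and then by the units each item consumes. A sketch under those assumptions (even distribution, strongly consistent reads, 4 KB read units and 1 KB write units); `per_partition_rates` is an illustrative name.

```python
import math


def per_partition_rates(rcu, wcu, partitions, item_size_kb=1):
    """Item reads and writes per second that each partition can serve."""
    rcu_per_item = math.ceil(item_size_kb / 4)   # 4 KB per read unit
    wcu_per_item = math.ceil(item_size_kb / 1)   # 1 KB per write unit
    reads = rcu / partitions / rcu_per_item
    writes = wcu / partitions / wcu_per_item
    return reads, writes


# Last row of the table: 9000 RCU, 3000 WCU, 10 partitions, 5 KB items
reads, writes = per_partition_rates(9000, 3000, 10, item_size_kb=5)
```

With 5 KB items each read costs 2 RCUs and each write costs 5 WCUs, which is why the last row drops from 900/300 to 450/60.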
Throttling
Storage | Provisioned RCU | Provisioned WCU | Partitions | Reads Per Partition | Writes Per Partition
100 GB | 9000 | 3000 | 10 | 900 | 300
900 Reads and 300 Writes Per Partition
Throttling kicks in above 900 reads and 300 writes per partition
Design Tips
• Understand Scaling
• Understand Hot Keys/Throttling
• Capture Application Metrics
• Configure Table Alarms
• Application Tuning for Outliers
• Retry w/Backoff
• DynamoDB Best Practices
  • http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/BestPractices.html
• AWS Service Limits
  • http://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html
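The "Retry w/Backoff" tip can be illustrated with a small exponential-backoff helper. This is a generic sketch, not code from the talk: `ThrottledError` is a hypothetical stand-in for whatever throttling error a client raises, and the delay values are arbitrary. AWS SDKs generally retry throttled requests on their own, so a helper like this mainly matters when tuning beyond the defaults.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error raised by a client."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=1.0, sleep=time.sleep):
    """Call operation(); on ThrottledError wait a random time up to
    base_delay * 2**attempt (capped at max_delay), then try again."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise                      # out of attempts: surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))


# Usage: an operation that is throttled twice before succeeding.
calls = {"n": 0}


def flaky_put():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ThrottledError()
    return "ok"


delays = []                                # record delays instead of sleeping
result = call_with_backoff(flaky_put, sleep=delays.append)
```

The random jitter spreads retries from many clients over time so they do not all hit the same partition again at once.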
Lessons Learned
• Reduce RCU and WCU
  • Combined Reads and Writes, Batch API
  • Combined multiple rows that share the same hash key into a single row (3X fewer puts)
  • LZ4 compression
• How do we handle Deletes?
  • Table rotation to match attribution windows
  • Drop entire table when it is no longer necessary
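The row-combining and compression ideas can be sketched together: events that share a hash key are packed into one compressed payload, so one put replaces several. This is an illustration only. The item layout and function names are assumptions, JSON is an arbitrary serialization choice, and `zlib` from the standard library stands in for LZ4, which the talk names but which needs a third-party package.

```python
import json
import zlib
from collections import defaultdict


def pack_events(events):
    """Group (userid, timestamp, data) tuples by userid and return one
    compressed item per user instead of one item per event."""
    grouped = defaultdict(list)
    for userid, timestamp, data in events:
        grouped[userid].append({"ts": timestamp, "data": data})

    items = []
    for userid, rows in grouped.items():
        rows.sort(key=lambda r: r["ts"])
        payload = zlib.compress(json.dumps(rows).encode("utf-8"))
        items.append({
            "userid": userid,
            "timestamp": rows[0]["ts"],   # assumed: earliest event as sort key
            "payload": payload,
        })
    return items


def unpack_payload(payload):
    """Inverse of the packing step: decompress and decode the event list."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))


events = [("u1", 2, "E"), ("u1", 1, "I"), ("u1", 3, "A"), ("u2", 7, "I")]
items = pack_events(events)               # 2 puts instead of 4
```

Fewer, smaller items reduce both the number of write requests and the capacity units each one consumes.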
Lessons Learned
• Dynamic scaling to large number of partitions takes time
• Debugging
• Application logging/metrics
• TCP dumps
• Turn on Request ID logging
• CloudWatch
• Local DynamoDB for testing
  • http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Tools.DynamoDBLocal.html