© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
DAT101 - Production NoSQL in an Hour:
Introduction to Amazon DynamoDB
“Amazon DynamoDB is the result of everything we’ve learned from building large-scale, non-relational databases for Amazon.com and building highly scalable and reliable cloud computing services at AWS.”
What is Amazon DynamoDB?
Design Philosophy
Flexible Data Model
Access and Query Model
• Two primary key options
– Hash key: key lookups: “Give me the status for user ‘abc’”
– Composite key (hash with range): “Give me all the status updates for user ‘abc’ that occurred within the past 24 hours”
• Support for multiple data types – String, number, binary… or sets of strings, numbers, or binaries
• Supports both strong and eventual consistency – Choose your consistency level when you make the API call
– Different parts of your app can make different choices
• Local Secondary Indexes
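The two access patterns and the per-call consistency choice can be sketched as request parameters for DynamoDB's low-level API. The table names (`UserStatus`, `StatusUpdates`) and attribute names here are hypothetical; a client such as boto3 would send these dicts via `get_item()` and `query()`.

```python
# Hypothetical tables: "UserStatus" (hash key only) and
# "StatusUpdates" (hash + range key). Attribute names are made up.

def status_lookup(user_id, strongly_consistent=True):
    """Hash-key lookup: 'Give me the status for user abc'."""
    return {
        "TableName": "UserStatus",
        "Key": {"UserId": {"S": user_id}},
        # Consistency is chosen per API call, not per table:
        "ConsistentRead": strongly_consistent,
    }

def recent_updates(user_id, since_epoch_s):
    """Composite-key query: all updates for a user since a timestamp."""
    return {
        "TableName": "StatusUpdates",
        "KeyConditionExpression": "UserId = :u AND CreatedAt >= :t",
        "ExpressionAttributeValues": {
            ":u": {"S": user_id},
            ":t": {"N": str(since_epoch_s)},  # numbers travel as strings
        },
    }
```

Different parts of an application can call `status_lookup` with `strongly_consistent=False` where stale reads are acceptable and pay half the read capacity.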
High Availability and Durability
I want to build a production-ready database…
This used to be the only way…
You Choose:
• Memory
• CPU
• Hard drive specs
• Software
• …
To get the database performance you want:
• Throughput rate
• Latency
• …
Provisioned Throughput Model
Tell us the performance you want
Let us handle the rest
Provisioned Throughput Model
Every DynamoDB table has:
• Provisioned write capacity
• Provisioned read capacity
• No limit on storage
Provisioned Throughput Model
Change your throughput capacity as needed
Pay for throughput capacity and storage used
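The provisioned throughput model can be sketched as the two requests involved: a CreateTable request that declares capacity instead of hardware, and an UpdateTable request that changes it later. The table and attribute names are hypothetical; a boto3 client would send these dicts via `create_table()` and `update_table()`.

```python
def table_definition(read_units, write_units):
    """CreateTable request: you declare throughput, not memory/CPU/disk."""
    return {
        "TableName": "UserStatus",  # hypothetical table
        "AttributeDefinitions": [
            {"AttributeName": "UserId", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "UserId", "KeyType": "HASH"}],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }

def throughput_update(read_units, write_units):
    """UpdateTable request: change capacity as needed; storage is unlimited
    and billed separately from throughput."""
    return {
        "TableName": "UserStatus",
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }
```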
Seamless Scalability
Change scale with the click of a button
Capacity Forecasting is Hard
When you run your own database, you need to:
• Try to forecast the scale you need
• Invest time and money learning how to scale your database
• React quickly if you get it wrong
Timid Forecasting:
Plan for a lot more capacity than you probably need
Benefits:
• Safety – you know you’re ready
Risks:
• Buy too much capacity
• Lose development resources to scale testing/planning
• Do more work than necessary
Aggressive Forecasting:
Cut it close! Plan for less capacity. Hope you don’t need more…
Benefits:
• Lower costs if all goes well
Risks:
• Last-minute scaling emergencies
• How does your database behave at an unexpected scale?
Typical Holiday Season Traffic at Amazon
[Chart: provisioned capacity vs. actual traffic; 76% of provisioned capacity sits unused, only 24% is used]
Reduce Costs by Matching Capacity to Your Needs
[Chart: actual traffic, the capacity we can provision with DynamoDB, and the capacity we needed before DynamoDB]
Reduce Forecasting Risk by using DynamoDB
What does DynamoDB handle for me?
Focus on building your app, not on running your database
Try it out!
aws.amazon.com/dynamodb
David R. Albrecht, Senior Engineer in Operations, Crittercism
November 13, 2013
Production NoSQL in an hour
Mobile application performance management
HTTP Requests
600 million devices
None of this adds differentiating business value.
#import "Crittercism.h"
[Crittercism enableWithAppID:@"<YOUR_CRITTERCISM_APP_ID>"];
[Crittercism setUsername:(NSString *)username];
Metadata: session id via usernames
we tried a lot of things
most of them failed
Our first attempt: sharded MongoDB on EC2
AZ 1
AZ 2
Shards: orange, apple, durian
Each shard:
• 2x m2.4xlarge, EBS optimized
• Gross: 2x 3200 GB
• Net: 1.6 TB, RAID 10
Cost:
• EBS standard: $704/mo
• EC2 compute: $2650/mo
Price floor: $1.45/GB-mo
But storage capacity wasn’t the problem!
Second attempt: Redis ring
Each shard:
2x m2.4xlarge
Gross: 2x 64 GB RAM
Net: 64 GB RAM
O(10k) iops performance
Cost:
EC2 compute: $2650/mo
Price floor: $41.45/GB-mo, but it is an ops nightmare.
[Diagram: master/slave pairs arranged in a consistent-hashing ring (Karger et al.)]
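The consistent hashing cited here (Karger et al.) is what a hand-rolled sharding scheme like this Redis ring typically implements. A minimal sketch, with made-up shard names matching the slide, of why it is attractive: adding or removing a shard only remaps the keys on the neighboring arc of the hash space.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring: each shard owns the arc of hash
    space up to its points; removing a shard only remaps the keys that
    were on that shard's points. This is a sketch, not production code,
    which is part of the lesson: don't roll your own sharding."""

    def __init__(self, shards, vnodes=64):
        # Place several virtual points per shard for smoother balance.
        self._ring = sorted(
            (self._h(f"{shard}:{i}"), shard)
            for shard in shards
            for i in range(vnodes)
        )
        self._points = [p for p, _ in self._ring]

    @staticmethod
    def _h(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def shard_for(self, key):
        # First ring point clockwise of the key's hash, wrapping around.
        i = bisect.bisect(self._points, self._h(key)) % len(self._ring)
        return self._ring[i][1]
```

The missing 90% is everything around this function: replication, failover, rebalancing data when the ring changes, and monitoring, which is exactly the operational burden the talk describes.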
Lesson: db scaling is 2D
[Chart: iops vs. capacity; RAM, SSD, and HDD each sit at a different point on the iops/capacity tradeoff]
A horizontally-scalable, tabular, indexed database with user-defined consistency semantics.
Benefit: Pay only for consumed capacity
Benefit: load spike insurance
Benefit: application-appropriate scaling (provision iops and capacity independently)
Benefit: no operational burden
Lessons learned
• Database scaling is a 2D problem
• Don't try to roll your own sharding scheme
• DynamoDB works for us.
100 Billion (with a B) Requests a Day with
Amazon DynamoDB
Valentino Volonghi, AdRoll Chief Architect
November 13, 2013
Pixel “fires”
Serve ad?
Ad served
If you can’t reply in 100ms… It doesn’t matter anymore!
But you really only get 40ms!
Network: 40ms, buffer: 20ms, processing: 40ms
Big picture slide
Data must be available all over the world!
7/2011 - ~50GB/day
4/2013 - ~5TB/day
10/2013 - ~20TB/day
What were our requirements?
Key-Value Store Requirements
• <10ms random key lookups with 100-byte values
• 5-10B items stored
• Scale up and down without performance hit
• ~100% uptime; this is money for us
• Consistent and sustained read/write throughput
Why DynamoDB instead of…
• HBase: hbck like rain, really hard to manage
• Cassandra: still immature when we needed it
• Redis: limited by available memory, no clustering
• Riak: great product, not fast enough for us
• MongoDB: not consistent write throughput
But the real reason…
They all require people to manage them!
And they all are hard to run in the cloud!
DynamoDB by Our Numbers
• 4 regions in use with live traffic replication
• 120B+ key fetches worldwide per day
• 1.5TB of data stored per region
• 30B+ items stored in each region
• <3ms uniform query latency, <10ms at the 99.95th percentile
What did we learn after all?
Batch operations as much as possible!
Query with GetItem – Update with UpdateItem
– Low write throughput
– Key splitting when exceeding max item size
– Write contention
(Layout: HashKey → KeyValue)
Query with Query – Update with BatchWriteItem
(Layout: HashAndRangeKey → KeyValue)
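The hash-and-range pattern can be sketched as the two requests involved: one Query that reads many items for a hash key, and one BatchWriteItem that writes up to 25 items per call. The table and attribute names (`UserEvents`, `UserId`, `EventTime`) are hypothetical; a boto3 client would send these dicts via `query()` and `batch_write_item()`.

```python
def query_user_items(user_id):
    """One Query reads every item sharing the hash key."""
    return {
        "TableName": "UserEvents",  # hypothetical hash + range table
        "KeyConditionExpression": "UserId = :u",
        "ExpressionAttributeValues": {":u": {"S": user_id}},
    }

def batch_write_events(user_id, events):
    """One BatchWriteItem writes up to 25 items. The range key
    (EventTime) spreads writes across many small items instead of
    contending on a single ever-growing item."""
    assert len(events) <= 25  # DynamoDB's per-batch limit
    return {
        "RequestItems": {
            "UserEvents": [
                {"PutRequest": {"Item": {
                    "UserId": {"S": user_id},
                    "EventTime": {"N": str(ts)},
                    "Payload": {"S": payload},
                }}}
                for ts, payload in events
            ]
        }
    }
```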
Tips for Optimum Performance
Properly balance your structures!
• Evenly distribute keys in the hash range
• All values should be about the same size
• Cache reads for a few seconds
• Buffer writes, when necessary
• Exponential back-off retries
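The exponential back-off tip can be sketched as a small retry wrapper. This is a simplified illustration: a real client would catch only throttling errors such as `ProvisionedThroughputExceededException`, whereas this sketch retries on any exception.

```python
import random
import time

def with_backoff(call, max_attempts=5, base_delay=0.05):
    """Retry a throttled call with exponential back-off plus jitter.
    Simplified: retries on any exception; a real client would retry
    only on DynamoDB throttling errors."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # Delays grow 0.05s, 0.1s, 0.2s, ... scaled by random jitter
            # so a fleet of clients doesn't retry in lockstep.
            time.sleep(base_delay * (2 ** attempt) * random.random())
```

Combined with the read cache and write buffer above, this keeps brief throughput spikes from turning into request failures.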
What do you mean you don’t care about the money?
Why do we pay so much for snacks again?
Snacks vs. DynamoDB
We have this huge database
Pretty much always available
And we barely know it’s there
Please give us your feedback on this presentation
As a thank you, we will select prize winners daily for completed surveys!
DAT101