Building High Performance Apps using NoSQLfiles.meetup.com/8763012/AWS_NoSQL Event_090513.pdf · Building High Performance Apps using NoSQL Swami Sivasubramanian General Manager,

Building High Performance Apps using NoSQL

Swami Sivasubramanian General Manager, AWS NoSQL

Building high performance apps

•  There is a lot to building high performance apps
   §  Scalability
   §  Performance at high percentiles
   §  Availability
•  Database choice has a disproportionate impact on these

History of Databases in Amazon

•  Amazon.com is a big startup
   §  Composed of thousands of services
   §  Each service is developed and operated independently
•  No central mandate or coordination (it slows down execution)
•  A vast ecosystem.. Survival of the fittest!
•  We have gone through multiple iterations on the following question:

What is the right database architecture for Amazon apps?

Relational Era

•  An Amazon.com page is composed of responses from 1000s of independent services
•  Query patterns differ from service to service
   §  Catalog service is usually heavy key-value
   §  Ordering service is very write intensive (key-value)
   §  Catalog search has a different pattern for querying
•  However, a relational database is usually the default database choice!

Relational Era (contd.)

•  How did it go?
   §  Poor availability
   §  Poor scalability (Q4/Christmas was a big project)
   §  Exorbitantly high costs for hardware, software and administration

Lessons

•  Relational databases are used even when they are not the right tool!
•  We didn’t need all the query capabilities an RDBMS provides
•  We need a database that can provide:
   §  Extreme availability
   §  Seamless scalability: no more re-architecture for planning
   §  Embrace failures and make them part of normal operations
   §  (Hey, this was the early 2000s, when these were not obvious)

Distributed Systems Era: Amazon Dynamo

Replicated DHT with consistency management
•  Consistent hashing
•  Optimistic replication
•  “Sloppy quorum”
•  Anti-entropy mechanisms
•  Object versioning
•  Specialist tool:
   §  Limited query capabilities
   §  Simpler consistency
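To illustrate the first item above, consistent hashing, here is a minimal Python sketch. It is not Dynamo's implementation: the class name `HashRing`, the node names, the use of MD5 and the figure of 64 virtual nodes per machine are all assumptions made for the example. It only shows how a key maps to an ordered list of distinct replica owners on a ring.

```python
import bisect
import hashlib


def _point(key: str) -> int:
    """Map a string to a position on the ring (MD5 is used only to spread keys)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical, for illustration)."""

    def __init__(self, nodes, vnodes=64):
        # Each physical node is placed on the ring at `vnodes` positions.
        self._ring = sorted(
            (_point(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def preference_list(self, key, replicas=3):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        start = bisect.bisect_right(self._points, _point(key))
        owners = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
            if len(owners) == replicas:
                break
        return owners


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
owners = ring.preference_list("cart:12345")
```

The property that matters for incremental scalability is that removing or adding one node only moves the keys that node owned; every other key keeps its first owner.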

Amazon Dynamo Usage

•  Higher availability
•  Incremental scalability
•  Lower costs
•  .. however, less query capability
•  Adopted by services for which scale and availability are most important
•  Dynamo inspired many other internal variants for distributed caching, messaging, etc.
•  Services that needed complex queries used RDBMS

Amazon Dynamo: Lessons learned

•  What could have been better?
   §  Lack of strong consistency
      •  Forced a model which may not fit every app
   §  We forced every engineer to learn distributed systems
      •  Version clocks, sloppy quorum, anti-entropy, cluster balancing, …
   §  Operational complexity
      •  Required each service to carry pagers
      •  Manage their fleet
      •  Deal with performance tuning

In the end, Amazon developers wanted Dynamo as a service not a product!

Cloud Era

•  The time when AWS was just starting
•  Developers loved:
   §  Amazon S3 for storage
   §  Amazon EC2 for compute
•  Why?
   §  Lets them focus on their app
   §  No need to deal with operations
•  They wanted the equivalent of S3 for databases
   §  Seamless scalability
   §  No operational overhead

Cloud Era: Amazon DynamoDB

Non-Relational

Fast & Predictable Performance

Seamless Scalability

Easy Administration

We built DynamoDB to make developers’ lives easier…

Where is Amazon.com right now?

•  NoSQL (DynamoDB) has been a huge central piece for Amazon
•  Most of the online workloads are using DynamoDB
•  No other solution meets our scale, availability and cost needs
•  We use other cloud databases too!
   §  RDS for relational workloads
   §  ElastiCache for caching
   §  Redshift for warehouse applications
   §  EMR for analytics

So much for Amazon.com, what about AWS and its customers?

State of NoSQL in AWS: Brief Recap

DynamoDB: Looking ahead

•  We will continue to invest in making sure DynamoDB continues to be
   §  Secure
   §  Extremely reliable
      •  Three datacenter replication
      •  Synchronous replication
      •  Extremely well tested replication pipeline
      •  No compromise on reliability for costs or performance
   §  Seamlessly scalable
   §  Cost effective
      •  Launched in April: 75% price drop for storage + 35% drop for throughput
      •  Reserved capacity options: 1-year = 53% discount; 3-year = 76% discount
      •  4KB read capacity units
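A rough sizing sketch for the 4KB read capacity unit mentioned above. It assumes the accounting described in the DynamoDB documentation rather than anything stated on the slide: one unit covers one strongly consistent read per second of an item up to 4KB, item sizes round up to the next 4KB, and eventually consistent reads cost half. The function name and parameters are hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024  # the 4KB read capacity unit from the slide


def read_units_needed(item_bytes, reads_per_second, strongly_consistent=True):
    """Estimate provisioned read capacity units for a uniform workload."""
    # Every read is billed in whole 4KB units, with a minimum of one unit.
    units_per_read = max(1, math.ceil(item_bytes / READ_UNIT_BYTES))
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        # Assumption: eventually consistent reads consume half a unit each.
        total = total / 2
    return math.ceil(total)


small_items = read_units_needed(item_bytes=3000, reads_per_second=100)
large_items = read_units_needed(item_bytes=6000, reads_per_second=100)
relaxed = read_units_needed(6000, 100, strongly_consistent=False)
```

Under these assumptions, an item just over 4KB costs twice as much to read as one just under it, so item size is worth checking before provisioning throughput.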

DynamoDB: Looking ahead (contd.)

•  We will continue to improve query capabilities
   •  Launched local secondary indexes (April 2013)
   •  Launched Parallel Scan API (May 2013)
   •  Launched geospatial indexing library today!
   •  Lots more to come..
•  We will continue to reduce your operational overhead
   §  Example: Dynamic DynamoDB, autoscale-dynamodb, etc.
•  We will continue to integrate with other AWS services seamlessly
   §  EMR integration
   §  One-click copy to Redshift (Feb 2013)
   §  Data Pipeline template to backup/restore (Mar 2013)
   §  More to come..

ElastiCache

•  Managed caching service
•  Offered Memcached as a service
•  Added Redis support yesterday!
•  Look out for more caching features here..!

Run your own database on EC2

•  There is a rich ecosystem of NoSQL solutions on EC2
   §  MongoDB
   §  Cassandra
   §  Riak
   §  Graph databases
   §  …
•  Pick the right solution based on your needs.

Getting back to original question…

How do I choose the right database for my app?

So many choices, what to pick?

Choose the right tool for each job.

Redux..

•  Decision point #1: Optimize query patterns
•  Decision point #2: Plan for (business) success
•  Decision point #3: Plan for (infrastructure) failures
•  Decision point #4: What is the operational expense for my pick?

Decision #1: Choosing right query patterns

•  Understand your app’s query patterns carefully
•  Identify which queries need to scale linearly with growth in user base
•  For those queries, pick a database architecture that scales linearly
   §  Performance should be the same for a 10MB table, a 10GB table or a 10TB table
•  If your database does not grow with your business growth
   §  You are signing up for operational hell
   §  Don’t treat sharding as an afterthought

Decision #1: Choosing right query patterns (contd.)

•  Separate query patterns carefully
   §  The interactive parts of your app need to perform well and scale
   §  Avoid non-scalable queries in the interactive user workflow
•  Good real-time query
   §  Example: Load user preferences, set user preferences
•  Bad real-time query
   §  Example: Compute all friends of friends for user A who are interested in X
•  For complex queries, pre-compute the results and store them in a cache
   §  Example: Compute recommendations for user A and store them in a cache
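The pre-compute-and-cache advice above can be sketched as follows. This is a minimal illustration, not a prescribed design: a plain dict stands in for a real cache such as Memcached or Redis, and `RecommendationCache` and `slow_recommendations` are hypothetical names.

```python
class RecommendationCache:
    """Keeps expensive results out of the interactive request path."""

    def __init__(self, compute_fn):
        self._compute = compute_fn
        self._store = {}  # stand-in for an external cache (assumption)

    def refresh(self, user_id):
        """Run from a batch job or background worker, never per request."""
        self._store[user_id] = self._compute(user_id)

    def get(self, user_id, default=()):
        """Interactive path: one key lookup, no heavy computation."""
        return self._store.get(user_id, default)


calls = []


def slow_recommendations(user_id):
    """Placeholder for a complex query (e.g. friends-of-friends)."""
    calls.append(user_id)
    return [f"item-for-{user_id}-{n}" for n in range(3)]


cache = RecommendationCache(slow_recommendations)
cache.refresh("user-A")          # offline pre-computation
first = cache.get("user-A")      # real-time read
second = cache.get("user-A")     # served again without recomputing
```

The real-time query becomes a key-value read, which is the "good" pattern from the slide; a user with no pre-computed entry gets the default instead of triggering the slow query.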

Optimize Query Patterns

•  For time series data
   §  Separate cold data from hot data
   §  This lets you separate the read-heavy workload from the write-heavy workload
•  Example:
   §  An ordering application is a great example of time series data
   §  The past few days’ orders are “hot”
   §  6-month-old orders are cold
•  Recommendation:
   §  Create an ordering table every week
   §  Store recent orders in this week’s table
   §  Archive the old tables or dial down their read throughput
   §  You can query across tables
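One possible way to implement the table-per-week recommendation is to derive the table name from the order date. The naming scheme (`orders_<ISO year>_w<ISO week>`) and both function names are assumptions for illustration; the slides do not prescribe a format.

```python
from datetime import date, timedelta


def orders_table_for(day: date, prefix: str = "orders") -> str:
    """Return the name of the weekly table that holds orders placed on `day`."""
    iso_year, iso_week, _ = day.isocalendar()
    return f"{prefix}_{iso_year}_w{iso_week:02d}"


def tables_for_range(start: date, end: date, prefix: str = "orders"):
    """List the weekly tables a query spanning [start, end] has to touch."""
    names = []
    day = start
    while day <= end:
        name = orders_table_for(day, prefix)
        if not names or names[-1] != name:
            names.append(name)
        day += timedelta(days=1)
    return names


this_week = orders_table_for(date(2013, 9, 5))
span = tables_for_range(date(2013, 9, 1), date(2013, 9, 9))
```

Writes always go to the current week's table, so older tables can have their throughput dialed down or be archived without touching the hot path. Using the ISO year avoids splitting the week that straddles New Year across two tables.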

Decision #2: Plan for success

•  Understand scale needs
   §  Talk to your CFO/product visionary/business owner
   §  What does success look like?
   §  Don’t postpone tough decisions until you are successful
   §  Re-architecting while dealing with growth is a pain
•  Pick query flexibility vs. scalability carefully
   §  Don’t take shortcuts
   §  Plan to sleep well for the other 51 weekends

Decision #2: Plan for success (contd.)

•  Test for scale
   §  You will find strange bottlenecks in these tests
      •  Connection timeouts
      •  Cluster reconfiguration issues
      •  Load balancing..
   §  Test how the system scales
      •  More throughput capacity (for DynamoDB)
      •  More cache nodes (for ElastiCache)
      •  More EC2 instances (for run-your-own databases)

Decision #3: Plan for failure

•  Do not treat failure as a special case
•  Replication and redundancy are key!
•  Pick replication technology carefully
   §  Synchronous vs. asynchronous
      •  Hint: If you care about your data, pick synchronous replication
   §  Multi-AZ vs. Single-AZ
      •  Hint: If you care about availability, pick Multi-AZ replication
•  Pick replication factor carefully
   §  Two is a terrible number in distributed systems
   §  Three is better (and is not a crowd)
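One way to see why two is a terrible number: assume writes need acknowledgement from a strict majority of replicas. That quorum model is an assumption made here for illustration; the slides do not say which scheme is meant. The arithmetic is short enough to check in Python.

```python
def majority(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """Replicas that can be lost while a majority is still reachable."""
    return replicas - majority(replicas)


summary = {n: (majority(n), tolerated_failures(n)) for n in (1, 2, 3, 5)}
```

With two replicas the majority is both of them, so the pair tolerates no failures, the same as a single copy, while doubling the hardware. Three replicas is the smallest count that survives the loss of one.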

Decision #3: Test for failures..

•  Plans are only good intentions ..
•  In DynamoDB, we test for failures
   §  Unit tests
   §  Mock tests
   §  Cluster tests
   §  Performance tests
   §  Datacenter failure tests
   §  Network degradation tests
   §  Dependency failure tests
•  Also, we use a strong theoretical foundation when necessary..
•  Fault injection testing is key!
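A minimal sketch of what fault injection can look like at the unit-test level. This does not describe how DynamoDB is tested internally; `FaultInjectingStore` and `with_retries` are hypothetical names, and the "fail the first N calls" schedule is just one simple, deterministic way to inject faults.

```python
class FaultInjectingStore:
    """In-memory store that fails its first N calls with a ConnectionError."""

    def __init__(self, fail_first_n_calls=0):
        self._data = {}
        self._remaining_faults = fail_first_n_calls
        self.calls = 0

    def _maybe_fail(self):
        self.calls += 1
        if self._remaining_faults > 0:
            self._remaining_faults -= 1
            raise ConnectionError("injected fault")

    def put(self, key, value):
        self._maybe_fail()
        self._data[key] = value

    def get(self, key):
        self._maybe_fail()
        return self._data[key]


def with_retries(operation, attempts=3):
    """Call `operation`, retrying on ConnectionError up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller


store = FaultInjectingStore(fail_first_n_calls=2)
with_retries(lambda: store.put("order-1", "paid"))   # succeeds on the 3rd call
value = with_retries(lambda: store.get("order-1"))
```

A test built this way exercises the failure path on every run instead of waiting for a real outage, which is the point of treating failure as normal operation.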

Decision #4: What is the operational overhead?

•  Understand the operational costs of your app
•  Don’t underestimate the cost of
   §  Managing hardware
   §  Maintaining and patching software
   §  Configuring and maintaining multi-AZ replication
   §  Repeated game days and hardware upgrades
   §  Optimizing costs
   §  Operations staff
•  If a cloud service works for you and meets your needs (#1 to #3), great!
•  If not, run it on your own, but plan accordingly.

Simple rule of thumb..

•  When you need seamless scale and super availability: DynamoDB
•  Complex query workloads that need relational capabilities
   §  Choose Amazon RDS
   §  Usually MySQL is a good choice
•  Caching
   §  ElastiCache - Memcached for scaling key-value
   §  ElastiCache - Redis for advanced data structures
•  For data warehousing: choose Amazon Redshift
•  Cases where these services are not the right fit: build your own on EC2!

Thank you!

[email protected]
