Fast-growing startups building high-scale applications demand a lot from their infrastructure, and in particular from their databases. Databases often become the bottleneck of a startup's technology stack and can inhibit fast growth, because they are not easy to set up, operate and scale in the cloud. This webinar focuses on how to build scalable databases in the cloud and covers how to effectively combine relational, NoSQL, and even data warehouse databases, which have become a reality for startups with the launch of Amazon Redshift.
Key takeaways: Understand the trade-off between SQL and NoSQL and when to go for a hybrid model. Learn best practices for setting up your database in the AWS cloud, whether using managed services or managing it yourself. Learn how to minimize the costs of your database with the right architecture and pricing models.
Who should attend: DBAs, startup CTOs, developers, engineers, architects, growth hackers.
Text of AWS Activate webinar - Scalable databases for fast growing startups
Scalable Databases for Fast Growing Startups
Blair Layton, Business Development Manager, Database Services, Amazon Web Services - APAC
Agenda
Self-Managed or Managed Database Services?
NoSQL or Relational?
Performance Tips and Tricks
How to scale from 1 to 10,000,000 users?
How do I save money?
Summary
Q&A
Why Managed Databases?
Chart (shares shown: 25%, 40%, 5%, 5%) with labels: backup & recovery, data load & unload; performance tuning; scripting & coding; security planning; install, upgrade, patch and migrate; documentation, licensing & training
If You Host Your Databases On-premises
You handle: Power, HVAC, net; Rack & stack; Server maintenance; OS patches; DB s/w patches; Database backups; Scaling; High availability; DB s/w installs; OS installation; App optimization
If You Host Your Databases in EC2
You handle: OS patches; DB s/w patches; Database backups; Scaling; High availability; DB s/w installs; App optimization
AWS handles: Power, HVAC, net; Rack & stack; Server maintenance; OS installation
If You Choose a Managed Database Service
You handle: App optimization
The service handles: Power, HVAC, net; Rack & stack; Server maintenance; OS patches; DB s/w patches; Database backups; High availability; DB s/w installs; OS installation; Scaling
differentiated effort increases the uniqueness of an
application
AWS Database Services: Scalable High Performance Application Storage in the Cloud
Amazon RDS, Amazon DynamoDB, Amazon Redshift, Amazon ElastiCache
Diagram layers: Compute, Storage, Networking, Database, Application Services, Deployment & Administration, AWS Global Infrastructure
Amazon RDS
Relational Databases
Fully managed; zero admin
MySQL, Oracle, Postgres, SQL Server
Trillions of I/O requests/month
Flipboard relies on Amazon RDS
Flipboard is an online magazine with millions of users and billions of flips per month
Uses Amazon RDS and its Multi-AZ capabilities to store mission-critical user data
"We were able to go from concept to delivered product in about six months with just a handful of engineers." - Greg Scallan, Chief Architect, Flipboard
Key Features
Manageability: Rapid deployment with pre-configured parameters; Patch Management; Monitoring and Metrics
Availability and Data Durability: Automated Backups and Point-In-Time-Recovery; DB Snapshots; Automatic Host Replacement (Single-AZ); Multi-AZ deployments
Scalability: Push-Button Scaling (Storage, Memory and Compute); Read Replicas
RDS for Production Workloads
Table: Amazon RDS configuration options (Push-Button Scaling, Multi-AZ, Read Replicas, Provisioned IOPS) against goals (Improve Availability, Increase Throughput, Reduce Latency)
Diagram: a Region with two Availability Zones, showing Multi-AZ, Read Replicas, Push-Button Scaling and Provisioned IOPS
Amazon ElastiCache
In-Memory Cache
Elastic and reliable
Memcached or Redis
Fully managed; zero admin
ElastiCache: Fully Managed Cache Service
Easy to Deploy: Deploy a master-slave(s) configuration with a few button clicks or API calls
Easy to Migrate: Compatible with Memcached or Redis; existing code will work when you update node endpoints
Easy to Administer: ElastiCache automatically replaces failed nodes and patches software as needed; CloudWatch enables you to monitor cache performance metrics
Easy to Secure: Supports VPC and Security Group configurations
Easy to Scale: Provides assisted scale-up and scale-out capability
Application Server Hot Items Small, frequently-accessed items
are ideal candidates for read caching Reduce server-side latency to
3TB of storage on RDS
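The read-caching idea for hot items (check the cache first, fall back to the database on a miss) can be sketched roughly as follows. This is a minimal illustration only: `HotItemCache` and `slow_database_read` are hypothetical names, and a plain in-process dictionary stands in for a Memcached or Redis node.

```python
import time
from typing import Callable, Dict, Tuple


class HotItemCache:
    """In-process stand-in for a Memcached/Redis-style cache with TTLs (hypothetical)."""

    def __init__(self, loader: Callable[[str], str], ttl_seconds: float = 60.0) -> None:
        self._loader = loader  # function that reads the item from the database
        self._ttl = ttl_seconds
        self._items: Dict[str, Tuple[float, str]] = {}  # key -> (expiry time, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        now = time.monotonic()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1  # served from memory, no database round trip
            return entry[1]
        # Miss or expired entry: read from the database and repopulate the cache.
        self.misses += 1
        value = self._loader(key)
        self._items[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after the underlying row changes."""
        self._items.pop(key, None)


def slow_database_read(key: str) -> str:
    # Placeholder for a real query against the database.
    return f"row-for-{key}"


cache = HotItemCache(slow_database_read, ttl_seconds=30.0)
first = cache.get("user:42")   # miss: goes to the "database"
second = cache.get("user:42")  # hit: served from the cache
```

The TTL and explicit `invalidate` are two common ways to limit how stale a cached item can get; which one fits depends on how often the hot items change.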
NoSQL or Relational?
Spectrum of Database Options
Diagram labels: SQL, NoSQL; Do-it Yourself, Fully Managed; Low Cost, High Cost; "Not available on AWS"
Thinking About the Questions
Should I use SQL or NoSQL?
Should I use MySQL or PostgreSQL?
Should I use Redis, Memcached, or ElastiCache?
Should I use MongoDB, Cassandra, or DynamoDB?
Actually, Thinking About the Right Questions
What are my scale and latency needs?
What are my transactional and consistency needs?
What are my read/write, storage and IOPS needs?
What are my time to market and server control needs?
Factors to Consider
Factor | SQL | NoSQL
Application | App with complex business logic? | Web app with lots of users?
Transactions | Complex transactions, joins, updates? | Simple data model, updates, queries?
Scale | Developer managed | Automatic, on-demand scaling
Performance | Developer architected | Consistent, high performance at scale
Availability | Architected for fail-over | Seamless and transparent
Core Skills | SQL + Java/Ruby/Python/PHP | NoSQL + Java/Ruby/Python/PHP
Best of both worlds: Possible to Use SQL and NoSQL models in one App
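As a rough sketch of using both models in one app, the example below keeps transactional data (users and orders, queried with a join) in a relational store and keeps simple session lookups in a key-value store. It is only an illustration under stated assumptions: SQLite stands in for a relational database such as one hosted on RDS, a Python dictionary stands in for a key-value service such as DynamoDB, and the table names, keys and helper functions are hypothetical.

```python
import sqlite3

# Relational side: transactional data that benefits from joins and constraints.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "user_id INTEGER NOT NULL REFERENCES users(id), "
    "total_cents INTEGER NOT NULL)"
)

with db:  # commits on success, rolls back if an exception is raised
    db.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
    db.execute("INSERT INTO orders (user_id, total_cents) VALUES (1, 1250)")
    db.execute("INSERT INTO orders (user_id, total_cents) VALUES (1, 800)")

# Key-value side: simple data model, looked up by a single key.
sessions = {}


def save_session(session_id, data):
    sessions[session_id] = dict(data)


def load_session(session_id):
    return sessions.get(session_id)


def order_total_for_session(session_id):
    """Resolve the session by key, then run a relational join for that user."""
    session = load_session(session_id)
    if session is None:
        return None
    return db.execute(
        "SELECT u.name, SUM(o.total_cents) "
        "FROM users u JOIN orders o ON o.user_id = u.id "
        "WHERE u.id = ? GROUP BY u.id",
        (session["user_id"],),
    ).fetchone()


save_session("sess-abc", {"user_id": 1, "cart_items": 2})
summary = order_total_for_session("sess-abc")
```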
Performance Tips and Tricks
Performance Tips and Tricks
Understand your workload: Read:Write ratio, I/O requirements, CPU requirements
Identify bottlenecks: CPU, Memory, Disk I/O, Network latency/bandwidth; use CloudWatch and OS metrics
Choose the right instance type: High CPU, High Memory, High Storage, etc.
Understand EBS!
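One simple way to start understanding a workload is to turn raw read and write counters into a read:write ratio and an operations-per-second figure. The sketch below assumes you already have such counters from your own metrics; `WorkloadSample` and `summarize` are hypothetical names, and the numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass
class WorkloadSample:
    reads: int               # read operations observed in the interval
    writes: int              # write operations observed in the interval
    interval_seconds: float  # length of the observation window


def summarize(sample: WorkloadSample) -> dict:
    """Derive basic workload figures from one sample of counters."""
    total = sample.reads + sample.writes
    return {
        "read_fraction": sample.reads / total if total else 0.0,
        "reads_per_write": sample.reads / sample.writes if sample.writes else float("inf"),
        "ops_per_second": total / sample.interval_seconds,
    }


stats = summarize(WorkloadSample(reads=9000, writes=1000, interval_seconds=60.0))
```

A read-heavy result points toward caching and read replicas, while a write-heavy result points toward the storage and write-scaling options discussed elsewhere in the deck.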
Amazon EBS Magnetic
Amazon Elastic Block Store (EBS)
IOPS: ~100 IOPS steady-state, with best-effort bursts to hundreds. 40-200 IOPS in terms of variability.
Throughput: variable by workload,
best effort to 10s of MB/s. Latency: Varies, reads typically 100
First, let's separate our single host into more than one: Web and Database. Use RDS to make your life easier.
Diagram: User, Amazon Route 53, Elastic IP, Web Instance, RDS DB Instance
Users > 1000
Next, let's address our lack of failover and redundancy issues:
Elastic Load Balancing
Another web instance in another Availability Zone
Enable Amazon RDS Multi-AZ
Diagram: User, Amazon Route 53, Elastic Load Balancing, two Availability Zones, each with a Web Instance; RDS DB Instance Active (Multi-AZ) in one zone and RDS DB Instance Standby (Multi-AZ) in the other
Users > 10Ks-100Ks
Diagram: User, Amazon Route 53, Elastic Load Balancing, two Availability Zones, eight Web Instances, four RDS DB Instance Read Replicas, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ)
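With read replicas in place, the application has to decide which endpoint each statement goes to. The following is a minimal sketch of that routing, assuming a single writable primary and a list of replica endpoints; `ReplicaRouter` and the endpoint names are hypothetical, and classifying a statement as a read by its leading `SELECT` keyword is a deliberate simplification.

```python
import itertools
from typing import Iterable


class ReplicaRouter:
    """Send writes to the primary and spread reads across replicas round-robin."""

    def __init__(self, primary: str, replicas: Iterable[str]) -> None:
        self.primary = primary
        replica_list = list(replicas)
        # itertools.cycle repeats the replica list forever, giving round-robin order.
        self._replicas = itertools.cycle(replica_list) if replica_list else None

    def endpoint_for(self, sql: str) -> str:
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        # Writes, and reads when no replica exists, go to the primary.
        return self.primary


router = ReplicaRouter("primary-endpoint", ["replica-1", "replica-2"])
targets = [
    router.endpoint_for("SELECT * FROM users"),
    router.endpoint_for("select name from users"),
    router.endpoint_for("UPDATE users SET name = 'x'"),
    router.endpoint_for("SELECT 1"),
]
```

Because replicas can lag behind the primary, reads that must see a write made a moment ago are usually sent to the primary instead.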
This will take us pretty far, honestly, but we care about performance and efficiency, so let's clean this up a bit.
Shift Some Load Around
Let's lighten the load on our web and database instances:
Move static content from the web instance to Amazon S3 and CloudFront
Move dynamic content from Elastic Load Balancing to CloudFront
Move session/state and DB caching to ElastiCache or DynamoDB
Diagram: User, Amazon Route 53, Amazon CloudFront, Amazon S3, Elastic Load Balancing, Web Instance, ElastiCache, Amazon DynamoDB, RDS DB Instance Active (Multi-AZ), Availability Zone
Users > 500K+
Diagram: User, Amazon Route 53, Amazon S3, Amazon CloudFront, Elastic Load Balancing, DynamoDB, two Availability Zones, six Web Instances, two ElastiCache nodes, two RDS DB Instance Read Replicas, RDS DB Instance Active (Multi-AZ), RDS DB Instance Standby (Multi-AZ)
From 500K to 1 Million Users
Getting serious now
Significant user base
Plenty of attention if things go wrong
Interesting phase for startups with funding rounds
Time to make some radical improvements at the web & app layers
SOAing Move services into their own tiers or modules. Treat
each of these as 100% separate pieces of your infrastructure and
scale them independently. Use queues! Amazon.com and AWS do this
extensively! It offers flexibility and greater understanding of
each component.
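The "use queues" advice can be illustrated with a small producer and worker sketch: the web tier puts jobs on a queue and separate workers drain it at their own pace, so each side can be scaled independently. This is a local illustration only, with Python's in-process `queue.Queue` standing in for a hosted queue such as Amazon SQS; `run_workers` and the job handler are hypothetical.

```python
import queue
import threading
from typing import Callable, Iterable, List


def run_workers(jobs: Iterable[int], handler: Callable[[int], int], worker_count: int = 2) -> List[int]:
    """Process jobs through a queue with a pool of worker threads."""
    work: "queue.Queue" = queue.Queue()
    results: List[int] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            job = work.get()
            try:
                if job is None:  # sentinel value: no more work for this worker
                    return
                outcome = handler(job)
                with lock:  # protect the shared results list
                    results.append(outcome)
            finally:
                work.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for job in jobs:       # producer side: enqueue and move on
        work.put(job)
    for _ in threads:      # one sentinel per worker to shut it down
        work.put(None)
    work.join()
    for thread in threads:
        thread.join()
    return results


squares = sorted(run_workers([1, 2, 3, 4], lambda n: n * n))
```

Results arrive in whatever order the workers finish, which is why the example sorts them before use.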
Users > 1 Million
Diagram: User, Amazon Route 53, Amazon S3, Amazon CloudFront, Elastic Load Balancer, Availability Zone, four Web Instances, two Worker Instances, two Internal App Instances, RDS DB Instance Active (Multi-AZ), two RDS DB Instance Read Replicas, Amazon DynamoDB, Amazon SQS, ElastiCache, Amazon CloudWatch, Amazon SES
The next big steps
From 5 to 10 Million Users
You may start to run into issues with your database around contention on the write master. How can you solve it?
Federation (splitting into multiple DBs based on function)
Sharding (splitting one data set up across multiple hosts)
Moving some functionality to other types of databases:
NoSQL for hot tables, lookup tables, leaderboards/scoring, metadata
Data warehouse for analytics: user behavior, performance monitoring, A/B testing results, KPIs/dashboards
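Federation and sharding both come down to a routing decision in the application. The sketch below shows one possible form of each, under stated assumptions: the function-to-database mapping and its names are hypothetical, and sharding uses a stable hash of the key modulo the shard count, which is only one of several possible schemes.

```python
import hashlib

# Federation: one database per function (names are illustrative placeholders).
FUNCTION_TO_DATABASE = {
    "users": "users-db",
    "forums": "forums-db",
    "products": "products-db",
}


def database_for_function(function_name: str) -> str:
    """Pick the database that owns a given functional area."""
    return FUNCTION_TO_DATABASE[function_name]


def shard_for_key(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a hash that is stable across processes."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


shard = shard_for_key("user-1001", 4)
```

A plain modulo scheme remaps most keys when the shard count changes, so adding hosts later means moving data; that trade-off is worth weighing before choosing a sharding scheme.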
How do I Save Money?
Saving $$$
Use managed database services: focus your limited resources on the application
ElastiCache can reduce your database costs
Understand how to scale from the start: save redesign work and unhappy customers
Start and stop instances as required
Use the AWS platform: don't reinvent the wheel, concentrate on your core competency
Using CloudFront will reduce your costs on EC2 dramatically
Purchase RIs and use Spot instances
Constantly monitor and right-size your environment
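Whether a reserved instance (RI) pays off depends on how many hours the instance actually runs. The sketch below works through that arithmetic, assuming a pricing model with a one-time upfront fee plus a discounted hourly rate. All prices are made-up placeholders rather than AWS prices, and the function names are hypothetical.

```python
def break_even_hours(upfront_fee: float, reserved_hourly: float, on_demand_hourly: float) -> float:
    """Hours of use after which the reserved option becomes cheaper than on-demand."""
    hourly_saving = on_demand_hourly - reserved_hourly
    if hourly_saving <= 0:
        raise ValueError("reserved hourly rate must be lower than the on-demand rate")
    return upfront_fee / hourly_saving


def cheaper_option(hours_needed: float, upfront_fee: float,
                   reserved_hourly: float, on_demand_hourly: float) -> str:
    """Compare total cost of both options for an expected number of running hours."""
    reserved_cost = upfront_fee + reserved_hourly * hours_needed
    on_demand_cost = on_demand_hourly * hours_needed
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


# Placeholder prices: 300 upfront, 0.05/hour reserved, 0.10/hour on-demand.
hours = break_even_hours(300.0, 0.05, 0.10)
always_on = cheaper_option(8760, 300.0, 0.05, 0.10)   # runs all year
occasional = cheaper_option(1000, 300.0, 0.05, 0.10)  # runs about six weeks
```

An instance that is started and stopped as required may never reach the break-even point, which is why reservation decisions are best made from measured usage.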
Sorry, How do I Scale my Database?
Summary
Decide on self-managed or managed database services
Choose the right database for your use case and skillsets to start with
Use Multi-AZ for your infrastructure
Choose the right instance family and size for your workloads
Understand the 3 types of EBS (Magnetic, General Purpose and PIOPS)
Make use of self-scaling services (Elastic Load Balancing, Amazon S3, Amazon SNS, SQS, Amazon SES, etc.)
Build in redundancy at every level
Blend SQL & NoSQL wisely
Use a data warehouse to offload large analytical queries from your main database
Cache data both inside and outside your infrastructure
Purchase RIs and use Spot instances
Split tiers into individual services (SOA)
Use autoscaling once you are ready for it
Use automation tools in your infrastructure
Make sure you have good metrics, monitoring, and logging tools in place
Don't reinvent the wheel