© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Ian Massingham, Developer Technology Evangelism, AWS
@IanMmmm
Building Global, Multi-Region Serverless Backends (powered by DynamoDB Global Tables)
Session objectives
1. Understand System Reliability and Availability.
2. Understand why we build a Multi-Region Active-Active architecture.
3. Understand how to build a Multi-Region Active-Active architecture on AWS.
4. How To: Building Multi-Region Serverless Applications
5. Conclusion.
System Reliability and Availability
“Failures are a given and everything will eventually fail over time.”
Werner Vogels, CTO, Amazon.com
System failure rate
[Figure: failure rate curves for early failures, random failures, wear-out failures, and observed failures]
System failure rate for high-velocity deployments
[Figure: failure rate curves for early failures, random failures, wear-out failures, and observed failures]
System Availability
$$\text{Availability} = \frac{\text{Normal Operation Time}}{\text{Total Time}} = \frac{\text{MTBF}}{\text{MTBF} + \text{MTTR}}$$
MTBF: Mean Time Between Failures
MTTR: Mean Time To Repair
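The availability formula on this slide can be checked with a few lines of Python. This is only a sketch: the function name and the sample figures (a failure every 999 hours, one hour to repair) are illustrative assumptions, not values from the talk.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical figures: one failure every 999 hours, one hour to repair.
print(f"{availability(999, 1):.3%}")  # 99.900%
```

Halving the repair time raises availability about as much as doubling the time between failures, which is one reason fast recovery matters.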
Availability
Availability         Downtime per year     Categories
95% (1-nine)         18 days 6 hours       Batch processing, data extraction, load jobs
99% (2-nines)        3 days 15 hours       Internal tools, project tracking
99.9% (3-nines)      8 hours 45 minutes    Online commerce
99.99% (4-nines)     52 minutes            Video delivery, broadcast systems
99.999% (5-nines)    5 minutes             Telecom industry (ATM transactions)
99.9999% (6-nines)   31 seconds            Answering to my loved one*
* Joke :)
http://royal.pingdom.com/wp-content/uploads/2015/04/pingdom_uptime_cheat_sheet.pdf
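The downtime column can be reproduced from the availability percentage. The sketch below assumes a 365-day year, so its results may differ by a minute or so from tables that round differently; the function name is illustrative.

```python
from datetime import timedelta

HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def downtime_per_year(availability_percent: float) -> timedelta:
    """Return the yearly downtime allowed by an availability percentage."""
    unavailable_fraction = 1 - availability_percent / 100
    return timedelta(hours=HOURS_PER_YEAR * unavailable_fraction)


for percent in (95, 99, 99.9, 99.99, 99.999, 99.9999):
    print(f"{percent}% -> {downtime_per_year(percent)}")
```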
Availability in Series
[Diagram: Part X and Part Y connected in series]
$$A = A_x \cdot A_y$$
Availability in Series
Component           Availability        Downtime
X                   99% (2-nines)       3 days 15 hours
Y                   99.99% (4-nines)    52 minutes
X and Y combined    98.99%              3 days 16 hours 33 minutes
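The combined figure in the table follows from multiplying the component availabilities. A minimal sketch, with an illustrative function name:

```python
from math import prod


def series_availability(*components: float) -> float:
    """Availability of components that must all work: the product of each."""
    return prod(components)


combined = series_availability(0.99, 0.9999)
print(f"{combined:.4%}")  # 98.9901%
```

A chain is never more available than its weakest link, and every extra dependency in series lowers the total.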
Availability in Parallel
$$A = 1 - (1 - A_x)^2$$
[Diagram: two copies of Part X in parallel]
Availability in Parallel
Component              Availability          Downtime
X                      99% (2-nines)         3 days 15 hours
Two X in parallel      99.99% (4-nines)      52 minutes
Three X in parallel    99.9999% (6-nines)    31 seconds
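The rows for two and three copies are consistent with generalizing the exponent in the parallel formula to the number of redundant copies. The sketch below assumes copies fail independently, which real systems only approximate; the function name is illustrative.

```python
def parallel_availability(component: float, copies: int) -> float:
    """Availability of redundant copies where one working copy is enough."""
    if copies < 1:
        raise ValueError("at least one copy is required")
    # The system is down only when every copy is down at the same time.
    return 1 - (1 - component) ** copies


for copies in (1, 2, 3):
    print(copies, f"{parallel_availability(0.99, copies):.6%}")
```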
“Component redundancy increases availability significantly!”
[Diagram: an AWS Region containing Availability Zone A, Availability Zone B, and Availability Zone C]
Multi-AZ Well-Architected Application
[Diagram: an application deployed across Availability Zone A, Availability Zone B, and Availability Zone C]
Some of the Regional AWS Services
Amazon DynamoDB
Amazon RDS
Amazon ElastiCache
Amazon S3
Amazon EFS
Amazon SQS
Amazon Kinesis
Amazon Elasticsearch
AWS Lambda
Amazon API Gateway
AWS ELB
Legend: Default; Configurable for multi-AZ deployment
Regional services
[Diagram: Service XYZ spanning AZ1, AZ2, and AZ3]
Why Serverless components?
No servers to provision or manage
Scales with usage
Never pay for idle
Availability and fault tolerance built in
• 18 Geographic Regions
• 50 Availability Zones (AZs)
• 4 regions and 12 more Availability Zones announced
Cost of Availability (approx.)
[Figure: chart relating cost, complexity, and availability]
Generally speaking, a reliable machine has high availability, but an available machine may or may not be very reliable.
On reliability
The ability of a system to:
1. Recover from infrastructure or service disruptions
2. Dynamically acquire computing resources to meet demand
3. Mitigate disruptions such as misconfigurations or transient network issues.
Two important lessons learned
Exponential Backoff
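A minimal sketch of exponential backoff in Python: a failed call is retried after a delay that doubles on each attempt, up to a cap. The random jitter, the parameter names, and the default values are assumptions added for illustration; the slide names only the technique.

```python
import random
import time


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying failures with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Double the delay ceiling each attempt, capped at max_delay,
            # then wait a random time below it (assumed "full jitter").
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. Jitter spreads retries out so that many clients failing together do not all retry at the same instant.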
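Message passing for async. patterns
[Diagram: two patterns. Queue: A sends messages to B through a queue. Pub-Sub: A publishes to a queue and B receives messages through a listener.]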
[Diagram: web instances, API instances, a queue, worker instances, and a cache, with these messages]
API: {DO foo}
PUT JOB: {JobID: 0001, Task: DO foo}
API: {JobID: 0001}
GET JOB: {JobID: 0001, Task: DO foo}
Result: {JobID: 0001, Result: bar}
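The message flow in this diagram can be sketched in-process with Python's standard library. This is an illustration only: `queue.Queue` stands in for the queue, a dictionary stands in for the cache, and the worker returns the fixed result `bar` from the slide rather than doing real work.

```python
import queue
import threading

jobs = queue.Queue()  # stands in for the queue in the diagram
results = {}          # stands in for the cache


def submit(job_id: str, task: str) -> dict:
    """API side: enqueue the job and return its ID immediately."""
    jobs.put({"JobID": job_id, "Task": task})
    return {"JobID": job_id}


def worker() -> None:
    """Worker side: take jobs off the queue and store results in the cache."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            jobs.task_done()
            break
        results[job["JobID"]] = {"JobID": job["JobID"], "Result": "bar"}
        jobs.task_done()


thread = threading.Thread(target=worker)
thread.start()
ticket = submit("0001", "DO foo")
jobs.put(None)
jobs.join()    # wait until everything queued has been processed
thread.join()
print(results[ticket["JobID"]])  # {'JobID': '0001', 'Result': 'bar'}
```

The caller gets a job ID back at once and looks the result up later, so a slow or failed worker does not block the API.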
[Diagram: API instances, a queue, worker instances, and a cache, with Amazon SNS sending a push notification to the user]
Well-Architected Framework
Operational Excellence
Security
Reliability
Performance Efficiency
Cost Optimization
https://aws.amazon.com/architecture/well-architected/
Why build a Multi-Region Active-Active architecture?
Why Multi-Region?
1. Improve Latency for end-users
[Figure: example end-user latencies of ~300 ms and ~140 ms]
Why Multi-Region?
1. Improve Latency for end-users
2. Disaster Recovery
[Diagram: applications in US West and applications in US East, each running Service 1 to Service 4, serving users from San Francisco and users from New York]
Why Multi-Region?
1. Improve Latency for end-users
2. Disaster Recovery
3. Business Requirements
Netflix 2013
Netflix 2016
Chaos Engineering
How to build a Multi-Region Architecture on AWS.
Data Replication
[Diagram: Component A, Component B, and Component C]
Synchronous: latency < 5 ms
Asynchronous: latency > 5 ms
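[Diagram: Process A calling Process B in two ways]
Synchronous: Process A waits while Process B is working, then gets the result.
Asynchronous: Process A continues while Process B is working, then gets or fetches the result.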
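The two call styles can be contrasted in a few lines of Python. In this sketch `process_b` is a hypothetical stand-in for Process B, and a thread pool plays the role of the asynchronous channel.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_b(x: int) -> int:
    """Hypothetical stand-in for Process B: slow work that doubles its input."""
    time.sleep(0.05)
    return x * 2


# Synchronous: Process A waits until Process B returns.
sync_result = process_b(21)

# Asynchronous: A hands off the work, continues, and fetches the result later.
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(process_b, 21)
    other_work = sum(range(10))     # A continues while B is working
    async_result = future.result()  # A fetches the result when it needs it

print(sync_result, other_work, async_result)  # 42 45 42
```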
Reliable & Secure Network
[Diagram: AWS Region A and AWS Region B connected over the Amazon Global Network]
James Hamilton, Vice President & Distinguished Engineer, re:Invent 2016
Multi-Region Multi-VPC Connectivity
S3 - Cross-Region Replication
Cross-Region Read Replicas for Amazon RDS
** For Aurora, MySQL, MariaDB and PostgreSQL engines.
“Simple” Cross-Region Usage Pattern
• Regional reads
• All critical write traffic goes to a single master
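This pattern amounts to a small routing rule: reads stay in the caller's region when a replica exists there, and every write goes to the one master. The sketch below is illustrative only; the region names and the function are assumptions, not part of the talk.

```python
MASTER_REGION = "us-east-1"  # hypothetical choice of master region


def pick_region(operation: str, caller_region: str, replica_regions: set) -> str:
    """Return the region that should serve a request under this pattern."""
    if operation == "write":
        return MASTER_REGION  # all writes go to the single master
    if operation == "read":
        # Serve reads locally when a replica exists, otherwise use the master.
        return caller_region if caller_region in replica_regions else MASTER_REGION
    raise ValueError(f"unknown operation: {operation}")


replicas = {"us-east-1", "eu-west-1"}
print(pick_region("read", "eu-west-1", replicas))   # eu-west-1
print(pick_region("write", "eu-west-1", replicas))  # us-east-1
```

The trade-off is that writes from distant regions pay cross-region latency, and replicas may briefly serve stale reads.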
Aurora multi-master - scale out reads & writes
First MySQL compatible DB service with scale-out across multiple data centers
Zero application downtime from ANY instance failure
Zero application downtime from ANY AZ failure
Faster write performance and higher scale
Sign up for single-region multi-master preview today; Multi-Region Multi-Master coming in 2018
[Diagram: scale out both reads and writes. An application connects to Read/Write Master 1, Read/Write Master 2, and Read/Write Master 3 in Availability Zones 1, 2, and 3, which share a distributed storage volume]
Amazon DynamoDB
Fast and flexible NoSQL database service for any scale
Fast, consistent performance: Consistent single-digit millisecond latency; DAX in-memory performance reduces response times to microseconds
Highly scalable: Auto-scaling to hundreds of terabytes of data that serve millions of requests per second
Fully managed: Automatic provisioning, infrastructure management, scaling, and configuration with zero downtime
Business critical reliability: Data is replicated across fault tolerant Availability Zones, with fine-grained access control
Prime Day 2017 Metrics
Block Storage – Use of Amazon Elastic Block Store (EBS) grew by 40% year-over-year, with aggregate data transfer jumping to 52 petabytes (a 50% increase) for the day and total I/O requests rising to 835 million (a 30% increase).
NoSQL Database – Amazon DynamoDB requests from Alexa, the Amazon.com sites, and the Amazon fulfillment centers totaled 3.34 trillion, peaking at 12.9 million per second.
Stack Creation – Nearly 31,000 AWS CloudFormation stacks were created for Prime Day in order to bring additional AWS resources on line.
API Usage – AWS CloudTrail processed over 50 billion events and tracked more than 419 billion, all in support of Prime Day.
Configuration Tracking – AWS Config generated over 14 million Configuration items for AWS resources.
Amazon DynamoDB Global Tables (GA)
First fully managed, multi-master, multi-region database
Build high performance, globally distributed applications
Low latency reads & writes to locally available tables
Disaster proof with multi-region redundancy
Easy to set up and no application rewrites required
[Diagram: globally dispersed users reach a Global App backed by a Global Table with replicas in N. America, Europe, and Asia]
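As a rough idea of what setup can look like, the sketch below builds the parameters for the `create_global_table` call in the AWS SDK for Python (boto3), as offered by the original 2017 version of Global Tables. The table name is hypothetical, and the actual call is left in comments because it needs boto3, credentials, and a pre-created table of the same name in each region.

```python
def global_table_request(table_name: str, regions: list) -> dict:
    """Build keyword arguments for DynamoDB's create_global_table call."""
    if len(set(regions)) != len(regions):
        raise ValueError("each region may appear only once")
    return {
        "GlobalTableName": table_name,
        "ReplicationGroup": [{"RegionName": region} for region in regions],
    }


# "example-table" is a hypothetical name used only for illustration.
request = global_table_request("example-table", ["eu-west-1", "us-east-1"])
print(request)

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   boto3.client("dynamodb", region_name="eu-west-1").create_global_table(**request)
```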
Amazon DynamoDB Streams
• Each stream record appears exactly once in the stream.
• For each item that is modified in a DynamoDB table, the stream records appear in the same sequence as the actual modifications to the item.
[Diagram: Amazon DynamoDB -> Streams -> AWS Lambda]
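A Lambda function subscribed to a stream receives batches of records. The sketch below shows one way a handler might walk such a batch in order; the sample event is hand-written for illustration and carries only the fields the handler reads.

```python
def handler(event, context=None):
    """Summarize DynamoDB stream records in the order they were delivered."""
    changes = []
    for record in event.get("Records", []):
        change = record["dynamodb"]
        changes.append({
            "event": record["eventName"],         # INSERT, MODIFY, or REMOVE
            "keys": change["Keys"],
            "new_image": change.get("NewImage"),  # not present on every record
        })
    return changes


# Hand-written sample event (an assumption, trimmed to the fields used above).
sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"Keys": {"id": {"S": "1"}},
                      "NewImage": {"id": {"S": "1"}, "name": {"S": "foo"}}}},
        {"eventName": "REMOVE",
         "dynamodb": {"Keys": {"id": {"S": "1"}}}},
    ]
}

for change in handler(sample_event):
    print(change["event"], change["keys"])
```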
Amazon Route 53
• AWS’s authoritative Domain Name Service.
• Highly available and scalable.
• Supports Traffic Flow through a variety of routing policies, all of which can be combined with DNS Failover.
• Enables a variety of low-latency, fault-tolerant architectures.
Traffic Routing with Route53
1. Latency Based Routing
[Diagram: Amazon Route 53 choosing between Resource A and Resource B, with measured latencies of 137 ms and 76 ms]
Traffic Routing with Route53
1. Latency Based Routing
2. Geo DNS
[Diagram: Amazon Route 53 directing a user in the US between Resource A in US and Resource B in EU]
Traffic Routing with Route53
1. Latency Based Routing
2. Geo DNS
3. Weighted Round Robin
[Diagram: Amazon Route 53 distributing traffic between Resource A in US and Resource B in EU]
Traffic Routing with Route53
1. Latency Based Routing
2. Geo DNS
3. Weighted Round Robin
4. DNS Failover
[Diagram: Amazon Route 53 directing a user in the US between Resource A in US and Resource B in EU]
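The four routing policies can each be reduced to a small selection rule. The Python sketch below illustrates the decisions only and does not use the Route 53 API; the function names are assumptions, and pairing 137 ms with Resource A and 76 ms with Resource B is a guess at the earlier diagram.

```python
import random


def latency_based(latencies_ms: dict) -> str:
    """Pick the resource with the lowest measured latency."""
    return min(latencies_ms, key=latencies_ms.get)


def geo(user_location: str, by_location: dict, default: str) -> str:
    """Pick the resource mapped to the user's location, else a default."""
    return by_location.get(user_location, default)


def weighted(weights: dict) -> str:
    """Pick a resource at random, in proportion to its weight."""
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]


def failover(primary: str, secondary: str, healthy: set) -> str:
    """Pick the primary while it passes health checks, else the secondary."""
    return primary if primary in healthy else secondary


print(latency_based({"Resource A": 137, "Resource B": 76}))  # Resource B
print(geo("US", {"US": "Resource A", "EU": "Resource B"}, "Resource A"))  # Resource A
print(failover("Resource A", "Resource B", healthy={"Resource B"}))  # Resource B
```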
How To: Building Multi-Region Serverless Applications
[Diagram: Amazon Route 53 in front of two regional stacks, eu-west-1 and us-east-1, each with Amazon API Gateway, AWS Lambda, and Amazon DynamoDB; the DynamoDB tables are linked by Global Tables]
https://globalddb.adhorn.me/
[Diagram: Amazon DynamoDB tables in eu-west-1 and us-east-1 linked by Global Tables]
[Diagram: AWS Lambda and Amazon DynamoDB in eu-west-1 and us-east-1, with the DynamoDB tables linked by Global Tables]
AWS Lambda Function
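The function shown on this slide did not survive extraction. As a stand-in, here is a minimal sketch of what such a function might look like: an API Gateway proxy handler that stores and fetches items through a table object. Everything here is an assumption. The `id` key and the `InMemoryTable` test double are hypothetical, and in a real deployment the table would be a boto3 DynamoDB `Table` resource, whose `put_item` and `get_item` calls the double mimics.

```python
import json


def make_handler(table):
    """Build an API Gateway proxy handler around a DynamoDB-like table object."""

    def handler(event, context=None):
        if event["httpMethod"] == "POST":
            item = json.loads(event["body"])
            table.put_item(Item=item)
            return {"statusCode": 201, "body": json.dumps(item)}
        # Any other method is treated as a read by ID in this sketch.
        item_id = event["pathParameters"]["id"]
        found = table.get_item(Key={"id": item_id}).get("Item")
        if found is None:
            return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
        return {"statusCode": 200, "body": json.dumps(found)}

    return handler


class InMemoryTable:
    """Test double that mimics the two table calls used above."""

    def __init__(self):
        self.items = {}

    def put_item(self, Item):
        self.items[Item["id"]] = Item

    def get_item(self, Key):
        item = self.items.get(Key["id"])
        return {"Item": item} if item is not None else {}


handler = make_handler(InMemoryTable())
print(handler({"httpMethod": "POST", "body": json.dumps({"id": "1", "name": "foo"})}))
print(handler({"httpMethod": "GET", "pathParameters": {"id": "1"}}))
```

Because the same code can be deployed unchanged in each region against the local replica table, the function itself needs no knowledge of replication.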
[Diagram: Amazon API Gateway, AWS Lambda, and Amazon DynamoDB in eu-west-1 and us-east-1, with the DynamoDB tables linked by Global Tables]
Multi-Region with API Gateway
[Diagram: a client resolves globalddb.adhorn.me through Amazon Route 53, which answers with a CNAME for a custom domain name in us-west-2 or us-east-1; each custom domain name fronts a regional API endpoint in API Gateway, backed by Lambda]
Route53: Traffic Policy
[Diagram: Amazon Route 53 in front of two regional stacks, eu-west-1 and us-east-1, each with Amazon API Gateway, AWS Lambda, and Amazon DynamoDB; the DynamoDB tables are linked by Global Tables]
https://globalddb.adhorn.me/
Conclusion
We learned about
1. System Reliability and Availability.
2. Why to build a Multi-Region Active-Active architecture.
3. How to build a Multi-Region Active-Active architecture on AWS.
4. What a Multi-Region Serverless App looks like.
Thanks!
@adhorn
https://medium.com/@adhorn