Building Global, Multi-Region Serverless Backends - Amazon Web...

© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Ian Massingham, Developer Technology Evangelism, AWS

@IanMmmm

Building Global, Multi-Region Serverless Backends (powered by DynamoDB Global Tables)

Session objectives

1. Understand System Reliability and Availability.

2. Understand why we build a Multi-Region Active-Active architecture.

3. Understand how to build a Multi-Region Active-Active architecture on AWS.

4. How To: Building Multi-Region Serverless Applications

5. Conclusion.

System Reliability and Availability

“Failures are a given and everything will eventually fail over time.”

Werner Vogels, CTO, Amazon.com

System failure rate

[Figure: failure rate curves labeled Early Failures, Random Failures, Wear Out Failures, and Observed Failures]

System failure rate for high-velocity deployments

[Figure: failure rate curves labeled Early Failures, Random Failures, Wear Out Failures, and Observed Failures]

System Availability

Availability = Normal Operation Time / Total Time = MTBF / (MTBF + MTTR)

MTBF: Mean Time Between Failures

MTTR: Mean Time To Repair
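
As a rough illustration of the availability ratio, the sketch below computes MTBF / (MTBF + MTTR) in Python. The function name and the sample figures (1,000 hours between failures, 1 hour to repair) are assumptions chosen for the example, not values from the talk.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical system: one failure every 1,000 hours, 1 hour to repair.
print(f"{availability(1000, 1):.4%}")
```

Shortening MTTR raises availability just as lengthening MTBF does, which is one reason fast recovery matters as much as avoiding failure.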

Availability

Availability         Downtime per year     Categories
95% (1-nine)         18 days 6 hours       Batch processing, Data extraction, Load jobs
99% (2-nines)        3 days 15 hours       Internal Tools, Project Tracking
99.9% (3-nines)      8 hours 45 minutes    Online Commerce
99.99% (4-nines)     52 minutes            Video Delivery, Broadcast systems
99.999% (5-nines)    5 minutes             Telecom Industry (ATM Transactions)
99.9999% (6-nines)   31 seconds            Answering to my loved one*

* Joke :)

http://royal.pingdom.com/wp-content/uploads/2015/04/pingdom_uptime_cheat_sheet.pdf
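
The downtime figures can be approximated with a few lines of Python. This is a minimal sketch that assumes a 365-day year; the function name is made up for the example, and published cheat sheets may round slightly differently.

```python
from datetime import timedelta

HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_per_year(availability_pct: float) -> timedelta:
    """Expected yearly downtime for a given availability percentage."""
    return timedelta(hours=HOURS_PER_YEAR * (1 - availability_pct / 100))


for pct in (95, 99, 99.9, 99.99, 99.999, 99.9999):
    print(f"{pct}% -> {downtime_per_year(pct)}")
```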

Availability in Series

[Diagram: Part X followed by Part Y in series]

A = Ax × Ay

Availability in Series

Component          Availability       Downtime
X                  99% (2-nines)      3 days 15 hours
Y                  99.99% (4-nines)   52 minutes
X and Y Combined   98.99%             3 days 16 hours 33 minutes
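
For components in series, every part must be up, so the availabilities multiply. A minimal sketch, with a hypothetical function name:

```python
from math import prod


def series_availability(*parts: float) -> float:
    """Availability of components chained in series: the product of the parts."""
    return prod(parts)


# X at 99% and Y at 99.99%, as in the table above.
print(f"{series_availability(0.99, 0.9999):.4%}")
```

The combined figure is always lower than the weakest component, which is why long dependency chains erode availability.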

Availability in Parallel

A = 1 - (1 - Ax)^2

[Diagram: two instances of Part X in parallel]

Availability in Parallel

Component             Availability         Downtime
X                     99% (2-nines)        3 days 15 hours
Two X in parallel     99.99% (4-nines)     52 minutes
Three X in parallel   99.9999% (6-nines)   31 seconds
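
For redundant copies in parallel, the system is down only when every copy is down at once. The sketch below generalizes the two-copy formula to n identical copies and assumes the copies fail independently; the function name is illustrative.

```python
def parallel_availability(a: float, n: int) -> float:
    """Availability of n identical, independent components in parallel."""
    if n < 1:
        raise ValueError("need at least one component")
    return 1 - (1 - a) ** n


for copies in (1, 2, 3):
    print(copies, f"{parallel_availability(0.99, copies):.6%}")
```

The independence assumption is the important caveat: copies that share a failure cause (same AZ, same deployment, same bug) gain much less than the formula suggests.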

“Component redundancy increases availability significantly!”

[Diagram: an AWS Region containing Availability Zone A, Availability Zone B, and Availability Zone C]

[Diagram: a Multi-AZ, Well-Architected Application spanning Availability Zone A, Availability Zone B, and Availability Zone C]

Some of the Regional AWS Services

Amazon DynamoDB
Amazon RDS
Amazon ElastiCache
Amazon S3
Amazon EFS
Amazon SQS
Amazon Kinesis
Amazon Elasticsearch
AWS Lambda
Amazon API Gateway
AWS ELB

Legend: Default; Configurable for multi-AZ deployment

Regional services

AZ1 AZ2 AZ3

Service XYZ

Why Serverless components?

• No servers to provision or manage
• Scales with usage
• Never pay for idle
• Availability and fault tolerance built in

• 18 Geographic Regions
• 50 Availability Zones (AZs)
• 4 regions and 12 more Availability Zones announced

Cost of Availability (approx.)

[Chart labels: Cost, Availability, Complexity]

Generally speaking, a reliable machine has high availability, but an available machine may or may not be very reliable.

On reliability

Ability of a system to:

1. Recover from infrastructure or service disruptions

2. Dynamically acquire computing resources to meet demand

3. Mitigate disruptions such as misconfigurations or transient network issues.

Two important lessons learned

Exponential Backoff
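
One common way to apply exponential backoff is to retry a failing call with a delay that doubles on each attempt, capped at a maximum and randomized with jitter so that many clients do not retry in lockstep. The sketch below is one possible shape, not the speaker's code; the function name, defaults, and the injectable `sleep` parameter are assumptions made for the example.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying on any exception with capped, jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay ceiling doubles each attempt: base, 2*base, 4*base, ...
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time between 0 and the ceiling.
            sleep(random.uniform(0, ceiling))
```

In real code it is usually better to retry only on errors known to be transient (throttling, timeouts) rather than on every exception.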

Message passing for async. patterns

[Diagram, Queue: A, Queue, B]

[Diagram, Pub-Sub: A, Queue, B, Listener]

[Diagram: Web Instances, three API Instances, a Queue, two Worker Instances, and a Cache, exchanging the messages below]

API: {DO foo}

PUT JOB: {JobID: 0001, Task: DO foo}

API: {JobID: 0001}

GET JOB: {JobID: 0001, Task: DO foo}

Result: {JobID: 0001, Result: bar}
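
The job flow on this slide (the API accepts a task, returns a job ID at once, a worker picks the job off the queue, and the result lands in a cache) can be imitated in-process. This is a toy sketch: `queue.Queue` stands in for a managed queue, a dict stands in for the cache, and every name in it is hypothetical.

```python
import queue
import threading
import uuid

jobs = queue.Queue()  # stands in for the queue between API and workers
results = {}          # stands in for the cache holding finished results


def submit(task):
    """API side: enqueue the task and return a job ID without waiting."""
    job_id = uuid.uuid4().hex
    jobs.put({"JobID": job_id, "Task": task})
    return job_id


def worker():
    """Worker side: take jobs off the queue and store each result."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: stop the worker
            jobs.task_done()
            break
        results[job["JobID"]] = f"done: {job['Task']}"
        jobs.task_done()


def get_result(job_id):
    """API side: look up a result, or None if the job is not finished yet."""
    return results.get(job_id)


def demo():
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    job_id = submit("DO foo")
    jobs.join()     # wait until the worker has processed the job
    jobs.put(None)  # ask the worker to stop
    thread.join()
    return get_result(job_id)
```

The point of the pattern is that the caller never blocks on the work itself; it holds a job ID and polls, or is notified, when the result is ready.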

[Diagram: three API Instances, a Queue, two Worker Instances, and a Cache, with Amazon SNS sending a Push Notification to the User]

Well-Architected Framework

Operational Excellence

Security

Reliability

Performance Efficiency

Cost Optimization

https://aws.amazon.com/architecture/well-architected/

Why build a Multi-Region Active-Active architecture?

Why Multi-Region?

1. Improve Latency for end-users

~300ms

~140ms

Why Multi-Region?

1. Improve Latency for end-users

2. Disaster Recovery

[Diagram: Users from San Francisco and Users from New York; Applications in US West and Applications in US East, each running Service 1, Service 2, Service 3, and Service 4]

Why Multi-Region?

1. Improve Latency for end-users

2. Disaster Recovery

3. Business Requirements

Netflix 2013

Netflix 2016

Chaos Engineering

How to build a Multi-Region Architecture on AWS.

Data Replication

[Diagram: Component A, Component B, Component C]

Synchronous: Latency < 5 ms

Asynchronous: Latency > 5 ms

Synchronous: Process A is Waiting while Process B is Working, then gets the result.

Asynchronous: Process A Continues while Process B is Working, then gets or fetches the result.
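
The difference between the two call styles can be shown with Python's standard `concurrent.futures` module. In this sketch `process_b` is a made-up stand-in for any slow remote call; the synchronous caller blocks on it, while the asynchronous caller submits it, carries on, and fetches the result later.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_b(x):
    """Stand-in for a slow call handled by Process B."""
    time.sleep(0.05)
    return x * 2


def synchronous(x):
    # Process A waits here until Process B has finished.
    return process_b(x)


def asynchronous(x, executor):
    # Process A hands the work off and continues immediately.
    future = executor.submit(process_b, x)
    other_work = "A did other work"
    # Later, Process A fetches the result (blocking only if it is not ready).
    return other_work, future.result()


with ThreadPoolExecutor(max_workers=1) as pool:
    print(synchronous(21))
    print(asynchronous(21, pool))
```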

Reliable & Secure Network

[Diagram: AWS Region A and AWS Region B connected by the Amazon Global Network]

James Hamilton, Vice President & Distinguished Engineer, re:Invent 2016

Multi-Region Multi-VPC Connectivity

S3 - Cross-Region Replication

Cross-Region Read Replicas for Amazon RDS

** For Aurora, MySQL, MariaDB and PostgreSQL engines.

“Simple” Cross-Region Usage Pattern

• Regional reads
• All critical write traffic goes to a single master
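
A minimal sketch of this usage pattern: reads are served from the caller's own region when a copy exists there, and every write is sent to the single master region. The class, its fields, and the region names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topology:
    master_region: str
    replica_regions: tuple

    def region_for(self, operation: str, client_region: str) -> str:
        """Pick the region that should serve this operation."""
        if operation == "write":
            return self.master_region  # all writes go to the single master
        local_copy = (client_region == self.master_region
                      or client_region in self.replica_regions)
        # Reads stay local when possible, else fall back to the master.
        return client_region if local_copy else self.master_region


topology = Topology("us-east-1", ("eu-west-1",))
print(topology.region_for("read", "eu-west-1"))
print(topology.region_for("write", "eu-west-1"))
```

The trade-off is that replicas are updated asynchronously, so a regional read may briefly return stale data after a write to the master.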

Aurora multi-master: scale out reads & writes

First MySQL compatible DB service with scale-out across multiple data centers

• Zero application downtime from ANY instance failure
• Zero application downtime from ANY AZ failure
• Faster write performance and higher scale

Sign up for single-region multi-master preview today; Multi-Region Multi-Master coming in 2018

Scale out both reads and writes

[Diagram: an Application connected to Read/Write Master 1, Read/Write Master 2, and Read/Write Master 3 in Availability Zone 1, Availability Zone 2, and Availability Zone 3, over a shared distributed storage volume]

Amazon DynamoDB

Fast and flexible NoSQL database service for any scale

• Fast, consistent performance: Consistent single-digit millisecond latency; DAX in-memory performance reduces response times to microseconds
• Highly scalable: Auto-scaling to hundreds of terabytes of data that serve millions of requests per second
• Fully managed: Automatic provisioning, infrastructure management, scaling, and configuration with zero downtime
• Business critical reliability: Data is replicated across fault tolerant Availability Zones, with fine-grained access control

Prime Day 2017 Metrics

Block Storage – Use of Amazon Elastic Block Store (EBS) grew by 40% year-over-year, with aggregate data transfer jumping to 52 petabytes (a 50% increase) for the day and total I/O requests rising to 835 million (a 30% increase).

NoSQL Database – Amazon DynamoDB requests from Alexa, the Amazon.com sites, and the Amazon fulfillment centers totaled 3.34 trillion, peaking at 12.9 million per second.

Stack Creation – Nearly 31,000 AWS CloudFormation stacks were created for Prime Day in order to bring additional AWS resources on line.

API Usage – AWS CloudTrail processed over 50 billion events and tracked more than 419 billion, all in support of Prime Day.

Configuration Tracking – AWS Config generated over 14 million Configuration items for AWS resources.

Amazon DynamoDB Global Tables (GA)

First fully managed, multi-master, multi-region database

• Build high performance, globally distributed applications
• Low latency reads & writes to locally available tables
• Disaster proof with multi-region redundancy
• Easy to set up and no application rewrites required

[Diagram: globally dispersed users of a Global App backed by a Global Table with replicas in N. America, Europe, and Asia]
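
As a hedged sketch of what setup can look like from code, the helper below calls the `create_global_table` operation of the DynamoDB API that shipped with this first version of Global Tables. It assumes `client` is a boto3 DynamoDB client (for example `boto3.client("dynamodb", region_name="us-east-1")`) and that an empty table with the same name, with streams enabled, already exists in each listed region. The helper name and the table name are made up, and later Global Tables versions are managed differently, so check the current documentation before relying on this.

```python
def link_replicas(client, table_name, regions):
    """Join same-named regional tables into one global table.

    `client` is assumed to be a boto3 DynamoDB client; it is passed in so the
    sketch itself has no AWS dependency.
    """
    return client.create_global_table(
        GlobalTableName=table_name,
        ReplicationGroup=[{"RegionName": region} for region in regions],
    )


# Hypothetical usage:
#   import boto3
#   client = boto3.client("dynamodb", region_name="us-east-1")
#   link_replicas(client, "items", ["eu-west-1", "us-east-1"])
```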

Amazon DynamoDB Streams

• Each stream record appears exactly once in the stream.
• For each item that is modified in a DynamoDB table, the stream records appear in the same sequence as the actual modifications to the item.

[Diagram: Amazon DynamoDB, Streams, AWS Lambda]
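
A Lambda function subscribed to a table's stream receives batches of records, each carrying an `eventName` (`INSERT`, `MODIFY`, or `REMOVE`) and a `dynamodb` section with the item's keys. The handler below is an illustrative sketch that only counts and logs the changes; what a real function does with each record is up to the application.

```python
def handler(event, context):
    """Count the stream records in one batch by change type."""
    summary = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    for record in event.get("Records", []):
        name = record["eventName"]
        summary[name] = summary.get(name, 0) + 1
        keys = record.get("dynamodb", {}).get("Keys", {})
        print(name, keys)  # log the change; real code would act on it here
    return summary
```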

Amazon Route 53

• AWS’s authoritative Domain Name Service.
• Highly available and scalable.
• Supports Traffic Flow through a variety of routing policies, all of which can be combined with DNS Failover.
• Enables a variety of low-latency, fault-tolerant architectures.

Traffic Routing with Route53

1. Latency Based Routing

Amazon Route53

Resource A

Resource B

137ms latency

76ms latency

Traffic Routing with Route53

1. Latency Based Routing

2. Geo DNS

[Diagram: Amazon Route53 with Resource A in US, Resource B in EU, and a User in US]

Traffic Routing with Route53

1. Latency Based Routing

2. Geo DNS

3. Weighted Round Robin

[Diagram: Amazon Route53 with Resource A in US and Resource B in EU]

Traffic Routing with Route53

1. Latency Based Routing

2. Geo DNS

3. Weighted Round Robin

4. DNS Failover

[Diagram: Amazon Route53 with Resource A in US, Resource B in EU, and a User in US]
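
To make the combination of latency-based routing and DNS failover concrete, here is a toy model in plain Python. It is not how Route 53 is implemented; it only mimics the observable behaviour: among the endpoints whose health checks pass, answer with the one that has the lowest latency for the caller. The data layout and names are assumptions.

```python
def pick_endpoint(endpoints, healthy):
    """Return the name of the lowest-latency healthy endpoint, or None."""
    candidates = [e for e in endpoints if healthy.get(e["name"], False)]
    if not candidates:
        return None
    return min(candidates, key=lambda e: e["latency_ms"])["name"]


endpoints = [
    {"name": "Resource A", "latency_ms": 137},
    {"name": "Resource B", "latency_ms": 76},
]

# Both healthy: the lower-latency endpoint wins.
print(pick_endpoint(endpoints, {"Resource A": True, "Resource B": True}))
# Resource B fails its health check: traffic fails over to Resource A.
print(pick_endpoint(endpoints, {"Resource A": True, "Resource B": False}))
```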

How To: Building Multi-Region Serverless Applications

[Architecture diagram: Amazon Route53 in front of two regions, eu-west-1 and us-east-1; each region has Amazon API Gateway, AWS Lambda, and Amazon DynamoDB; the two DynamoDB tables are linked by Global Tables]

https://globalddb.adhorn.me/

[Architecture diagram: Amazon DynamoDB in eu-west-1 and Amazon DynamoDB in us-east-1, linked by Global Tables]

[Architecture diagram: AWS Lambda and Amazon DynamoDB in eu-west-1 and in us-east-1, with the DynamoDB tables linked by Global Tables]

AWS Lambda Function
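
The function shown on this slide did not survive extraction. As a stand-in, here is a minimal sketch of a Lambda handler of the kind this architecture needs: it takes an API Gateway proxy request, writes an item to the regional DynamoDB table, and returns it. The `make_handler` factory, the attribute names, and the table name are assumptions; `table` is expected to behave like a boto3 DynamoDB `Table` resource.

```python
import json
import os
import uuid


def make_handler(table):
    """Build a Lambda handler that writes request bodies to `table`."""
    def handler(event, context):
        body = json.loads(event.get("body") or "{}")
        item = {
            **body,
            "item_id": str(uuid.uuid4()),
            # Record which region handled the write (set by the Lambda runtime).
            "region": os.environ.get("AWS_REGION", "unknown"),
        }
        table.put_item(Item=item)
        return {"statusCode": 200, "body": json.dumps(item)}
    return handler


# Hypothetical wiring inside the deployed function:
#   import boto3
#   handler = make_handler(boto3.resource("dynamodb").Table("items"))
```

Deploying the same code in each region means every request writes to its local table, and Global Tables carries the item to the other region.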

[Architecture diagram: Amazon API Gateway, AWS Lambda, and Amazon DynamoDB in eu-west-1 and in us-east-1, with the DynamoDB tables linked by Global Tables]

Multi-Region with API Gateway

[Diagram: a Client resolves globalddb.adhorn.me through Amazon Route 53, which answers with a CNAME to a Custom Domain Name in us-west-2 or us-east-1; in each region the Custom Domain Name maps to a Regional API Endpoint on API Gateway, backed by Lambda]
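
One way to express this routing is a pair of latency-based CNAME records, one per region, under the same name. The sketch below only builds the change batch; it could then be passed to the Route 53 `change_resource_record_sets` API (for example through boto3) together with a hosted zone ID. The helper name and the target hostnames are placeholders, not the values used in the demo.

```python
def latency_record_changes(domain, regional_targets, ttl=60):
    """Build a Route 53 change batch with one latency record per region."""
    return {
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": domain,
                    "Type": "CNAME",
                    "SetIdentifier": region,  # must be unique per record
                    "Region": region,         # marks this as a latency record
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": target}],
                },
            }
            for region, target in sorted(regional_targets.items())
        ]
    }


# Placeholder targets standing in for each region's custom domain name target.
batch = latency_record_changes(
    "globalddb.adhorn.me",
    {
        "us-east-1": "regional-api-us-east-1.example.com",
        "us-west-2": "regional-api-us-west-2.example.com",
    },
)
print(len(batch["Changes"]))
```

Attaching a health check to each record lets Route 53 stop answering with a region that is failing, which is what turns latency routing into active-active failover.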

Route53: Traffic Policy

[Architecture diagram: Amazon Route53 in front of two regions, eu-west-1 and us-east-1; each region has Amazon API Gateway, AWS Lambda, and Amazon DynamoDB; the two DynamoDB tables are linked by Global Tables]

https://globalddb.adhorn.me/

Conclusion

We learned about

1. System Reliability and Availability.

2. Why to build a Multi-Region Active-Active architecture.

3. How to build a Multi-Region Active-Active architecture on AWS.

4. We looked at a Multi-Region Serverless App

Thanks!

@adhorn

https://medium.com/@adhorn