Optimizing the Data Tier for Serverless Web Applications
Webinars
Prahlad Rao, AWS Solutions Architect
Balaji Iyer, AWS Professional Services Consultant
Mar 21, 2017
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Optimizing the Data Tier for Serverless Web Applications - March 2017 Online Tech Talks



What to Expect from the Session

• Anatomy of Serverless Apps
  • Web applications
  • Mobile backends
• Hierarchy of choice for data tier options on AWS
  • Data tier for Serverless architectures
  • SQL vs. NoSQL considerations
  • AWS Lambda with Amazon DynamoDB
  • AWS Lambda with Amazon RDS database
  • AWS Lambda with Amazon ElastiCache
• Additional Best Practices
  • Caching and retries

Anatomy of serverless applications

Web application architecture

[Diagram labels: Amazon Cognito (web-federated identity and Cognito User Pools); Amazon API Gateway; AWS Lambda; Amazon DynamoDB; Amazon ElastiCache; Amazon RDS; view blog posts (GETs); manage/edit blog posts (POSTs); AWS Lambda (triggers for sign-ups); Amazon SES (mailers)]

Mobile backend

[Diagram labels: https://api.myapp.com (Amazon API Gateway); AWS Lambda functions (core business logic); Amazon Cognito; AWS STS; Amazon DynamoDB; Amazon RDS; Amazon ElastiCache; Amazon S3]

Data tier options on AWS


• Amazon DynamoDB: document and key-value store
• Amazon RDS: SQL database engines
• Amazon ElastiCache: in-memory key-value store
• Amazon Redshift: data warehouse

NoSQL vs. SQL for a new app: how to choose?

SQL
• Strong schema, complex relationships, transactions, and joins
• Single/cluster system scaling
• Focus on ACID consistency and availability
• SQL tables will have faster query performance when running complex queries
• Structured data sources, large ecosystem of SQL toolsets

NoSQL
• Partial schema, easy reads and writes, simple data model
• Focus on performance and availability at scale
• Varied data sources, dynamic
• High data volume, denormalized
• Horizontal scaling

Amazon DynamoDB use cases

• Ad Tech: ad serving, retargeting, ID lookup, user profile management, session tracking, RTB
• IoT: tracking state, metadata, and readings from millions of devices; real-time notifications
• Gaming: recording game details, leaderboards, session information, usage history, and logs
• Mobile & Web: storing user profiles, session details, personalization settings, entity-specific metadata

AWS Lambda with DynamoDB

• Configuration
  • No VPC configuration required
  • IAM roles for access and authentication
  • Leverage FGAC (Fine-Grained Access Control) for granular access to DynamoDB tables
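As a rough illustration of fine-grained access control, the sketch below builds an IAM policy document that restricts a caller to items whose partition key equals their Amazon Cognito identity ID, using the `dynamodb:LeadingKeys` condition key. The function name, the list of actions, and the table ARN are assumptions chosen for the example, not something taken from the slides.

```python
import json


def build_fgac_policy(table_arn: str) -> dict:
    """Build an IAM policy limiting access to the caller's own items.

    The policy variable below is substituted by IAM at request time with
    the Cognito identity ID of the authenticated caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Illustrative action list; trim it to what the app needs.
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:Query",
                    "dynamodb:PutItem",
                    "dynamodb:UpdateItem",
                ],
                "Resource": [table_arn],
                "Condition": {
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": [
                            "${cognito-identity.amazonaws.com:sub}"
                        ]
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical table ARN, for illustration only.
    arn = "arn:aws:dynamodb:us-east-1:123456789012:table/BlogPosts"
    print(json.dumps(build_fgac_policy(arn), indent=2))
```

A policy like this would typically be attached to the role assumed by authenticated Cognito users, so each user can only read and write rows keyed by their own identity.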

AWS Lambda with DynamoDB

• Performance
  • Simple API model
  • Invoke concurrent connections at scale
  • Consistent query performance as data volume grows
  • Simply dial up read/write capacity units for scaling
  • Use DynamoDB for storing persistent data; complement with ElastiCache for better read performance
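To make "dialing up" capacity concrete, here is a small sizing sketch for provisioned throughput. It relies on the documented unit sizes: one read capacity unit covers one strongly consistent read per second of an item up to 4 KB (eventually consistent reads cost half), and one write capacity unit covers one write per second of an item up to 1 KB. The function names and the sample workload are assumptions made for illustration.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one read unit covers up to 4 KB
WRITE_UNIT_BYTES = 1024      # one write unit covers up to 1 KB


def read_capacity_units(item_size_bytes: int, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """Estimate the provisioned read capacity units for a workload."""
    units_per_read = math.ceil(item_size_bytes / READ_UNIT_BYTES)
    total = units_per_read * reads_per_second
    if not strongly_consistent:
        total = total / 2  # eventually consistent reads cost half
    return math.ceil(total)


def write_capacity_units(item_size_bytes: int, writes_per_second: int) -> int:
    """Estimate the provisioned write capacity units for a workload."""
    return math.ceil(item_size_bytes / WRITE_UNIT_BYTES) * writes_per_second


if __name__ == "__main__":
    # Hypothetical workload: 6,000-byte items, 100 reads/s, 50 writes/s.
    print(read_capacity_units(6000, 100))
    print(read_capacity_units(6000, 100, strongly_consistent=False))
    print(write_capacity_units(6000, 50))
```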

RDS use cases

Applicable wherever you need relational databases:

• eCommerce
• Gaming
• Websites
• IT solutions
• Apps
• Reporting

Amazon Aurora: fast, available, and MySQL-compatible

[Diagram labels: SQL; Transactions; Caching; AZ 1, AZ 2, AZ 3; Amazon S3]

✓ 5x faster than MySQL on same hardware

✓ Sysbench: 100K writes/sec and 500K reads/sec

✓ Designed for 99.99% availability

✓ 6-way replicated storage across 3 AZs

✓ Scale to 64 TB and 15 read replicas

AWS Lambda with RDS
• VPC configuration
  • Lambda functions by default have access to the internet
  • Grant Lambda functions access to resources (RDS, EC2, ElastiCache) in your own VPC by adding:
    • VPC subnet IDs and security group IDs to the Lambda configuration
    • Lambda function execution role (AWSLambdaVPCAccessExecutionRole)
    • Security group inbound rules on VPC resources that allow the appropriate ports for the subnet
  • Allows access to peered VPCs, VPN endpoints, and private S3 endpoints
  • Lambda access to a VPC is optional, unless you need to access VPC resources
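The configuration step above could be scripted roughly as follows. This is a sketch that assumes a boto3 Lambda client is passed in (for example `boto3.client("lambda")`); the helper names, the function name, and the subnet and security group IDs are placeholders. `update_function_configuration` with a `VpcConfig` argument is the boto3 call used to attach a function to a VPC.

```python
def vpc_config_request(function_name, subnet_ids, security_group_ids):
    """Build the arguments for attaching a Lambda function to a VPC."""
    if not subnet_ids or not security_group_ids:
        raise ValueError("at least one subnet and one security group are required")
    return {
        "FunctionName": function_name,
        "VpcConfig": {
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    }


def attach_function_to_vpc(lambda_client, function_name, subnet_ids, security_group_ids):
    """Apply the VPC settings through the given boto3 Lambda client."""
    request = vpc_config_request(function_name, subnet_ids, security_group_ids)
    return lambda_client.update_function_configuration(**request)


# Example call (all identifiers are hypothetical):
# import boto3
# attach_function_to_vpc(
#     boto3.client("lambda"),
#     "blog-posts-api",
#     ["subnet-0abc", "subnet-0def"],
#     ["sg-0123"],
# )
```

The function's execution role still needs the permissions granted by AWSLambdaVPCAccessExecutionRole, otherwise Lambda cannot create the network interfaces it needs.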

AWS Lambda with RDS
• VPC configuration
  • Functions configured for VPC access lose internet access
    • This holds even when "Auto-assign Public IP" is enabled, an internet gateway is attached, and the security group allows all outbound traffic
  • If functions need access to both the internet and the VPC, attach them to a private subnet with internet access through a NAT instance or an Amazon VPC NAT gateway
  • Ensure subnets have enough IPs for ENIs
  • Avoid DNS resolution of public hostnames for your VPC when accessing through a Lambda function
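A quick way to sanity-check the "enough IPs for ENIs" point is to count the addresses a subnet can actually hand out. The sketch below relies on the fact that AWS reserves five addresses in every subnet; the required ENI count is an input you supply from your own concurrency estimate. The function names and CIDR blocks are assumptions for illustration, and the check ignores addresses already used by other resources in the subnet.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses plus the last one


def usable_ips(cidr: str) -> int:
    """Return the number of addresses AWS can assign in a subnet."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def subnets_can_host(cidrs, required_enis: int) -> bool:
    """Check whether the subnets together have room for the ENIs needed."""
    return sum(usable_ips(cidr) for cidr in cidrs) >= required_enis


if __name__ == "__main__":
    # Hypothetical subnets configured on the Lambda function.
    subnets = ["10.0.1.0/24", "10.0.2.0/24"]
    print(usable_ips("10.0.1.0/24"))
    print(subnets_can_host(subnets, required_enis=300))
```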

AWS Lambda with RDS

• Performance
  • RDS instance type is important for high Lambda concurrency
  • Concurrency control using a "Kinesis sandwich" (Lambda -> Kinesis -> Lambda -> storage tier). This lets you throttle the backend at a different rate than the frontend (may increase latency)
  • Instantiate database connections outside the scope of the handler for connection re-use; other options are language frameworks (Node.js knex, sequelize) or open source libraries like Hibernate
  • Faster query performance for complex queries
  • Fine-tune max_connections based on DB instance type
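The connection re-use advice can be sketched as below. To keep the example runnable anywhere, it uses an in-memory `sqlite3` database as a stand-in for an RDS connection; with a real RDS instance you would open the connection with your database driver instead and handle stale connections. The table name and event shape are hypothetical.

```python
import sqlite3

# Module scope: this runs once per container, not once per invocation.
_connection = None


def get_connection():
    """Create the connection on first use and re-use it afterwards."""
    global _connection
    if _connection is None:
        # Stand-in for an RDS connection; replace with your driver's connect().
        _connection = sqlite3.connect(":memory:")
        _connection.execute(
            "CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, title TEXT)"
        )
    return _connection


def handler(event, context):
    """Lambda-style handler that re-uses the module-level connection."""
    conn = get_connection()
    conn.execute("INSERT INTO posts (title) VALUES (?)", (event["title"],))
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    return {"post_count": count}
```

Because the connection lives at module scope, a warm container skips the connection setup on later invocations, which is where the latency saving comes from.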

AWS Lambda with Aurora on Amazon RDS and KMS: Database Authentication

[Diagram labels: AWS Lambda; AWS KMS (master keys for encrypt/decrypt); VPC NAT gateway; RDS database; numbered arrows 1-4]

1. Encrypt the database password file with KMS

2. Package the encrypted password file with the Lambda deployment package and upload it to Lambda

3. When the function is invoked, Lambda connects to KMS through the NAT gateway to decrypt the password file

4. Lambda connects to the database using the extracted credentials to read/write records
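Step 3 might look roughly like the sketch below. It assumes a boto3 KMS client is passed in (for example `boto3.client("kms")`) and that the packaged file holds the raw ciphertext bytes produced by KMS; the function name and file path are hypothetical. The plaintext is cached at module scope so a warm container calls KMS only once.

```python
_cached_password = None  # survives across invocations in a warm container


def get_db_password(kms_client, encrypted_file_path: str) -> str:
    """Decrypt the packaged password file with KMS, caching the result."""
    global _cached_password
    if _cached_password is None:
        with open(encrypted_file_path, "rb") as encrypted_file:
            ciphertext = encrypted_file.read()
        # KMS returns the decrypted bytes under the "Plaintext" key.
        response = kms_client.decrypt(CiphertextBlob=ciphertext)
        _cached_password = response["Plaintext"].decode("utf-8")
    return _cached_password


# Example call inside a handler (path is hypothetical):
# import boto3
# password = get_db_password(boto3.client("kms"), "db_password.enc")
```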

ElastiCache use cases

Caching layer for performance or cost optimization of an underlying database

Storage of ephemeral key-value data

High-performance application patterns such as leaderboards (for gaming users), session management, event counters, in-memory lists

AWS Lambda with ElastiCache
• Configuration
  • Lambda configuration to access ElastiCache resources inside a VPC
  • Use IAM roles for access and authentication
  • Leverage additional libraries (pymemcache, node discovery) within your function

AWS Lambda with ElastiCache

• Performance
  • Invoke concurrent connections at scale
  • Use Redis pipelining to maximize the number of operations per second
  • Handle high throughput by scaling instance types
  • ElastiCache offers faster performance with lowest latency
  • Write-through vs. lazy load, depending on the application
  • Memcached for read-heavy workloads
    • Instead of updating both the cache and the persistent database, invalidate the cache and let the readers update it
  • Redis for write-heavy workloads
    • Move data structures out of the web apps and into the data stores
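The caching strategies named above can be compared in a few lines. This sketch uses a dictionary-backed stand-in for both the cache and the database so it runs without any AWS resources; `DictCache` and the function names are assumptions, and a real function would pass a Redis or Memcached client exposing similar `get`, `set`, and `delete` methods.

```python
class DictCache:
    """In-memory stand-in for a Redis or Memcached client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


def lazy_load(cache, database, key):
    """Read path: fill the cache only when a reader misses."""
    value = cache.get(key)
    if value is None:
        value = database[key]
        cache.set(key, value)
    return value


def write_through(cache, database, key, value):
    """Write path: update the database and the cache together."""
    database[key] = value
    cache.set(key, value)


def write_and_invalidate(cache, database, key, value):
    """Write path: update the database, drop the cached copy,
    and let the next reader repopulate it through lazy_load."""
    database[key] = value
    cache.delete(key)
```

Write-through keeps the cache fresh at the cost of extra writes, while invalidation keeps writes cheap and pushes the refresh cost onto the next read.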

AWS Lambda with API Gateway and Amazon ElastiCache

[Diagram labels: Amazon Cognito; Amazon API Gateway; AWS Lambda; Amazon ElastiCache; numbered arrows 1-4]

1. Users authenticate via social identity providers or using Cognito

2. Amazon API Gateway receives the incoming request with query string parameters

3. The Lambda function is invoked and does a lookup on the Redis cache

4. Lambda returns data based on the supplied criteria
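Steps 2 to 4 could be sketched as the handler below. It assumes the API Gateway Lambda proxy integration, where query string parameters arrive under `queryStringParameters` and the function returns a `statusCode` and `body`. The `id` parameter, the `post:` key prefix, and the `make_handler` factory are hypothetical; the cache argument is any client with a `get` method, such as a Redis client created at module scope.

```python
import json


def make_handler(cache):
    """Build a Lambda handler bound to the given cache client."""

    def handler(event, context):
        # API Gateway sends None here when no query string is present.
        params = event.get("queryStringParameters") or {}
        post_id = params.get("id")  # hypothetical parameter name
        if not post_id:
            return {"statusCode": 400, "body": json.dumps({"error": "missing id"})}

        cached = cache.get(f"post:{post_id}")
        if cached is None:
            return {"statusCode": 404, "body": json.dumps({"error": "not found"})}

        # Redis clients commonly return bytes; normalize to text.
        if isinstance(cached, bytes):
            cached = cached.decode("utf-8")
        return {"statusCode": 200, "body": cached}

    return handler
```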


Additional best practices

Closing out: additional best practices
• Local caching
  • Instantiate AWS clients and database connections outside the event handler for connection re-use
  • Initialization code is executed once per function container, before the handler is called for the first time
  • Connection re-use on frequent invocations will reduce latency
  • Files stored in /tmp space (512 MB) persist when the container is re-used
  • Schedule a function to keep it warm
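The /tmp point can be illustrated with a small file cache. This is a sketch under assumptions: `load_reference_data`, the `CACHE_DIR` environment variable, and the `fetch` callback are hypothetical names, and the directory falls back to the system temp directory, which is normally /tmp on Linux. The file is only present on later invocations if the same container is re-used, so the code must always be able to fetch again.

```python
import os
import tempfile

# Typically /tmp on Linux; overridable for local testing.
CACHE_DIR = os.environ.get("CACHE_DIR", tempfile.gettempdir())


def load_reference_data(name: str, fetch) -> bytes:
    """Return cached bytes from the temp directory, fetching on a miss.

    `fetch` is a zero-argument callable that downloads or computes the
    data, for example a function that reads an object from Amazon S3.
    """
    path = os.path.join(CACHE_DIR, name)
    if os.path.exists(path):
        with open(path, "rb") as cached_file:
            return cached_file.read()

    data = fetch()
    with open(path, "wb") as cached_file:
        cached_file.write(data)
    return data
```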

Closing out: additional best practices
• Retries and event ordering
  • Lambda function called synchronously
    • Using the AWS SDK? Set retry logic there
    • Direct RESTful call to Lambda? The client controls retries entirely
    • Ordering is up to the caller
  • Amazon S3 or SNS triggers the Lambda function, or asynchronous calls
    • 3 tries in total, then the event is discarded
    • Loosely ordered
    • Dead Letter Queue (new feature): let the function fail; Lambda drops the event and puts it on an SQS queue or SNS topic for retries
  • Lambda polls Amazon Kinesis or an Amazon DynamoDB update stream
    • Attempts to process the batch of records until the data expires from the source stream; ordering is preserved
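For the synchronous case where the client owns the retries, a client-side wrapper might look like the sketch below. The exponential backoff, the default attempt count, and the `call_with_retries` name are assumptions for illustration; `invoke` stands for whatever performs the direct call to the function.

```python
import time


def call_with_retries(invoke, max_attempts=3, base_delay=0.2, sleep=time.sleep):
    """Call `invoke` until it succeeds or the attempts run out.

    Waits base_delay, then 2x, then 4x, and so on between attempts.
    The last error is re-raised if every attempt fails.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return invoke()
        except Exception as error:  # narrow this to retryable errors in practice
            last_error = error
            if attempt < max_attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

Since ordering is up to the caller in this mode, a client that retries should also make sure the function is safe to call more than once with the same request.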


Thank You

Additional reference:

AWS Serverless Multi-tier Architectures - https://d0.awsstatic.com/whitepapers/AWS_Serverless_Multi-Tier_Architectures.pdf

AWS Lambda - https://aws.amazon.com/lambda/

Serverless Computing - https://aws.amazon.com/serverless/