
AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)


Page 1: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Michael Labib, Specialist Solutions Architect, AWS

Brian Kaiser, CTO, Hudl

November 29, 2016

DAT306

Amazon ElastiCache Deep Dive: Best Practices and Usage Patterns

Page 2: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

What to Expect from the Session

• Why we’re here – In-Memory Data Stores

• Amazon ElastiCache Overview

• Usage Patterns

• Scale with Redis Cluster

• Best Practices

• Hudl Presentation

Page 3: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

In-Memory Data Stores

Page 4: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Why we’re here

Amazon

ElastiCache

µs are the new ms

Page 5: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

In-Memory Key-Value Store

High-performance

Redis and Memcached

Fully managed; Zero admin

Highly Available and Reliable

Hardened by Amazon

Amazon

ElastiCache

Page 6: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

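[Figure: AWS data stores positioned along four axes: request rate (high to low), latency (low to high), structure (low to high) and data volume (low to high). Services shown: Amazon ElastiCache, Amazon DynamoDB, Amazon RDS, Amazon CloudSearch and Amazon Elasticsearch Service, HDFS, Amazon S3, Amazon Glacier]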

Page 7: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Memcached – Fast Caching

Slab allocator

In-memory key-value datastore

Supports strings, objects

Multi-threaded

Insanely fast!

Very established

No persistence

Open Source

Easy to Scale

Page 8: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis – The In-Memory Leader

• Powerful: ~200 commands + Lua scripting

• In-memory data structure server

• Utility data structures: strings, lists, hashes, sets, sorted sets, bitmaps & HyperLogLogs

• Simple

• Atomic operations: supports transactions

• Ridiculously fast: <1 ms latency for most commands

• Highly available: replication

• Persistence

• Open source

Page 9: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis Data Types - String

• Binary safe.

• Can contain a max value of 512 MB.

• Great for storing Counters, HTML, Images, JSON objects, etc.

[Figure: a key mapped to a single value]

Page 10: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis Data Types - Set

• A collection of unique, unordered String values

• Great for deduplicating and grouping related information

[Figure: a key holding the set members 75, 1, 39 and 63; a second 63 is rejected as a duplicate]

Page 11: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis Data Types - Sorted Set

• A collection of unique String values ordered by score

• Great for deduplicating, grouping and sorting related information

[Figure: a key holding members ordered by score: mike (score: 50), dan (score: 75), emma (score: 79), lina (score: 123), luke (score: 350)]

Page 12: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis Data Types - List

• A collection of Strings stored in the order of their insertion

• Push and pop from the head or tail of the list

• Great for message queues and timelines

[Figure: a key holding a list: HEAD, value 1, value 2, value 3, TAIL]

Page 13: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis Data Types - Hashes

• A collection of unordered fields and values

• Great for representing objects

• Ability to add, GET, and DEL individual fields by key

[Figure: a key holding four field/value pairs: Field 1 = value 1, Field 2 = value 2, Field 3 = value 3, Field 4 = value 4]

Page 14: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Memcached vs. Redis

Comparison criteria (columns: Redis, Memcached):

• Simple cache to offload database pressure and lower latency

• Atomic counter support

• Data sharding (supported in Redis 3.x)

• Need support for advanced data types such as Lists, Sets, Hashes

• Multi-threaded architecture (takes full advantage of all CPU cores)

• Need ability to auto-sort data to support ranking or leaderboards

• Need Pub/Sub capabilities

• High availability and failover

• Persistence

• Data volume max size: Redis 3.5 TiB; Memcached 4.7 TiB+

• Max key/value size: Redis 512 MB key, 512 MB value; Memcached 256 bytes key, 1 MB value

Page 16: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Amazon ElastiCache

Page 17: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

ElastiCache - Customer Value

• Redis Multi-AZ with automatic failover

• Open-source compatible

• Fully managed

• Enhanced Redis engine

• Easy to deploy, use and monitor

• No cross-AZ data transfer costs

• Extreme performance at cloud scale

Page 18: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Enhanced Redis Engine – Hardened by Amazon

Optimized swap memory

• Mitigates the risk of increased swap usage during syncs and snapshots.

Dynamic write throttling

• Improved output buffer management when the node's memory is close to being exhausted.

Smoother failovers

• Clusters recover faster as replicas avoid flushing their data to do a full re-sync with the primary.

Page 19: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Usage Patterns

Page 20: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Caching

• Better performance: microsecond speed

• Cost effective

• Higher throughput: ~20M RPS

[Figure: clients, Elastic Load Balancing, Amazon EC2 and AWS Lambda; cache reads/writes go to Amazon ElastiCache, DB reads/writes go to Amazon DynamoDB and Amazon RDS]

Page 21: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Caching

# Write Through
def save_user(user_id, values):
    record = db.query("update users ... where id = ?", user_id, values)
    cache.set(user_id, record, 300)  # TTL
    return record

# Lazy Load
def get_user(user_id):
    record = cache.get(user_id)
    if record is None:
        # Cache miss: read from the database, then populate the cache
        record = db.query("select * from users where id = ?", user_id)
        cache.set(user_id, record, 300)  # TTL
    return record

# App code
save_user(17, {"name": "Big Mike"})
user = get_user(17)

Page 22: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Caching

Write Through

1. Update DB

2. SET in cache

Lazy Load

1. GET from cache

2. If MISS, get from DB

3. Then SET in cache
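The two patterns can be tried without any infrastructure. The sketch below is only an illustration: `TTLCache` is a hypothetical in-memory stand-in for a cache client, a plain dict stands in for the database, and the `user:<id>` key format and `db_reads` counter are assumptions added to make the behavior observable.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a cache client (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl_seconds):
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries behave like a miss
            return None
        return value


class UserStore:
    def __init__(self, db, cache, ttl_seconds=300):
        self.db = db              # dict standing in for the database
        self.cache = cache
        self.ttl = ttl_seconds
        self.db_reads = 0         # counts lazy-load trips to the database

    @staticmethod
    def _key(user_id):
        return f"user:{user_id}"

    def save_user(self, user_id, values):
        # Write through: 1. update DB, 2. SET in cache.
        record = dict(self.db.get(user_id, {}), **values)
        self.db[user_id] = record
        self.cache.set(self._key(user_id), record, self.ttl)
        return record

    def get_user(self, user_id):
        # Lazy load: 1. GET from cache, 2. on a miss read the DB, 3. SET in cache.
        record = self.cache.get(self._key(user_id))
        if record is None:
            self.db_reads += 1
            record = self.db.get(user_id)
            if record is not None:
                self.cache.set(self._key(user_id), record, self.ttl)
        return record
```

A read straight after `save_user` is served from the cache; once the TTL has passed, the next read goes to the database once and repopulates the cache.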

Page 23: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Session Caching: PHP Example

For situations where you need an external session store

• Especially needed when using Auto Scaling groups (ASGs)

• Cache is optimal for high-volume reads

1) Install PHP, Apache and the PHP memcache client

e.g. yum install php apache php-pecl-memcache

2) Configure "php.ini"

session.save_handler = memcache

session.save_path = "tcp://node1:11211, tcp://node2:11211"

3) Configure "php.d/memcache.ini"

memcache.hash_strategy = consistent

memcache.allow_failover = 1

memcache.session_redundancy=3*

4) Restart httpd

5) Begin using session data

https://github.com/mikelabib/elasticache-memcached-php-demo
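The `memcache.hash_strategy = consistent` setting matters because, with consistent hashing, losing or adding a node remaps only a fraction of the keys. The sketch below illustrates the general technique with a hash ring. It is not the PHP extension's actual algorithm, and the node names, the SHA-256 hash and the 100 points per node are assumptions chosen for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring (illustrative; not the PHP memcache algorithm)."""

    def __init__(self, nodes, points_per_node=100):
        ring = []
        for node in nodes:
            # Each node owns several points on the ring to even out the load.
            for i in range(points_per_node):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._hashes = [h for h, _ in ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


full = HashRing(["node1:11211", "node2:11211", "node3:11211"])
degraded = HashRing(["node1:11211", "node2:11211"])  # node3 has failed

sessions = [f"session-{i}" for i in range(1000)]
moved = [s for s in sessions if full.node_for(s) != degraded.node_for(s)]
print(f"{len(moved)} of {len(sessions)} sessions moved after losing node3")
```

Only the sessions that lived on the failed node move; everything on the surviving nodes stays where it was, which is why the setting pairs well with `memcache.allow_failover`.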

Page 24: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

IoT Device Data

[Figure: architecture with AWS IoT devices, AWS IoT, AWS Lambda, Amazon EC2, Amazon ElastiCache (hot data), Amazon DynamoDB (longer retention), Amazon Kinesis Firehose, and a data lake on Amazon S3 and Amazon Glacier (cold data)]

Page 25: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Lambda Trigger for IoT Rule

var redis = require("redis");

exports.handler = function(event, context) {
    var client = redis.createClient("redis://your-redis-endpoint:6379");
    // Assumption: the slide never defines `date`; the current time in
    // milliseconds is used here as the sorted-set score.
    var date = Date.now();
    var multi = client.multi();
    // Index the device by timestamp and store its latest readings in a hash.
    multi.zadd("SensorData", date, event.deviceId);
    multi.hmset(event.deviceId, "temperature", event.temperature,
                                "deviceIP", event.deviceIP,
                                "humidity", event.humidity,
                                "awsRequestId", context.awsRequestId);
    multi.exec(function (err, replies) {
        client.quit();  // close the connection on success and on failure
        if (err) {
            console.log('error updating event: ' + err);
            context.fail('error updating event: ' + err);
        } else {
            console.log('updated event ' + replies);
            context.succeed(replies);
        }
    });
}

[Figure: AWS IoT triggers AWS Lambda, which writes to Amazon ElastiCache]

Page 26: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Lambda Trigger for IoT Rule

Transaction block start (client.multi)

SET

• Sorted Set (multi.zadd)

• Hash (multi.hmset)

Transaction block end (multi.exec)

https://github.com/mikelabib/IoT-Sensor-Data-and-Amazon-ElastiCache

Page 27: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Streaming Data

[Figure: data sources feed Amazon Kinesis Streams; AWS Lambda and Amazon EC2 consume the stream and write hot data to Amazon ElastiCache and longer-retention data to Amazon DynamoDB]

Page 28: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Streaming Data Enrichment

[Figure: data sources feed Amazon Kinesis Streams; AWS Lambda uses Amazon ElastiCache to de-duplicate, aggregate, sort, enrich, etc., and emits a cleansed stream to Amazon Kinesis Streams and Amazon Kinesis Analytics]

Page 29: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Streaming Data Analytics

[Figure: data sources, Amazon Kinesis Streams, Amazon EMR (Spark Streaming) with the Spark Redis Connector, Amazon ElastiCache, Amazon EC2, Amazon Redshift, and a data lake on Amazon S3]

Page 30: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

ElastiCache Redis with Multi-AZ

Page 31: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

ElastiCache for Redis Multi-AZ

[Figure: an ElastiCache for Redis primary and two replicas spread across Availability Zone A and Availability Zone B]

• Writes: use the primary endpoint

• Reads: use the read replicas

• Auto-failover: chooses the replica with the lowest replication lag; the DNS endpoint stays the same

• Automatic failover to a read replica in case of primary node failure

• ElastiCache automates snapshots for persistence
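The read/write split can be sketched as a small router that hands out the primary endpoint for writes and rotates through the replicas for reads. This is only an illustration: the `EndpointRouter` class and the endpoint strings are hypothetical, and a real application would pass the chosen endpoint to its Redis client.

```python
import itertools


class EndpointRouter:
    """Chooses an endpoint per operation: primary for writes, replicas for reads."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = list(replicas)
        # Round-robin iterator over the replicas, if there are any.
        self._cycle = itertools.cycle(self._replicas) if self._replicas else None

    def for_write(self):
        # All writes go to the primary endpoint, which keeps the same DNS
        # name after a failover.
        return self.primary

    def for_read(self):
        # Spread reads across replicas; fall back to the primary if none exist.
        if self._cycle is None:
            return self.primary
        return next(self._cycle)


# Hypothetical endpoint names, for illustration only.
router = EndpointRouter(
    primary="my-group-primary.example.internal:6379",
    replicas=[
        "my-group-replica-a.example.internal:6379",
        "my-group-replica-b.example.internal:6379",
    ],
)
print(router.for_write())
print(router.for_read())
print(router.for_read())
```

Round-robin is the simplest choice; reads from replicas may lag slightly behind the primary, so it suits data that tolerates brief staleness.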

Page 32: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

ElastiCache with Redis Multi-AZ

[Figure, shown in three animation steps: a Region with Availability Zone A and Availability Zone B; an Auto Scaling group spans both zones, and an ElastiCache cluster has its primary and read replica in different zones]

Page 35: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Get ReplicationGroup Replica endpoints

public List<String> getReplicationGroupEndpoints(String replicationGroupId) {
    List<String> replicaEndpoints = new ArrayList<String>();
    if (replicationGroupId != null) {
        try {
            DescribeReplicationGroupsRequest request = new DescribeReplicationGroupsRequest();
            request.setReplicationGroupId(replicationGroupId);
            DescribeReplicationGroupsResult result = elastiCacheClient.describeReplicationGroups(request);
            Object[] nodeMembers;
            if (result != null) {
                for (ReplicationGroup replicationGroup : result.getReplicationGroups()) {
                    for (NodeGroup node : replicationGroup.getNodeGroups()) {
                        nodeMembers = node.getNodeGroupMembers().toArray();
                        for (int i = 0; i < nodeMembers.length; i++) {
                            String nodeDescriptions = nodeMembers[i].toString();
                            if (nodeDescriptions.contains("replica")) { …

Page 36: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Get ReplicationGroup Replica endpoints

The replica endpoints are discovered with the DescribeReplicationGroups API call.

https://github.com/mikelabib/ElastiCacheRedisLoadBalancer

Page 37: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

What’s New!

Page 38: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

New - October 2016: Redis 3.2 Support

Features

• Horizontal scale of up to 3.5 TiB per cluster

• Up to 20 million reads per second

• Up to 4.5 million writes per second

• Enhanced Redis engine within ElastiCache

• Up to 4x faster failover than with Redis 2.8

• Cluster-level backup and restore

• Fully supported by AWS CloudFormation

• Available in all AWS Regions

Page 39: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Geospatial Commands

• GEOADD locations 87.6298 41.8781 chicago

• GEOADD locations 122.3321 47.6062 seattle

• ZRANGE locations 0 -1

1) "chicago"

2) "seattle"

• GEODIST locations chicago seattle mi

"1733.4089"

• GEORADIUS locations 122.4194 37.7749 1000 mi WITHDIST

1) 1) "seattle"

   2) "679.4848"

• GEOPOS locations chicago

1) 1) "87.62979894876480103"

   2) "41.87809901914020116"

• GEORADIUSBYMEMBER locations chicago 2000 mi WITHDIST

1) 1) "chicago"

   2) "0.0000"

2) 1) "seattle"

   2) "1733.4089"

• GEOHASH locations chicago

• ZREM locations seattle
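The distances returned by GEODIST and GEORADIUS can be sanity-checked offline with a great-circle (haversine) calculation. The sketch below is an approximation, not the Redis implementation: the Earth radius and the meters-per-mile factor are assumptions taken from the open-source Redis geo code as best recalled, and the coordinates are the ones used in the commands above.

```python
import math

# Assumed constants (believed to match the open-source Redis geo code).
EARTH_RADIUS_M = 6372797.560856
METERS_PER_MILE = 1609.34


def geodist_miles(lon1, lat1, lon2, lat2):
    """Great-circle distance in miles between two (longitude, latitude) points."""
    lat1_r, lat2_r = math.radians(lat1), math.radians(lat2)
    dlat = lat2_r - lat1_r
    dlon = math.radians(lon2 - lon1)
    # Haversine formula.
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1_r) * math.cos(lat2_r) * math.sin(dlon / 2) ** 2)
    meters = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    return meters / METERS_PER_MILE


# Same argument order as GEOADD: longitude first, then latitude.
chicago = (87.6298, 41.8781)
seattle = (122.3321, 47.6062)
print(round(geodist_miles(*chicago, *seattle), 1))
```

The printed value should land close to the "1733.4089" miles shown for `GEODIST locations chicago seattle mi`.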

Page 40: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Scaling with Redis Cluster

Page 41: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Setting up Redis Cluster - Console

Cluster Mode

Page 42: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis Cluster – Automatic Client-Side Sharding

[Figure: a client connected to five shards, S1 to S5]

• 16384 hash slots per cluster

• The slot for a key is CRC16(key) modulo 16384

• Slots are distributed across the cluster into shards

• Developers must use a Redis cluster client!

• Clients are redirected to the correct shard

• Smart clients store a map of slots to shards

Shard S1 = slots 0 – 3276

Shard S2 = slots 3277 – 6553

Shard S3 = slots 6554 – 9829

Shard S4 = slots 9830 – 13106

Shard S5 = slots 13107 – 16383
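The slot calculation and the slot-to-shard lookup that a smart client performs can be sketched in a few lines. This is a minimal illustration, assuming the CRC16 variant is the XMODEM one that Python's `binascii.crc_hqx` computes with an initial value of 0, and including the hash-tag rule from the Redis Cluster specification (only the text between the first `{` and the next `}` is hashed when it is non-empty). The `SLIDE_SHARDS` map simply copies the five ranges listed above.

```python
import binascii

HASH_SLOTS = 16384

# Slot ranges copied from the five-shard example above.
SLIDE_SHARDS = {
    "S1": (0, 3276),
    "S2": (3277, 6553),
    "S3": (6554, 9829),
    "S4": (9830, 13106),
    "S5": (13107, 16383),
}


def key_slot(key):
    """Return the hash slot (0..16383) for a key."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        if end > start + 1:
            # Non-empty hash tag: hash only the tag so related keys co-locate.
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


def shard_for_slot(slot, shard_ranges):
    """Look up which shard owns a slot, as a smart client's slot map would."""
    for name, (low, high) in shard_ranges.items():
        if low <= slot <= high:
            return name
    raise ValueError(f"slot {slot} is not covered by any shard")


slot = key_slot("foo")
print(slot, shard_for_slot(slot, SLIDE_SHARDS))
```

Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, map to the same slot, which is what makes multi-key operations on them possible in cluster mode.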

Page 43: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis Cluster – Architecture

Redis Cluster – Multi-AZ: a cluster consists of 1 to 15 shards

[Figure: a Redis Cluster spanning Availability Zones A, B and C, with three nodes for each slot range: slots 0 – 5454, slots 5455 – 10909, and slots 10910 – 16383]

Page 44: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis Cluster – Architecture

Each shard has a primary node and up to 5 replica nodes.

[Figure, shown in three animation steps: the cluster across Availability Zones A, B and C, with each shard highlighted in turn (slots 0 – 5454, slots 5455 – 10909, slots 10910 – 16383); every shard has one primary and two replicas, and the primaries sit in different zones]

Page 47: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Setting up Redis Cluster - Console

Cluster Name

Page 48: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Setting up Redis Cluster - Console

Redis Version

Page 49: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Setting up Redis Cluster - Console

Instance

Page 50: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Setting up Redis Cluster - Console

# of Shards

Page 51: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Setting up Redis Cluster - Console

# of Replicas

Page 52: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Slots Distribution

Setting up Redis Cluster - Console

Page 53: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Select AZs

Setting up Redis Cluster - Console

Page 54: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis Failure Scenarios

Page 55: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Scenario 1: Single Primary Shard Failure

[Figure: a Redis Cluster across Availability Zones A, B and C with three shards (slots 0 – 5454, slots 5455 – 10909, slots 10910 – 16383); the primary of one shard fails]

Mitigation:

1. Promote read replica node

2. Repair failed node

Page 57: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Scenario 2: Two Primary Shards Fail

[Figure: a Redis Cluster across Availability Zones A, B and C with three shards (slots 0 – 5454, slots 5455 – 10909, slots 10910 – 16383); the primaries of two shards fail]

Mitigation: Redis enhancements on ElastiCache

• Promote read replica nodes

• Repair failed nodes

Page 59: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Migrating to a Cluster

1. Create new Cluster

2. Make snapshot of old CacheCluster

3. Restore snapshot to new Cluster

4. Update Client

5. Terminate old Cluster

[Figure: a client pointed at an old cluster (engine version < 3.2) is repointed to a new cluster with shards S1 to S5]

Page 60: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Enhanced CloudFormation

• Support for clusters

• Delete policy: set as Snapshot

  • Take one last backup before deleting

• Replication group tagging

• Replication group: add more replicas

• User-defined resource identifiers

  • Use cluster name, replication group ID and subnet group name to identify appropriate resources by assigning a Physical Resource Identifier

Page 61: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

CloudFormation: Infrastructure as Code

{
  "AWSTemplateFormatVersion" : "2010-09-09",
  "Description" : "Test template for ReplicationGroup",
  "Resources" : {
    "BasicReplicationGroup" : {
      "Type" : "AWS::ElastiCache::ReplicationGroup",
      "Properties" : {
        "AutomaticFailoverEnabled" : true,
        "AutoMinorVersionUpgrade" : true,
        "CacheNodeType" : "cache.r3.large",
        "CacheSubnetGroupName" : { "Ref" : "CacheSubnetGroup" },
        "Engine" : "redis",
        "EngineVersion" : "3.2",
        "NumNodeGroups" : "2",
        "ReplicasPerNodeGroup" : "2",
        "Port" : 6379,
        "PreferredMaintenanceWindow" : "sun:05:00-sun:09:00",
        "ReplicationGroupDescription" : "CFN RG test",
        "SecurityGroupIds" : [
          { "Ref" : "RGSG" }
        ],
        "SnapshotRetentionLimit" : 5,
        "SnapshotWindow" : "10:00-12:00",
        "

[Figure: an AWS CloudFormation template is given to AWS CloudFormation, which provisions Amazon ElastiCache]

Page 62: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Best Practices

Page 63: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Redis

• Avoid very short key names. Lengthening a name does add bytes, but predictable key names simplify app development.

• Create a logical schema such as [Object]:[value]. Use a colon rather than "." or "-".

• Hashes, Lists and Sets are encoded to be much more efficient. Use them!

• Avoid many small String values, given the per-key overhead of the data type. Use Hashes instead.

• Avoid the "KEYS" command and other long-running commands.

• Max key size, max value size = 512 MB

• Lists, Sets, Hashes: max size = 2^32 - 1 elements
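A small helper makes the colon-delimited naming schema hard to get wrong. This is only a sketch: `make_key` and the `user`/`followers` names are hypothetical, and rejecting embedded colons is one possible convention rather than a Redis requirement.

```python
def make_key(object_type, object_id, *parts):
    """Build a predictable, colon-delimited key such as 'user:1000:followers'."""
    segments = [str(object_type), str(object_id)] + [str(p) for p in parts]
    for segment in segments:
        # Keep the delimiter unambiguous: no empty segments, no embedded colons.
        if not segment or ":" in segment:
            raise ValueError(f"invalid key segment: {segment!r}")
    return ":".join(segments)


print(make_key("user", 1000))
print(make_key("user", 1000, "followers"))
```

With keys built this way, several small attributes of one object can live as fields of a single Hash stored under `user:1000` instead of as separate String keys.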

Page 64: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Architecting for Availability

• Upgrade to the latest engine version – 3.2.4

• Set reserved-memory to 30% of total available memory

• Swap usage should be zero or very low. Scale if not.

• Put read-replicas in a different AZ from the primary

• For important workloads use 2 read replicas per primary

• Write to the primary, read from the read-replicas

• Take snapshots from read-replicas

• For Redis Cluster, use an odd number of shards.

Page 65: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Monitoring Your Cluster

Page 66: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Key ElastiCache CloudWatch Metrics

• CPUUtilization

  • Memcached: up to 90% is OK

  • Redis: divide the threshold by the number of cores (e.g. 90% / 4 = 22.5%), because Redis processes commands on a single core

• SwapUsage: low

• CacheMisses / CacheHits ratio: low and stable

• Evictions: near zero

  • Exception: Russian doll caching

• CurrConnections: stable

• Set up alarms with CloudWatch metrics

Whitepaper: http://bit.ly/elasticache-whitepaper
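The divide-by-cores rule for the Redis CPUUtilization alarm is simple arithmetic; a sketch follows. The function name is hypothetical, and the 90% single-core limit and the 4-core node are the example figures from the slide.

```python
def redis_cpu_alarm_threshold(single_core_limit_pct, core_count):
    """Host-level CPUUtilization (%) at which one Redis core is at its limit."""
    if core_count < 1:
        raise ValueError("core_count must be at least 1")
    # Redis command processing uses one core, so a host-wide percentage
    # understates how busy that core is by a factor of core_count.
    return single_core_limit_pct / core_count


print(redis_cpu_alarm_threshold(90, 4))  # the slide's example: 90% / 4
```

On a 4-core node, a host-level reading of 22.5% therefore already means the Redis core is about 90% busy.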

Page 67: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

ElastiCache Modifiable Parameters

• maxclients: 65000 (unchangeable)

  • Use connection pooling

  • timeout: closes a connection after it has been idle for a given interval

  • tcp-keepalive: detects dead peers at a given interval

• databases: 16 (default)

  • Logical partition

• reserved-memory: 0 (default)

  • Recommended: 50% of maxmemory before engine version 2.8.22; 30% from 2.8.22 on ElastiCache

• maxmemory-policy:

  • The eviction policy for keys when maximum memory usage is reached

  • Possible values: volatile-lru, allkeys-lru, volatile-random, allkeys-random, volatile-ttl, noeviction
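The reserved-memory recommendation can be turned into a concrete byte value for a parameter group. The sketch below is illustrative only: the function name is hypothetical, the example `maxmemory` figure is made up, and treating 2.8.22 itself as part of the 30% group is an assumption about where the slide draws the line.

```python
def recommended_reserved_memory(maxmemory_bytes, engine_version):
    """Return the recommended reserved-memory value in bytes."""
    version = tuple(int(part) for part in engine_version.split("."))
    # Assumption: 2.8.22 and later use 30%; earlier versions use 50%.
    percent = 30 if version >= (2, 8, 22) else 50
    # Integer arithmetic avoids floating-point rounding surprises.
    return maxmemory_bytes * percent // 100


example_maxmemory = 10 * 1024 ** 3  # hypothetical 10 GiB node
print(recommended_reserved_memory(example_maxmemory, "3.2.4"))
print(recommended_reserved_memory(example_maxmemory, "2.8.19"))
```

Whatever is reserved is unavailable for data, so the usable capacity of the hypothetical node above is roughly 7 GiB on a current engine and 5 GiB on an older one.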

Page 68: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Session Recap

• Amazon ElastiCache provides the performance needed for demanding real-time applications

• With a few lines of code, you can power your applications with an In-Memory datastore

• Redis Cluster allows you to scale to terabytes of data and support millions of IOPS

Page 69: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Brian Kaiser, CTO

11/29/2016

ElastiCache @ Hudl

Page 70: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)
Page 71: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)
Page 72: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)
Page 73: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

130k teams

Page 74: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

4.5M active users

Page 75: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

> 2B videos on S3

Page 76: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

35 hr/min of video

Page 77: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

15k API requests/sec

Page 78: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)
Page 79: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

[Figure: an ELB in front of a routing layer and a web Auto Scaling group; a Squad Cluster with MongoDB in AZ #1, AZ #2 and AZ #3; supporting services]

Page 80: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Couchbase/Memcached

Page 81: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)
Page 82: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

public async Task<TResult> Get<TResult>(string key) where TResult : class
{
    // Feature flag: behave like a cache miss when Redis is switched off.
    if (!_redisEnabled.Value)
    {
        return default(TResult);
    }

    var value = await _connection.Database.StringGetAsync(key);
    if (!value.HasValue || value.IsNull)
    {
        return default(TResult);
    }

    return _serializer.Deserialize<TResult>(value);
}

Page 83: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

public async Task Put(string key, object item, TimeSpan ttl)
{
    // Skip the write when Redis is switched off or the key is unusable.
    if (!_redisEnabled.Value || string.IsNullOrWhiteSpace(key))
    {
        return;
    }

    var data = _serializer.Serialize(item);
    await _connection.Database.StringSetAsync(key, data, ttl);
}

Page 84: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

public async Task<TResult> GetAndPut<TResult>(string key, TimeSpan ttl, Func<TResult> valueAccessor)
    where TResult : class
{
    // With Redis switched off, always go to the source of truth.
    if (!_redisEnabled.Value)
    {
        return valueAccessor();
    }

    var cachedValue = await Get<TResult>(key);
    if (cachedValue != null)
    {
        return cachedValue;
    }

    // Cache miss: load the value, store it with a TTL, then return it.
    cachedValue = valueAccessor();
    await Put(key, cachedValue, ttl);
    return cachedValue;
}

Page 85: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Basic Object Caching Examples

• Auth Token

• User information

• Team Information

Page 86: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

The Feed

Page 87: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)
Page 88: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

http://amzn.to/2fGS9nx

Page 89: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Distributed Locking

[Figure: workers coordinate through ElastiCache while working with two S3 buckets and MongoDB]

Page 90: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

ElastiCache

Page 91: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)
Page 92: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

[Figure: a routing layer in front of a Squad Cluster that has an Auto Scaling group and MongoDB in each of AZ #1, AZ #2 and AZ #3, plus ElastiCache with a primary and two replicas]

Page 93: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

ElastiCache – Redis Cluster

Page 94: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

ElastiCache – Redis Cluster

Page 95: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Some best practices

• Always Multi-AZ Replicas

• Set up predictive alerts

• Understand Eviction Policies

• Learn Redis data structures and Big O complexity

Page 96: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Thank you!

Page 97: AWS re:Invent 2016: ElastiCache Deep Dive: Best Practices and Usage Patterns (DAT306)

Remember to complete your evaluations!