AWS September Webinar Series - Getting Started with DynamoDB Streams

dynamodb-pm@

Event Driven Computing Enabled by DynamoDB Streams

Launch update on Cross-region replication and Database Triggers with AWS Lambda Integration

© 2015 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.

• What are DynamoDB Streams?

• How to build with DynamoDB Streams

• Why DynamoDB Streams?

What are DynamoDB Streams?

• Time ordered and partitioned change log

• Provides a stream of updates, inserts, deletes

• Each change is guaranteed to appear in the stream exactly once

• Use Kinesis Client Library (KCL), Lambda, or API to query pre-image, post-image, key, timestamp

• Scales with your table

Use Cases for DynamoDB Streams

Cross-region replication

[Diagram: DynamoDB Streams drives cross-region replication from a table in US East (N. Virginia) to replicas in Asia Pacific (Tokyo) and EU (Ireland)]

Cross region replication library

• Bootstrapping

• Horizontal scaling with KCL Workers

• Load balancing

• Fault tolerance with checkpointing

Cross region replication

[Diagram: a DynamoDB table in US East (N. Virginia) with Partitions 1 and 2; its DynamoDB Stream has Shards 1 to 3, each read by a KCL Worker that writes to the table in EU (Ireland)]

Cross region replication

[Diagram: the same setup after the table grows to Partitions 1 to 5; the DynamoDB Stream now has Shards 1 to 4, read by four KCL Workers that write to the table in EU (Ireland)]

Consuming Streams (KCL)

    // Adapter that lets the KCL read a DynamoDB stream through the Kinesis interface
    AmazonDynamoDBStreamsAdapterClient adapterClient =
        new AmazonDynamoDBStreamsAdapterClient(updateStreamsCredentials, .. );
    ..

    // DynamoDB client passed to the KCL Worker below
    AmazonDynamoDBClient dynamoDBClient = new AmazonDynamoDBClient(dynamoDBCredentials, ..);
    ..

    KinesisClientLibConfiguration workerConfig =
        new KinesisClientLibConfiguration(.., streamId, updateStreamsCredentials, ..)
            .withMaxRecords(100)
            // Start from the oldest record still available in the stream
            .withInitialPositionInStream(InitialPositionInStream.TRIM_HORIZON);

    Worker worker = new Worker(recordProcessorFactory, workerConfig, adapterClient, dynamoDBClient, ..);

    // The Worker is a Runnable; run it on its own thread
    Thread t = new Thread(worker);
    t.start();

Full code available online

Processing streams (KCL)

    public class StreamsRecordProcessor implements IRecordProcessor {
        ..
        @Override
        public void processRecords(List<Record> records, .. ) {
            for (Record record : records) {
                if (record instanceof RecordAdapter) {
                    // Unwrap the Kinesis-style record to get the DynamoDB Streams record
                    com.amazonaws.services.dynamodbv2.model.Record ddbStreamRecord =
                        ((RecordAdapter) record).getInternalObject();

                    switch (ddbStreamRecord.getEventName()) {
                        case "INSERT":
                        case "MODIFY":
                            // Write the new item image to the destination table
                            DemoHelper.putItem(dynamoDBClient, tableName,
                                ddbStreamRecord.getDynamodb().getNewImage());
                            break;
                        case "REMOVE":
                            // Delete the item by its numeric "Id" key
                            DemoHelper.deleteItem(dynamoDBClient, tableName,
                                ddbStreamRecord.getDynamodb().getKeys().get("Id").getN());
                            break;
                    }
                    ...

DynamoDB Triggers

• Trigger AWS Lambda functions

• Example: Validate address, send notifications

DynamoDB Streams and AWS Lambda

[Diagram: triggers on a DynamoDB stream invoke Lambda functions that notify on changes, maintain aggregate tables, and update external views such as CloudSearch and ElastiCache]

Real-Time Voting

Write-heavy items

Requirements for voting

• Allow each person to vote only once

• No changing votes

• Real-time aggregation

• Voter analytics, demographics

Real-time voting architecture

[Diagram: Voters use the Voting App, which writes to the RawVotes Table and the AggregateVotes Table]

Scaling bottlenecks

[Diagram: Voters send 50,000/sec and 70,000/sec of writes to the Candidate A and Candidate B items in the Votes Table. The table is provisioned with 200,000 WCUs, but each partition (Partition 1 ... K ... M ... N) has only 1000 WCUs]

Write sharding

[Diagram: a Voter writes to the Votes Table, where each candidate is split across several items: Candidate A_1 to Candidate A_8 and Candidate B_1 to Candidate B_8]

Write sharding

UpdateItem: "CandidateA_" + rand(0, 10)

ADD 1 to Votes

[Diagram: the Voter's write lands on a randomly chosen item among Candidate A_1 to Candidate A_8 and Candidate B_1 to Candidate B_8 in the Votes Table]
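The UpdateItem call on this slide can be sketched in plain Java. This is a minimal sketch, not SDK code: a HashMap stands in for the Votes table, the class WriteShardingSketch and its methods are hypothetical, and it assumes ten shard items per candidate, numbered 1 to 10.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Sketch of write sharding. The HashMap is a stand-in for the Votes table;
// nothing here is an AWS SDK call.
class WriteShardingSketch {
    static final int SHARDS = 10; // assumed number of shard items per candidate

    // Stand-in for the Votes table: item key -> Votes attribute.
    final Map<String, Long> votesTable = new HashMap<>();
    final Random random;

    WriteShardingSketch(Random random) {
        this.random = random;
    }

    // Picks one of the candidate's shard items at random, e.g. "CandidateA_7".
    String shardKey(String candidate) {
        return candidate + "_" + (1 + random.nextInt(SHARDS));
    }

    // Plays the role of UpdateItem with "ADD 1 to Votes" on the chosen shard item.
    void recordVote(String candidate) {
        votesTable.merge(shardKey(candidate), 1L, Long::sum);
    }

    public static void main(String[] args) {
        WriteShardingSketch sketch = new WriteShardingSketch(new Random());
        for (int i = 0; i < 1000; i++) {
            sketch.recordVote("CandidateA");
        }
        // The 1000 votes are spread over up to ten items instead of one hot item.
        System.out.println(sketch.votesTable);
    }
}
```

Spreading the writes over several keys means no single item has to absorb the full write rate for a candidate; the cost is that reading a total requires summing the shard items.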

Shard aggregation

[Diagram: a periodic process reads the shard items Candidate A_1 to Candidate A_8 and Candidate B_1 to Candidate B_8 from the Votes Table that the Voter writes to]

Periodic process:

1. Sum

2. Store

Candidate A Total: 2.5M
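The two steps of the periodic process (sum, then store) might look like the sketch below. It is illustrative only: plain maps stand in for the Votes table and for wherever the total is stored, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the periodic shard aggregation. Maps stand in for DynamoDB tables.
class ShardAggregationSketch {

    // Step 1 (Sum): read each shard item "<candidate>_1" .. "<candidate>_<shardCount>"
    // and add up its Votes value. Missing shard items count as zero.
    static long sumShards(Map<String, Long> votesTable, String candidate, int shardCount) {
        long total = 0;
        for (int i = 1; i <= shardCount; i++) {
            total += votesTable.getOrDefault(candidate + "_" + i, 0L);
        }
        return total;
    }

    // Step 2 (Store): overwrite the candidate's total with the freshly computed sum.
    static void storeTotal(Map<String, Long> totals, String candidate, long total) {
        totals.put(candidate, total);
    }

    public static void main(String[] args) {
        Map<String, Long> votesTable = new HashMap<>();
        votesTable.put("A_1", 23L);
        votesTable.put("A_2", 25L);
        votesTable.put("B_1", 14L);
        votesTable.put("B_2", 12L);

        Map<String, Long> totals = new HashMap<>();
        storeTotal(totals, "A", sumShards(votesTable, "A", 8));
        storeTotal(totals, "B", sumShards(votesTable, "B", 8));
        System.out.println(totals);
    }
}
```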

Correctness in voting

RawVotes Table

UserId  Candidate  Date
Alice   A          2013-10-02
Bob     B          2013-10-02
Eve     B          2013-10-02
Chuck   A          2013-10-02

AggregateVotes Table

Segment  Votes
A_1      23
B_2      12
B_1      14
A_2      25

Voter:

1. Record vote and de-dupe; retry

2. Increment candidate counter
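A rough sketch of the two voter steps follows. It assumes the de-dupe is done with a write that only succeeds when no vote exists yet for the user; here `putIfAbsent` on an in-memory map plays that role. The class VoteRecorderSketch and its fields are hypothetical stand-ins for the two tables, not SDK code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of "record vote and de-dupe, then increment counter".
class VoteRecorderSketch {
    // Stand-in for the RawVotes table: UserId -> Candidate.
    final Map<String, String> rawVotes = new ConcurrentHashMap<>();
    // Stand-in for the AggregateVotes table: Segment -> Votes.
    final Map<String, Long> aggregateVotes = new ConcurrentHashMap<>();

    // Returns true if the vote was counted, false if the user had already voted.
    boolean vote(String userId, String candidate, String segment) {
        // Step 1: record the vote only if this user has no vote yet (de-dupe).
        if (rawVotes.putIfAbsent(userId, candidate) != null) {
            return false; // duplicate or changed vote: rejected, counter untouched
        }
        // Step 2: increment the counter for the chosen segment.
        aggregateVotes.merge(segment, 1L, Long::sum);
        return true;
    }

    public static void main(String[] args) {
        VoteRecorderSketch recorder = new VoteRecorderSketch();
        System.out.println(recorder.vote("Alice", "A", "A_1"));
        System.out.println(recorder.vote("Alice", "B", "B_1"));
        System.out.println(recorder.aggregateVotes);
    }
}
```

In this sketch the two steps are one in-memory call, but against real tables they would be two separate writes. If the second write fails after the first succeeds, the counter falls behind the raw votes, which appears to be the concern behind the next slide's question.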

Correctness in aggregation?

[Diagram: the same RawVotes Table and AggregateVotes Table contents as on the previous slide, with the Voter writing to both]

Real-time voting architecture (improved)

[Diagram: Voters use the Voting App, which writes to the RawVotes Table. The RawVotes DynamoDB Stream feeds your Amazon Kinesis-enabled app, which is connected to the AggregateVotes Table, Amazon Redshift, and Amazon EMR]

The same diagram is repeated on four slides, each highlighting one property:

• Handle any scale of election

• Vote only once, no changing votes

• Real-time, fault-tolerant, scalable aggregation

• Voter analytics, statistics

Analytics with DynamoDB Streams

• Collect and de-dupe data in DynamoDB

• Aggregate data in-memory and flush periodically

• Important when: Performing real-time aggregation and analytics
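The "aggregate in-memory and flush periodically" pattern can be sketched as below. This is an assumption-laden illustration: each stream record is reduced to the name of the candidate voted for, a map stands in for the aggregate table, and the class StreamAggregatorSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a stream consumer that batches counts in memory and flushes them.
class StreamAggregatorSketch {
    // Counts accumulated since the last flush.
    private final Map<String, Long> pending = new HashMap<>();
    // Stand-in for the aggregate table: candidate -> total votes.
    final Map<String, Long> aggregateTable = new HashMap<>();

    // Called for each batch of stream records; only touches memory.
    void processRecords(List<String> candidates) {
        for (String candidate : candidates) {
            pending.merge(candidate, 1L, Long::sum);
        }
    }

    // Called periodically: one write per candidate instead of one per vote.
    void flush() {
        pending.forEach((candidate, count) -> aggregateTable.merge(candidate, count, Long::sum));
        pending.clear();
    }

    public static void main(String[] args) {
        StreamAggregatorSketch aggregator = new StreamAggregatorSketch();
        aggregator.processRecords(Arrays.asList("A", "B", "A"));
        aggregator.flush();
        System.out.println(aggregator.aggregateTable);
    }
}
```

A real consumer would also need to checkpoint its stream position only after a successful flush, so that a crash does not lose the counts still held in memory.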

DynamoDB writes and Streams

Data: {Name:Destination}

Operation #  DynamoDB Operation    Data in DynamoDB  Data in Streams
1            PUT: {John:Tokyo}     {John:Tokyo}      PUT John Tokyo
2            UPDATE: {John:Mars}   {John:Mars}       UPDATE John Mars
3            UPDATE: {John:Pluto}  {John:Pluto}      UPDATE John Pluto

View types

View Type                  Destination
Old Image (before update)  Name = John, Destination = Mars
New Image (after update)   Name = John, Destination = Pluto
Old and New Images         Name = John, Destination = Mars; Name = John, Destination = Pluto
Keys Only                  Name = John
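To make the four view types concrete, the sketch below builds what a stream record would carry for the John example under each one. The enum is a local illustration that mirrors the view type names (KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, NEW_AND_OLD_IMAGES); it is not the SDK type, and the class and `project` method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of which images a stream record contains for each view type.
class ViewTypeSketch {
    // Local enum for illustration only.
    enum ViewType { KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, NEW_AND_OLD_IMAGES }

    static Map<String, Object> project(ViewType type,
                                       Map<String, String> keys,
                                       Map<String, String> oldImage,
                                       Map<String, String> newImage) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("Keys", keys); // the key is present for every view type
        if (type == ViewType.OLD_IMAGE || type == ViewType.NEW_AND_OLD_IMAGES) {
            record.put("OldImage", oldImage);
        }
        if (type == ViewType.NEW_IMAGE || type == ViewType.NEW_AND_OLD_IMAGES) {
            record.put("NewImage", newImage);
        }
        return record;
    }

    public static void main(String[] args) {
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("Name", "John");
        Map<String, String> oldImage = new LinkedHashMap<>(keys);
        oldImage.put("Destination", "Mars");
        Map<String, String> newImage = new LinkedHashMap<>(keys);
        newImage.put("Destination", "Pluto");

        for (ViewType type : ViewType.values()) {
            System.out.println(type + " -> " + project(type, keys, oldImage, newImage));
        }
    }
}
```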

Features of DynamoDB Streams

Streams Characteristics

• Each item update appears exactly once

• Records are strictly ordered by time

• Streams are asynchronous

Durability & high availability

• High throughput consensus protocol

• Replicated across multiple AZs

Managed Streams

• Simply enable streams

Elasticity

• Adjusts to table throughput

Performance

• Designed for sub-second latency

Durability

• Records available for 24 hours

How much does it cost?

• Free to turn it on

• First 2.5 million reads per month are free

• $0.20 per million reads after that
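As a worked example of the pricing on this slide, the sketch below computes a monthly bill from a read count. The constants are the figures quoted above; the class and method names are hypothetical, and the calculation assumes reads are billed proportionally beyond the free allowance.

```java
// Sketch of the Streams read pricing quoted on the slide.
class StreamsCostSketch {
    static final long FREE_READS_PER_MONTH = 2_500_000L;
    static final double PRICE_PER_MILLION_READS = 0.20; // USD

    // Cost in USD for the given number of stream reads in one month.
    static double monthlyCost(long reads) {
        long billable = Math.max(0L, reads - FREE_READS_PER_MONTH);
        return billable / 1_000_000.0 * PRICE_PER_MILLION_READS;
    }

    public static void main(String[] args) {
        // 12.5 million reads: 10 million are billable at $0.20 per million.
        System.out.println(monthlyCost(12_500_000L));
    }
}
```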

What Customers Are Saying

Mapbox

Problem: Making mapping data highly available, even faster.

DynamoDB Streams use case: Cross-region replication

In their own words: “DynamoDB Streams unlocks cross-region replication - a critical feature that enabled us to fully migrate to DynamoDB. Cross-region replication allows us to distribute data across the world for redundancy and speed.” - Jake Pruitt, Software Developer, Mapbox

TOKYU HANDS

Problem: Augmenting Point of Sale system to react in real time to inventory and customer data

DynamoDB Streams use case: DynamoDB Triggers (DynamoDB Streams + AWS Lambda)

In their own words: “TOKYU HANDS is running in-store Point Of Sales system backed by DynamoDB and various AWS services. We really like full-managed services such as DynamoDB. I believe DynamoDB Streams would help us making the system more sophisticated and more automated.” - Yamazaki-san, Cloud Architect, TOKYU HANDS

The local version of DynamoDB

• Desktop installable

• Development & testing

• Publicly available at DynamoDB.com

Now supports DynamoDB Streams

Cross-region replication app: http://tinyurl.com/DynamoDBCrossRegionReplication

Open-source cross-region replication library: http://tinyurl.com/DynamoDBReplicationLibrary

Thank you!
