AWS July Webinar Series - Getting Started with Amazon DynamoDB

Nate Slater, AWS Solutions Architect

July 30, 2015

Introduction to DynamoDB

Agenda

• What is DynamoDB?
• DynamoDB Fundamentals
• Typical Workloads and Use Cases
• Demo

What is DynamoDB?

DynamoDB is a fully managed, NoSQL document and key-value data store.

What is NoSQL?

NoSQL is a term to describe data stores that trade full ACID compliance for high availability and scale.

• Atomicity: single row/single item only
• Consistency: eventual consistency
• Isolation: dirty reads
• Durability: data replication on commodity storage

Why NoSQL?

• Dirty Reads?
• Eventual Consistency?
• Single row transactions only?
• Why would anybody trade ACID compliance for this?

NoSQL – Availability and Scale

[Diagram: Traditional SQL vs. NoSQL; labels: DB, Primary, Secondary, Scale Up, Scale Out]

Scale Up vs Scale Out

[Chart: Scale-Up vs. Scale-Out; labels: Scale-Up, Scale-Out, Complexity]

The CAP Theorem

Network partitions will happen in distributed systems:

Consistency

Availability

Partition Tolerance

Why NoSQL?

• Horizontal scaling allows for infinite scalability
• Cheaper to scale out than to scale up
• Full consistency or availability that can survive a network partition
• Full ACID compliance is often not needed

What is a Managed Service?

• A managed service is a web service in which consumers of the service never need to interact directly with the underlying compute, storage, and network resources.

Why use a Managed Service?

DynamoDB is a Managed Service

• AWS runs all the database infrastructure for you!
• All the benefits and none of the operational overhead of running a distributed system:
  • Infinitely scalable read and write I/O
  • High availability within a region
  • Data durably stored in 3 availability zones
  • Cross-region replication
  • Easily export data to S3
  • Triggers using Lambda functions
  • Tight integration with Kinesis, Lambda, EMR, and Redshift
  • Pay only for what you use, when you need it

DynamoDB Fundamentals

DynamoDB Table

[Diagram: a Table and its Attributes, with the Hash Key and Range Key marked]

Hash Key
• Mandatory
• Key-value access pattern
• Determines data distribution

Range Key
• Optional
• Model 1:N relationships
• Enables rich query capabilities:
  • All items for a hash key
  • ==, <, >, >=, <=
  • “begins with”
  • “between”
  • sorted results
  • counts
  • top/bottom N values
  • paged responses

Data types

String (S)

Number (N)

Binary (B)

String Set (SS)

Number Set (NS)

Binary Set (BS)

Boolean (BOOL)

Null (NULL)

List (L)

Map (M)

Used for storing nested JSON documents
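The type codes listed above are also the tags DynamoDB's low-level JSON format uses: each attribute value is wrapped in a one-key map naming its type. The following is a minimal sketch of that wrapping, not an SDK API. The helper names `to_attribute_value` and `_set_to_attribute_value` and the `Skills` attribute are hypothetical, and only a handful of Python types are covered.

```python
import base64
from decimal import Decimal


def to_attribute_value(value):
    """Wrap a Python value in a DynamoDB-style typed descriptor.

    Hypothetical helper for illustration only; real SDKs ship their own serializers.
    """
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, Decimal)):
        return {"N": str(value)}  # numbers are sent as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": base64.b64encode(value).decode("ascii")}
    if isinstance(value, (set, frozenset)):
        return _set_to_attribute_value(value)
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {key: to_attribute_value(v) for key, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def _set_to_attribute_value(values):
    """Map a homogeneous, non-empty set to SS, NS, or BS (sorted for stable output)."""
    if not values:
        raise ValueError("sets must not be empty")
    if all(isinstance(v, str) for v in values):
        return {"SS": sorted(values)}
    if all(isinstance(v, (int, Decimal)) and not isinstance(v, bool) for v in values):
        return {"NS": [str(v) for v in sorted(values)]}
    if all(isinstance(v, bytes) for v in values):
        return {"BS": [base64.b64encode(v).decode("ascii") for v in sorted(values)]}
    raise TypeError("set elements must all be strings, numbers, or bytes")


# Id, Name, and Dept come from the slides; "Skills" is made up to show a set.
item = {"Id": 2, "Name": "Andy", "Dept": "Engg", "Skills": {"ops", "dev"}}
typed_item = {name: to_attribute_value(v) for name, v in item.items()}
print(typed_item)
```

Nested documents fall out of the recursion: a Python dict becomes a Map (M) whose values are themselves typed, and a list becomes a List (L).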

Hash table

• Hash key uniquely identifies an item
• Hash key is used for building an unordered hash index
• Table can be partitioned for scale

Key Space (00 to FF, divided at 54/55 and A9/AA):
• Id = 1, Name = Jim: Hash(1) = 7B
• Id = 2, Name = Andy, Dept = Engg: Hash(2) = 48
• Id = 3, Name = Kim, Dept = Ops: Hash(3) = CD
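Reading the key-space marks on this slide as three ranges (00-54, 55-A9, AA-FF), the mapping from a hash value to a partition can be sketched as a sorted-boundary lookup. This is an illustration under that assumption. The hash values are copied from the slide rather than computed, since the deck does not describe DynamoDB's internal hash function, and `partition_for` is a hypothetical name.

```python
from bisect import bisect_left

# Upper bound of each partition's slice of the 00..FF key space,
# as read from the slide: 00-54, 55-A9, AA-FF.
PARTITION_UPPER_BOUNDS = [0x54, 0xA9, 0xFF]

# Hash values copied from the slide (Id -> hash), not computed here.
SLIDE_HASHES = {1: 0x7B, 2: 0x48, 3: 0xCD}


def partition_for(hash_value):
    """Return the 1-based partition whose key-space range contains hash_value."""
    if not 0x00 <= hash_value <= 0xFF:
        raise ValueError("hash value outside the 00..FF key space")
    # bisect_left finds the first upper bound that is >= hash_value.
    return bisect_left(PARTITION_UPPER_BOUNDS, hash_value) + 1


for item_id, hash_value in SLIDE_HASHES.items():
    print(f"Id = {item_id}: Hash = {hash_value:02X} -> Partition {partition_for(hash_value)}")
```

Because placement depends only on the hash of the key, items with consecutive Ids can land in different partitions, which is what spreads load across the table.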

Partitions are three-way replicated

[Diagram: the item Id = 1, Name = Jim is stored on Replica 1, Replica 2, and Replica 3; the same applies to Partition 1, Partition 2, ... Partition N]

Hash-range table

• Hash key and range key together uniquely identify an item.
• Within the unordered hash index, data is sorted by the range key.
• No limit on the number of items (∞) per hash key.
  • Unless you have local secondary indexes

Partition 1 (00:0 to 54:∞), Hash(2) = 48
• Customer# = 2, Order# = 10, Item = Pen
• Customer# = 2, Order# = 11, Item = Shoes

Partition 2 (55 to A9:∞), Hash(1) = 7B
• Customer# = 1, Order# = 10, Item = Toy
• Customer# = 1, Order# = 11, Item = Boots

Partition 3 (AA to FF:∞), Hash(3) = CD
• Customer# = 3, Order# = 10, Item = Book
• Customer# = 3, Order# = 11, Item = Paper
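The hash-range layout can be mimicked with a toy in-memory structure: one list per hash key, kept sorted by range key, so that range conditions become slice operations. This is a sketch for intuition only and does not use the DynamoDB API. The class `HashRangeTable` and its method names are hypothetical; the data is the customer/order example from the slide.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class HashRangeTable:
    """Toy model: for each hash key, a list of (range_key, item) sorted by range key."""

    def __init__(self):
        self._collections = defaultdict(list)

    def put(self, hash_key, range_key, item):
        rows = self._collections[hash_key]
        keys = [k for k, _ in rows]
        i = bisect_left(keys, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, item)  # same hash + range key: overwrite
        else:
            rows.insert(i, (range_key, item))  # keep the list sorted

    def query(self, hash_key, low=None, high=None, limit=None, descending=False):
        """Items for one hash key with low <= range_key <= high, in range-key order."""
        rows = self._collections.get(hash_key, [])
        keys = [k for k, _ in rows]
        start = 0 if low is None else bisect_left(keys, low)
        end = len(rows) if high is None else bisect_right(keys, high)
        selected = [item for _, item in rows[start:end]]
        if descending:
            selected.reverse()
        return selected if limit is None else selected[:limit]


orders = HashRangeTable()
for customer, order, item in [
    (2, 11, "Shoes"), (1, 10, "Toy"), (2, 10, "Pen"),
    (1, 11, "Boots"), (3, 10, "Book"), (3, 11, "Paper"),
]:
    orders.put(customer, order, item)

print(orders.query(2))                            # all items for hash key 2
print(orders.query(1, low=11, high=11))           # "between" on the range key
print(orders.query(3, limit=1, descending=True))  # top 1 by range key
```

Every query names exactly one hash key. The range key only narrows or orders results inside that item collection, which matches the query capabilities listed for the range key.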

Local Secondary Index (LSI)

alternate range key + same hash key

index and table data is co-located (same partition)

10 GB max per hash key, i.e. LSIs limit the # of range keys!

Global Secondary Index

any attribute indexed as new hash and/or range key

RCUs/WCUs provisioned separately for GSIs

Online indexing

LSI or GSI?

LSI can be modeled as a GSI

If data size in an item collection > 10 GB, use GSI

If eventual consistency is okay for your scenario, use GSI!
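The rule of thumb on this slide can be written as a small decision function. This sketch encodes only the two conditions stated here. The function name `suggest_index` is hypothetical, and real designs would weigh further factors the slide does not cover.

```python
LSI_ITEM_COLLECTION_LIMIT_GB = 10  # per-hash-key cap quoted on the LSI slide


def suggest_index(item_collection_gb, eventual_consistency_ok):
    """Apply the slide's rule of thumb; returns "GSI" or "LSI"."""
    if item_collection_gb > LSI_ITEM_COLLECTION_LIMIT_GB:
        return "GSI"  # the item collection would exceed the LSI size cap
    if eventual_consistency_ok:
        return "GSI"  # the slide recommends a GSI whenever eventual consistency is fine
    return "LSI"


print(suggest_index(item_collection_gb=25, eventual_consistency_ok=False))
print(suggest_index(item_collection_gb=2, eventual_consistency_ok=True))
print(suggest_index(item_collection_gb=2, eventual_consistency_ok=False))
```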

DynamoDB API
• CreateTable
• UpdateTable
• DeleteTable
• DescribeTable
• ListTables
• PutItem
• UpdateItem
• DeleteItem
• BatchWriteItem
• GetItem
• BatchGetItem

Stream API
• ListStreams
• DescribeStream
• GetShardIterator
• GetRecords

DynamoDB Streams and AWS Lambda
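As a rough illustration of the pattern, a Lambda function subscribed to a table's stream receives batches of change records and can react to each one. The handler below is a minimal sketch that only tallies records by event type. The sample event is hand-written for local testing with illustrative values, and the function name `handler` is a convention rather than a requirement.

```python
from collections import Counter


def handler(event, context):
    """Lambda entry point: tally DynamoDB stream records by event type."""
    counts = Counter()
    for record in event.get("Records", []):
        event_name = record["eventName"]   # INSERT, MODIFY, or REMOVE
        keys = record["dynamodb"]["Keys"]  # key attributes in typed form
        print(event_name, keys)
        counts[event_name] += 1
    return dict(counts)


# Hand-written sample event for local testing; values are illustrative.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"Id": {"N": "1"}},
                "NewImage": {"Id": {"N": "1"}, "Name": {"S": "Jim"}},
            },
        },
        {
            "eventName": "REMOVE",
            "dynamodb": {"Keys": {"Id": {"N": "2"}}},
        },
    ]
}

print(handler(sample_event, None))
```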

Emerging Architecture Pattern

Throughput

Provisioned at the table level
• Write capacity units (WCUs) are measured in 1 KB per second
• Read capacity units (RCUs) are measured in 4 KB per second
  • RCUs measure strongly consistent reads
  • Eventually consistent reads cost 1/2 of consistent reads

Read and write throughput limits are independent
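The capacity-unit definitions above translate into simple arithmetic. The sketch below assumes that item size is rounded up to the next 1 KB (writes) or 4 KB (reads) per request, which the slide does not spell out. The function names and the 3 KB example item are hypothetical.

```python
import math

WRITE_UNIT_BYTES = 1024  # 1 WCU covers one write per second of up to 1 KB
READ_UNIT_BYTES = 4096   # 1 RCU covers one strongly consistent read per second of up to 4 KB


def write_capacity_units(item_size_bytes, writes_per_second):
    """WCUs needed, assuming item size is rounded up to the next 1 KB per write."""
    return math.ceil(item_size_bytes / WRITE_UNIT_BYTES) * writes_per_second


def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCUs needed; eventually consistent reads cost half of strongly consistent ones."""
    units = math.ceil(item_size_bytes / READ_UNIT_BYTES) * reads_per_second
    if strongly_consistent:
        return units
    return math.ceil(units / 2)


# Hypothetical workload: 3 KB items, 100 writes/s and 100 reads/s.
print(write_capacity_units(3 * 1024, 100))
print(read_capacity_units(3 * 1024, 100))
print(read_capacity_units(3 * 1024, 100, strongly_consistent=False))
```

Because the two limits are independent, a read-heavy table can be given many RCUs and few WCUs, or the reverse.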

Partitioning example

Table size = 8 GB, RCUs = 5000, WCUs = 500

# of partitions (IO capacity) = 5000/3000 RCU + 500/1000 WCU = 2.17
# of partitions (storage) = 8/10 GB = 0.8
# of partitions = ceiling(max(2.17, 0.8)) = 3

RCUs per partition = 5000/3 = 1666.67
WCUs per partition = 500/3 = 166.67
Data/partition = 8/3 = 2.67 GB

RCUs and WCUs are uniformly spread across partitions
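The partition arithmetic in this example can be reproduced in a few lines. The per-partition limits (3000 RCUs, 1000 WCUs, 10 GB) are the figures quoted on the slide. The function name `estimate_partitions` is hypothetical, and this is an estimate of the slide's formula rather than a description of how the service behaves internally.

```python
import math

# Per-partition limits as quoted on the slide.
MAX_RCU_PER_PARTITION = 3000
MAX_WCU_PER_PARTITION = 1000
MAX_GB_PER_PARTITION = 10


def estimate_partitions(table_size_gb, rcus, wcus):
    """Partition count from the slide's formula: ceiling of the larger requirement."""
    by_throughput = rcus / MAX_RCU_PER_PARTITION + wcus / MAX_WCU_PER_PARTITION
    by_storage = table_size_gb / MAX_GB_PER_PARTITION
    return math.ceil(max(by_throughput, by_storage))


partitions = estimate_partitions(table_size_gb=8, rcus=5000, wcus=500)
print("partitions:", partitions)
print("RCUs per partition:", round(5000 / partitions, 2))
print("WCUs per partition:", round(500 / partitions, 2))
print("GB per partition:", round(8 / partitions, 2))
```

Since throughput is spread uniformly, each partition gets only a third of the provisioned capacity here, so a single hot hash key can be throttled well below the table-level numbers.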

Typical Workloads and Use-Cases

DynamoDB table examples

case class CameraRecord(
  cameraId: Int, // hash key
  ownerId: Int,
  subscribers: Set[Int],
  hoursOfRecording: Int,
  ...
)

case class Cuepoint(
  cameraId: Int, // hash key
  timestamp: Long, // range key
  `type`: String, // backticks needed: type is a reserved word in Scala
  ...
)

HashKey  RangeKey  Value
Key      Segment   1234554343254
Key      Segment1  1231231433235

Typical Workloads

• Ad-tech
• IoT
• Gaming
• Web Analytics
• Mobile Applications
• Large Scale Websites

…And much more!
