30
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Darin Briskman, ElastiCache Developer Outreach Michael Labib, Specialist Solutions Architect Managing IoT and Time Series Data with Amazon ElastiCache for Redis

Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Darin Briskman, ElastiCache Developer Outreach Michael Labib, Specialist Solutions Architect

Managing IoT and Time Series Data with Amazon ElastiCache for Redis

Page 2: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

2

Learning Objectives

Understand Time Series Data & Challenges Learn about Amazon ElastiCache for Time

Series Data Learn how to build Time Series Solutions

with Amazon ElastiCache Explore a Sensor Data Demonstration

2

Page 3: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Time Series Data

Page 4: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Time Series Data – What is it? Device / Sensor Data Website Clickstream events Logging and metrics Social Media and sentiment

analysis Can be analyzed upon

collection or batches

4

Page 5: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Time Series Data – Challenges What type of database should you use? Relational or NoSQL

How should you model the data to support the queries and analysis needed? What types of aggregations will you need? What type of information would do you need to gather?

How will the applications access the data? How do you build the solution in a cost effective manner? How long should you retain the data? How do you manage scalability given the amount of data and sensors grow? And so on…

5

Page 6: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Database Families Relational + Mature Technology + SQL is widely adopted + Durable and Consistent - Inflexible data model - Hard to scale - Performance limitations

- Updates and other changes can be slow - Limited to speed of storage technology

1 = Row 2 = Column

6

Page 7: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Database Families NoSQL + ’Schema-less’ = flexible data model + Quickly adapt to new requirements + Easier to scale + Performance + Generally higher throughput and

+ lower latency than Relational DB - Can be less durable and consistent - Query tools less mature

7

Page 8: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Database Families NoSQL: In-Memory Key-Value + Very fast performance

+ Sub millisecond reads and writes + High throughput

+ Schema-Less – Flexible data model + Redis - Advanced Data Structures - Still emerging technology

- Query and other tools less mature - Less connectors than other database

options - Less durable

8

Page 9: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Request Rate High Low

Latency Low High

Stru

ctur

e Low

High

9 Data Volume

Low High

Amazon RDS

Amazon S3 Amazon Glacier

Amazon CloudSearch and

Amazon Elasticsearch

Amazon DynamoDB

Amazon ElastiCache

HDFS

Page 10: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Ingesting Time Series Data with Amazon ElastiCache

AWS Lambda

Amazon Kinesis Streams

AWS IoT

Data Processor

Streaming Real-time

Data

Device Data

Amazon ElastiCache for Redis

Datastore

1

Amazon EC2

10

Page 11: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Amazon ElastiCache Overview

Page 12: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

In-Memory Key-Value Store

High-performance

Redis and Memcached

Fully managed; Zero admin

Highly Available and Reliable

Hardened by Amazon

12

Page 13: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Redis – The In-Memory Leader

Powerful ~200 commands + Lua scripting

In-memory data structure server

Utility data structures strings, lists, hashes, sets, sorted sets, bitmaps & HyperLogLogs

Simple

Atomic operations supports transactions has ACID properties

Ridiculously fast! <1ms latency for most commands

Highly Available replication

Persistent snapshots or append-only log

Open Source

13

Page 14: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Fully managed service = Automated Operations

Redis datastore hosted on Amazon EC2 Amazon ElastiCache for Redis

14

Page 15: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

ElastiCache - Customer Value Extreme

Performance

Sub-millisecond access latencies

Engineered for Cloud Scale

Open Source Compatible

Compatible with Redis and

Memcached

Existing code will work when

you update node end points

Fully Managed

Automates tasks such as failed

node replacement,

software patching,

upgrades and backups

CloudWatch enables you to monitor cache performance

metrics

Secure and Hardened

Supports Amazon VPC and IAM for

secure and fine grained access.

Monitors your nodes and

applies security patches when

necessary

Highly Available

and Scalable

Multi-AZ with automatic

failover to a read replica, no

human intervention

required.

Easily scale your Redis

(vertically) and Memcached (horizontally) environments

Cost Effective

Pay as low as US$0.017 per

hour. Get started with 750 free hours per

month of a micro node for a

year.

No cross availability zone

data transfer costs.

15

Page 16: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Building Time Series Solutions with Amazon ElastiCache

Page 17: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Streaming Data

Amazon ElastiCache

Amazon EC2

Data Sources

AWS Lambda

(Dashboard)

1

Amazon Kinesis Streams

Amazon DynamoDB

Hot Data

Longer Retention

17

Page 18: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Streaming Data Analytics

Data Sources

1

Amazon Kinesis Streams

Amazon EMR

(Spark Streaming)

Amazon ElastiCache

Amazon S3

Amazon EC2

(Dashboard)

Amazon Redshift

Spark Redis Connector

Data Lake

18

Page 19: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Capturing Moving Data Equity prices, clickstreams, sensor streams … Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure

order among incoming data Kinesis provides an ordered Stream Redis provides a Sorted Set

Fault Tolerant Data Retention – Data accessible in Kinesis 24 hours after its

been added to the stream and can be extended up to 7 days. ElastiCache provides HA with automatic failover to a read replica

ElastiCache is fast and flexible Can query and retrieve hot data quickly

19

Page 20: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

ElastiCache for Streaming Data Analytics

Access all Redis Data structures directly from Spark

Significantly accelerate Spark performance over disk-based solutions

Simplify analytics by leveraging Redis Data Structures reducing processing overhead and complexity

Improve application performance by accessing aggregations and hot data from Redis

20

Page 21: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Event Driven Data Capture time ordered changes from a

DynamoDB table using DynamoDB Streams and propagate to ElastiCache

AWS Lambda

Amazon DynamoDB

Streams

Amazon SES

Amazon S3

Amazon Cognito

Amazon SNS

Amazon CloudWatch

Amazon Kinesis Streams

Amazon API

Gateway

Event Sources

21

Hot Data

Amazon ElastiCache

AWS Lambda Longer

Retention Amazon

DynamoDB

Page 22: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Responding to Events Can cover a huge variety of different data sources Security events, application events, messaging events, user actions… For IoT, often created when a sensor crosses a threshold

Lambda captures the incoming event data and routes it Hot data to ElastiCache Cold data to slower, disk-based databases Lambda function to keep databases synchronized

Cost effective Use the speed and flexibility of ElastiCache for hot data queries ‒ No incremental cost per query – price is driven by amount of

memory available Use the DynamoDB or RDS for longer data retention

22

Page 23: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Device Data

AWS IoT

AWS IoT Device

Amazon EC2

(Dashboard)

AWS Lambda

Hot Data

Amazon ElastiCache

Amazon DynamoDB

Longer Retention

Historical | Data Lake Amazon

S3 Amazon Glacier

Cold Data Amazon Kinesis

Firehose 23

Page 24: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Sensors and Data Bursts “We are all glorified motion sensors. Some things only become

visible to us when they undergo change” – Vera Nazarian, author Sensors enable detailed understanding of conditions Quickly perceive and react to changing conditions

Device data can be streaming (time-based delivery) or event driven (threshold-based delivery) Or both! A temperature sensor might deliver regular readings

and also send an alert when it becomes too hot or too cold Sensor data solutions must scale to meet loads Sensor clusters can generate large data bursts AWS IoT auto scales ElastiCache for Redis allows flexible sizing to meet changing

workloads 24

Page 25: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Data Modeling for Sensors

Sensor data is usually of high variety Different sensors, and different versions of the same

sensor, collect different data in different formats A single device can have many sensors

The flexible schema of ElastiCache for Redis allows non-disruptive evolution of data models Quickly test and use new or modified data models with

no need to migrate schemas, rebalance shards or other onerous administrative tasks

25

Page 26: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Demonstration

Page 27: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Sensor Demo

AWS IoT

AWS IoT Device

Amazon EC2

(Dashboard)

AWS Lambda

Amazon ElastiCache

http://www.seeedstudio.com/wiki/Intel%C2%AE_Edison_and_Grove_IoT_Starter_Kit_Powered_by_AWS

Intel Edison – Grove Temperature Sensor

27

Page 28: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Sensor Demo - Data Model

SensorData

timestamp

deviceId

temperature

humidity

climate

timestamp

Pub/Sub

deviceId

Hash

If (temperature> 95 F) publish notification

deviceIP

requestId

Strings (INCR/DECR)

climate:warm

climate:cool

climate:hot

climate:cold

Time-Series events Sort by timestamp (score)

Aggregations For dashboards, analytics, etc.

Event details

Data structures updated within a transaction block using Redis MULTI

Which events occurred at a particular time range?

What were the last N events?

Which devices match a particular pattern?

What is the temperature for a specific device? What is all the data for a specific device?

What are the totals for each climate type?

get climate:warm get climate:cool get climate:cold get climate:hot

zrevrangebyscore SensorData +inf -inf WITHSCORES LIMIT 0 5

zrangebyscore SensorData (1410000000000 14400000000000

zscan SensorData 0 MATCH 2* COUNT 100

hget 2221 temperature hgetall 2221

Sorted Set

28

Page 29: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among

Summary

Time series data can present high volume, velocity and variety challenges

ElastiCache for Redis is scalable, performant and cost effective service for sensor data and other time series data sets

29

Page 30: Managing IoT and Time Series Data with Amazon ElastiCache ...œˆ… · Amazon Kinesis and AWS Lambda are serverless Amazon Kinesis + Amazon ElastiCache for Redis ensure order among