42
Home of Redis Redis for Fast Data Ingest

Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

Home of Redis

Redis for Fast Data Ingest

Page 2: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

2

Agenda

• Fast Data Ingest and its challenges

• Redis for Fast Data Ingest• Pub/Sub• List• Sorted Sets as a Time Series

Database

• The Demo

• Scaling with Redise Flash

Page 3: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

Fast Data Ingest Scenarios

Page 4: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

4

IOT

Page 5: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

5

Network Traffic Inspection

Page 6: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

6

Social Media Analysis

Page 7: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

7

More Scenarios

And more…

Log Collection

User Activity Tracking

Multi-player Gaming

Fintech

Page 8: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

8

Fast Data Ingest Challenges

• Keeping up with the pace of data arrival

• Data from multiple sources with no standard data format

• Filter, analyze, and transform data in real-time

• Managing data arriving from sources distributed geographically

Page 9: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

9

Requirements for Fast Data Ingest

• Physical infrastructure – network, computational resources, etc.

• Software stack to:

• Filter• Aggregate• Transform• Distribute

data in real-time with sub-millisecond latency

Page 10: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

Fast Data Ingest with Redis

Page 11: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

11

About Redis

Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case.

The open source home and commercial provider of Redis Enterprise (Redise) technology, platform, products & services.

Page 12: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

12

Redis for Fast Data Ingest

Page 13: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

13

Redis for Fast Data Ingest

ListsSorted Sets

Hashes Hyperloglog

Geospatial Indexes

Bitmaps

SetsStrings

Bit field

Redis Data Structures

Publisher Channel

Subscriber 1

Subscriber 2

Subscriber 3

Subscriber n

Redis Pub/Sub

Page 14: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

Common Ingest Techniques in Redis

Page 15: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

15

Pub/Sub

CommandsPublisher: publish <channel name> <message>

Subscriber: subscribe <channel name>

Publisher Channel

Subscriber 1

Subscriber 2

Subscriber 3

Subscriber n

Page 16: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

16

List

Publisher

Subscriber 1

Subscriber 2

Subscriber 3

Subscriber n

CommandsPublisher: lpush <list name> <message>

Subscriber: brpop <list name> <timeout>

Page 17: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

17

Sorted Set

CommandsPublisher: zadd <timeseries name> <timestamp> <message>

Subscriber: zrangebyscore <timeseries name> <last timestamp> <current timestamp> WITHSCORES

Publisher

Subscriber 1

Subscriber 2

Subscriber 3

Subscriber n

Page 18: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

The Demo

Page 19: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

19

Demo: Problem Description

English Tweets Filter

Influencer Tweets Filter

Popular hashtags among English tweets

Influencer Catalog

All Tweets

Sample Tweet Message in the JSON format:{

"created_at":"Tue Jul 11 17:06:03 +0000 2017","id":884821096440004600,"text":"USGS reports a M2 #earthquake 31km WSW of Enterprise, Utah on 7/11/17 @ 17:01:53 UTC https://t.co/xXQH2Mfy93 #quake","user":{

"id":1414684496,"name":"Every Earthquake","screen_name":"everyEarthquake","location":"Earth","followers_count":18978,"friends_count":17,"lang":"en"

}}

"lang":"en"

followers_count > 10000

Match pattern “#(\\w+)”Increment count for that pattern

Map influencer id to profileSorted Set: follower count -> id

Page 20: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

20

Demo Setup

Service Provider for Messages

Programming Language for the demo

IDE

Redis container on Docker

Page 21: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

21

The Three Data Ingest Techniques

Fast Data Ingest Technique Pros Cons

Pub/Sub • Easy• Decoupled setup• Good for geographically

distributed setup

• Not resilient to connection loss

• Requires many connections

Lists • Easy • Resilient to connection loss

• Tightly coupled producers and consumers

• Data duplication

Sorted Sets • Resilient to connection loss• Least chance of losing data• Access to historical data• Loosely coupled producers and

consumers

• Consumes space• Complex logic

Page 22: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

Technique 1: Fast Data Ingest with Pub/Sub

Page 23: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

23

Fast Data Ingest with Pub/Sub

EnglishTweetsFilter

InfluencerTweetsFilter

HashTagCollector

InfluencerCollector

IngestPubSub

AllTweets

EnglishTweets

InfluencerTweets

• Easy• Decoupled setup• Good for geographically distributed setup

Advantages

Page 24: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

24

Class Diagrams and Sample Code

https://github.com/redislabsdemo/IngestPubSub

Page 25: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

Technique 2: Fast Data Ingest with Lists

Page 26: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

26

Fast Data Ingest with Lists

EnglishTweetsFilter

InfluencerFilter

HashTagFilter

IngestStream

AllTweetsListener

EnglishTweetsListener

alldata

englishtweets

• Easy• Resilient to connection loss

Advantages

Page 27: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

27

Class Diagrams and Sample Code

https://github.com/redislabsdemo/IngestList

Page 28: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

Technique 3: Fast Data Ingest with Sorted Sets

Page 29: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

29

Fast Data Ingest with Sorted Sets

EnglishTweetsFilter

InfluencerFilter

HashTagFilter

IngestStream

alltweets

englishtweets

• Resilient to connection loss• Least chance of losing data• Access to historical data• Loosely coupled producers and consumers

Advantages

Page 30: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

30

Class Diagrams and Sample Code

https://github.com/redislabsdemo/IngestSortedSet

Page 31: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

Redise for Fast Data Ingest

Page 32: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

32

Redise Technology

Redis Database Instances

Page 33: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

33

Redise Technology

Cluster Manager

Enterprise Layer

Open Source Layer

REST API

Zero latency proxy

Page 34: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

34

Redise Technology

Enterprise Layer

Open Source Layer

Zero latency proxy

Cluster Manager

REST API

Redise Node

Page 35: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

35

Redise Technology

Redise Cluster• Shared nothing cluster architecture

• Fully compatible with open source

commands & data structures

Page 36: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

36

Redise - Shared Nothing Symmetric Architecture

ClusterManagementPath

ProxiesNode WatchdogCluster Watchdog

Node 1 Node 2 Node N (odd number)…

Redis Shards

Unique multi-tenant “Docker” like architecture enables running hundreds of databases over a single, average cloud instance without performance degradation and with maximum security provisions

Data Path

Distributed ProxiesSingle or Multiple Endpoints

Page 37: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

37

Redise Benefits for Data Ingest

Effortless Scaling

Simple, SeamlessClustering. Linear scaling

ACID Compliance in Cluster Architecture

Substantially Lower Costs

Run on Flash as a RAM extension

Top notch 24x7 expert support

Always On Availability

Instant Failure Recovery, No Data loss

Stable and Predictable High Performance

Page 38: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

38

Redise Flash

• Near-RAM performance at 70%+ lower costs

• Technology treats Flash as a RAM replacement (or extension)

• RAM/Flash ratio can be easily configured

• Pluggable storage engine

• Available on SATA-based SSD, NVMe-based SSD, NVDIMM like 3D XPoint/SCM on x86 and P8 platforms

2048 GB RAM

204 GB RAM

1844 GB Flash

10% 90%

Keys & hot values Cold values

Page 39: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

39

Redise Flash - 10TB Redis Deployment on EC2

Redis on RAM Redise Flash

Dataset size 10 TB 10 TB

Database size with replication 30 TB 20 TB

AWS instance type x1.32xlarge i3.16xlarge

Actual instance size (RAM, and RAM+Flash)

1.46 TB 3.66 TB

# of instances needed 21 6 + 1 (for quorum)

Persistent Storage (EBS) 154 TB 110 TB

1 year cost (reserved instances) $1,595,643 $298,896

Savings - 81.27%

*

* Redis Enterprise only needs 1 copy of the data because quorum issues are solved at the node level

Page 40: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

40

Questions

??

?

?

?

?

??

?

??

Page 41: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

41

One more thing….

redis.conf setting:

client-output-buffer-limit pubsub 32mb 8mb 60

With this setting, Redis will force the clients to disconnect under two situations:

• If the output buffer grows more than 32mb

• If the output buffer holds 8mb of data consistently for 60 seconds

Page 42: Home of Redis11 About Redis Open source. The leading in-memory database platform, supporting any high performance operational, analytics or hybrid use case. The open source home and

42

Thank You

[email protected]@roshankumar

Roshan [email protected]@redislabs

Redis Labs