Machine Learning meets Databases Ioannis Papapanagiotou Cloud Database Engineering


Machine Learning meets Databases

Ioannis Papapanagiotou
Cloud Database Engineering


Create Personalized Recommendations for discoveries of engaging video content that maximizes member joy.


Personalize Everything


90 seconds


90 seconds...


What do caches touch?

● Signing up*
● Logging in
● Choosing a profile
● Picking liked videos
● Personalization*
● Loading home page*
● Scrolling home page*
● A/B tests
● Video image selection
● Searching*
● Viewing title details
● Playing a title*
● Subtitle / language prefs
● Rating a title
● My List
● Video history*
● UI strings
● Video production*

* multiple caches involved


Key-Value store optimized for AWS and tuned for Netflix

Ephemeral Volatile Cache


What is EVCache?

● Distributed, sharded, replicated key-value store
● Tunable in-region and global replication
● Based on Memcached
● Resilient to failure
● Topology aware
● Linearly scalable
● Seamless deployments


Why Optimize for AWS

● Instances disappear
● Zones fail
● Regions become unstable
● Network is lossy
● Customer requests bounce between regions

Failures happen and we test all the time


EVCache Use @ Netflix

● Hundreds of terabytes of data
● Trillions of ops / day
● Tens of billions of items stored
● Tens of millions of ops / sec
● Millions of replications / sec
● Thousands of servers
● Hundreds of instances per cluster
● Hundreds of microservice clients
● Tens of distinct clusters
● 3 regions


Architecture

[Diagram labels: Server (Memcached, EVCar); Application (Client Library, Client); Eureka (Service Discovery)]


Architecture

[Diagram: availability zones us-west-2a, us-west-2b, and us-west-2c, with a client in each]


Reading (get)

[Diagram: availability zones us-west-2a, us-west-2b, and us-west-2c; one client; replicas labeled Primary and Secondary]
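
The read path on this slide can be sketched roughly as follows. This is an illustration only, not the EVCache client API: the `ZoneReplica` class, the "local zone first" ordering, and the fallback to another zone on a miss are assumptions made for the example.

```python
from typing import Dict, List, Optional


class ZoneReplica:
    """Hypothetical in-memory stand-in for one zone's copy of the cache."""

    def __init__(self, zone: str) -> None:
        self.zone = zone
        self.data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)


def read(key: str, replicas: List[ZoneReplica], local_zone: str) -> Optional[str]:
    """Try the replica in the client's own zone first, then the others."""
    # False sorts before True, so the local-zone replica comes first.
    ordered = sorted(replicas, key=lambda r: r.zone != local_zone)
    for replica in ordered:
        value = replica.get(key)
        if value is not None:
            return value
    return None  # miss in every zone


replicas = [ZoneReplica(z) for z in ("us-west-2a", "us-west-2b", "us-west-2c")]
replicas[1].data["profile:42"] = "cached-in-2b"
# A client in us-west-2a misses locally and falls back to another zone.
result = read("profile:42", replicas, local_zone="us-west-2a")
```

Preferring the local zone keeps the common case inside one availability zone; the fallback only costs a cross-zone hop when the local copy is cold or missing.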


Writing (set, delete, add, etc.)

[Diagram: availability zones us-west-2a, us-west-2b, and us-west-2c, with a client in each]
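
A rough sketch of the write path, assuming that a mutation is fanned out to the copy in every zone so that each zone can later serve reads locally. The function names and the use of plain dictionaries as zone copies are hypothetical, not part of the EVCache client.

```python
from typing import Dict, List


def write_all(key: str, value: str, zone_copies: List[Dict[str, str]]) -> int:
    """Apply a set to every zone's copy and return how many copies were updated."""
    updated = 0
    for copy in zone_copies:
        copy[key] = value
        updated += 1
    return updated


def delete_all(key: str, zone_copies: List[Dict[str, str]]) -> int:
    """Apply a delete to every zone's copy and return how many held the key."""
    removed = 0
    for copy in zone_copies:
        if copy.pop(key, None) is not None:
            removed += 1
    return removed


# One dictionary per availability zone (assumed: three zones, as in the slide).
copies: List[Dict[str, str]] = [{}, {}, {}]
written = write_all("mylist:42", "title-7", copies)
```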


Use Case: Lookaside Cache

[Diagram labels: Application (Client Library, Client); REST/gRPC Client; four nodes labeled S; four nodes labeled C; Data Flow]
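
The lookaside (cache-aside) pattern named on this slide can be sketched as follows. The `LookasideCache` class and the loader callback are hypothetical; a real deployment would call the cache client and a backing REST/gRPC service instead of a dictionary and a lambda.

```python
from typing import Callable, Dict


class LookasideCache:
    """Check the cache first; on a miss, load from the service and populate the cache."""

    def __init__(self, load_from_service: Callable[[str], str]) -> None:
        self._cache: Dict[str, str] = {}
        self._load = load_from_service
        self.backend_calls = 0  # counts how often the backing service was hit

    def get(self, key: str) -> str:
        cached = self._cache.get(key)
        if cached is not None:
            return cached  # cache hit: the service is not contacted
        self.backend_calls += 1
        value = self._load(key)  # cache miss: ask the service
        self._cache[key] = value  # populate so the next read is a hit
        return value


cache = LookasideCache(load_from_service=lambda key: key.upper())
first = cache.get("title")
second = cache.get("title")
```

The application owns the miss handling here, which is what distinguishes a lookaside cache from a read-through cache that loads data on the application's behalf.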


Use Case: Transient Data Store

[Diagram: three applications, each with Client Library and Client, arranged along a Time axis]
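
One way to picture a transient data store is a key-value store whose entries expire, so that one application can hand short-lived state to another that runs later. The sketch below assumes a time-to-live per entry; the `TransientStore` class is hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```python
from typing import Dict, Optional, Tuple


class TransientStore:
    """Hypothetical TTL store; callers supply the current time."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, now: float, ttl_seconds: float) -> None:
        self._items[key] = (value, now + ttl_seconds)

    def get(self, key: str, now: float) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            del self._items[key]  # expired entries are dropped lazily on read
            return None
        return value


store = TransientStore()
# The first application writes session state at t=0 with a 60-second lifetime.
store.set("session:7", "step-2", now=0.0, ttl_seconds=60.0)
# A second application picks it up shortly afterwards.
seen_by_second = store.get("session:7", now=10.0)
# A third application arrives after the entry has expired.
seen_by_third = store.get("session:7", now=100.0)
```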


Use Case: Primary Store

Offline / Nearline Precomputes for Recommendations

[Diagram labels: Online Services; Offline Services; Online Application (Client Library, Client); Data Flow]


Use Case: Impression store

[Diagram labels: Hive; Online Services; Offline Services; Online Application (Client Library, Client); Data Flow]


Pipeline of Personalization

[Diagram labels: Compute A, Compute B, Compute C, Compute D, Compute E; Online Services; Offline Services; Online 1, Online 2; Data Flow]


Additional Features

Kafka
● Global data replication
● Consistency metrics

Key Iteration
● Cache warming
● Lost instance recovery
● Backup (and restore)
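
As a rough illustration of how key iteration can support cache warming, the sketch below walks a populated copy in batches and fills an empty one. The batching, the function names, and the rule of not overwriting keys already present in the target are assumptions for the example, not a description of the EVCache implementation.

```python
from itertools import islice
from typing import Dict, Iterator, List, Tuple


def iter_batches(source: Dict[str, str], batch_size: int) -> Iterator[List[Tuple[str, str]]]:
    """Yield the source's key/value pairs in fixed-size batches."""
    items = iter(list(source.items()))  # snapshot, so the source may keep changing
    while True:
        batch = list(islice(items, batch_size))
        if not batch:
            return
        yield batch


def warm(source: Dict[str, str], target: Dict[str, str], batch_size: int = 2) -> int:
    """Copy items from a populated copy into a new one; return how many were added."""
    added = 0
    for batch in iter_batches(source, batch_size):
        for key, value in batch:
            if key not in target:  # keep any newer value already written to the target
                target[key] = value
                added += 1
    return added


existing = {"a": "1", "b": "2", "c": "3"}
fresh = {"c": "newer"}
added = warm(existing, fresh)
```

The same iteration loop could feed a backup writer or a replacement for a lost instance; only the target changes.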


Cross-Region Replication

[Diagram: Region A and Region B, each with APP, Repl Proxy, Repl Relay, and Kafka]

1) mutate
2) send metadata
3) poll msg
4) get data for set
5) https send msg
6) mutate
7) read
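
The seven numbered steps can be walked through with a small in-memory model. Everything here is a stand-in: a `deque` plays the role of Kafka, a direct method call plays the role of the HTTPS hop between relay and proxy, and the `Region` class and function names are hypothetical. The sketch assumes, following the step labels, that the queued message carries only metadata and that the relay fetches the value for set operations.

```python
from collections import deque
from typing import Deque, Dict, Optional, Tuple

Metadata = Tuple[str, str]  # (operation, key); the value is not part of the message


class Region:
    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, str] = {}
        self.queue: Deque[Metadata] = deque()  # stands in for Kafka

    def app_set(self, key: str, value: str) -> None:
        self.cache[key] = value          # 1) mutate the local cache
        self.queue.append(("set", key))  # 2) send metadata

    def app_delete(self, key: str) -> None:
        self.cache.pop(key, None)           # 1) mutate the local cache
        self.queue.append(("delete", key))  # 2) send metadata

    def proxy_apply(self, op: str, key: str, value: Optional[str]) -> None:
        # 6) the replication proxy mutates the cache in the receiving region
        if op == "set" and value is not None:
            self.cache[key] = value
        elif op == "delete":
            self.cache.pop(key, None)


def relay_once(source: Region, destination: Region) -> bool:
    """Run one pass of the replication relay; return False if nothing was queued."""
    if not source.queue:
        return False
    op, key = source.queue.popleft()                        # 3) poll msg
    value = source.cache.get(key) if op == "set" else None  # 4) get data for set
    destination.proxy_apply(op, key, value)                 # 5) send msg (HTTPS in the slide)
    return True


region_a, region_b = Region("A"), Region("B")
region_a.app_set("title:1", "v1")
relay_once(region_a, region_b)
replicated = region_b.cache.get("title:1")  # 7) read in the other region
```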


Open Source

https://github.com/netflix/EVCache (client and REST proxy)


Viewing History


Requirements for Viewing History
● Time series dataset
● Support high writes
● Cross region replication
● Large Dataset


Growth of Viewing History


1) Massively scalable architecture
2) Multi-datacenter, multi-directional replication
3) Linear scale performance
4) Transparent fault detection and recovery
5) Flexible, dynamic schema data


Viewing History

1) Apply Custom Filters (user, device, subtitle, episode, season)

2) Tunable consistency to trade off performance against data consistency
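
To make the trade-off in point 2 concrete, here is a minimal sketch of a read that contacts more or fewer replicas depending on a consistency level. The level names ONE, QUORUM, and ALL are borrowed from common quorum-based stores and are an assumption here, as are the function names and the "newest timestamp wins" rule.

```python
from typing import List, Optional, Tuple

Versioned = Tuple[int, str]  # (timestamp, value)


def required_acks(level: str, replica_count: int) -> int:
    """How many replicas must answer at the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replica_count // 2 + 1
    if level == "ALL":
        return replica_count
    raise ValueError(f"unknown consistency level: {level}")


def read_at_level(replicas: List[Optional[Versioned]], level: str) -> Optional[str]:
    """Contact only as many replicas as the level requires; newest timestamp wins."""
    contacted = replicas[: required_acks(level, len(replicas))]
    answers = [r for r in contacted if r is not None]
    if not answers:
        return None
    return max(answers)[1]


# The first replica has not yet received the latest playback position.
state = [(1, "00:10:00"), (2, "00:12:30"), (2, "00:12:30")]
fast_read = read_at_level(state, "ONE")
safer_read = read_at_level(state, "QUORUM")
```

Reading at ONE touches a single replica and can return a stale value; reading at QUORUM touches two of the three replicas and sees the newer write, at the cost of waiting for more responses.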


Growth of Viewing History


New Data Model


Use Case: A/B Metadata

● Wanted to capture information about each test
    ○ Owner
    ○ Properties
    ○ Start time/End Time
    ○ Allocation
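
The fields listed above map naturally onto a small record that can be serialized into a key-value store. The sketch below is only one possible shape: the `ABTestMetadata` class, the `abtest:` key prefix, the JSON encoding, and all sample values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, Tuple


@dataclass
class ABTestMetadata:
    test_id: str
    owner: str
    start_time: str  # ISO 8601 timestamps, an assumption for the example
    end_time: str
    properties: Dict[str, str] = field(default_factory=dict)
    allocation: Dict[str, float] = field(default_factory=dict)

    def to_record(self) -> Tuple[str, str]:
        """Return a (key, value) pair suitable for a key-value store."""
        return f"abtest:{self.test_id}", json.dumps(asdict(self), sort_keys=True)


# Placeholder values for illustration only.
meta = ABTestMetadata(
    test_id="1234",
    owner="team-x",
    start_time="2018-01-01T00:00:00Z",
    end_time="2018-02-01T00:00:00Z",
    properties={"surface": "home"},
    allocation={"control": 0.5, "treatment": 0.5},
)
record_key, record_value = meta.to_record()
```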


Dynomite

A framework that makes non-distributed data stores distributed. It can be used with many key-value storage engines.

Features: highly available, automatic failover, node warmup, tunable consistency, backups/restores


Pluggable Storage Engines

● Layer on top of a non-distributed key value data store
    ○ Peer-peer, Shared Nothing
    ○ Auto-Sharding
    ○ Multi-datacenter
    ○ Linear scale
    ○ Replication
    ○ Gossiping
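
Auto-sharding over single-node stores is commonly done with a token ring, where each node owns a range of hash values. The sketch below illustrates that general technique; the CRC32 hash, the token values, and the ownership rule (first node whose token is at or after the key's token, wrapping around) are assumptions and not a description of Dynomite's actual configuration.

```python
import bisect
import zlib
from typing import List, Tuple


class TokenRing:
    def __init__(self, nodes: List[Tuple[int, str]]) -> None:
        # nodes are (token, node name) pairs; sorted so ownership can be binary-searched
        self._nodes = sorted(nodes)
        self._tokens = [token for token, _ in self._nodes]

    @staticmethod
    def key_token(key: str) -> int:
        """Map a key onto the ring; CRC32 yields a value in [0, 2**32)."""
        return zlib.crc32(key.encode("utf-8"))

    def owner_for_token(self, token: int) -> str:
        # First node whose token is >= the given token, wrapping around the ring.
        index = bisect.bisect_left(self._tokens, token)
        return self._nodes[index % len(self._nodes)][1]

    def owner(self, key: str) -> str:
        return self.owner_for_token(self.key_token(key))


# Three nodes with evenly spaced tokens (hypothetical names and values).
ring = TokenRing([(0, "node-a"), (1431655765, "node-b"), (2863311530, "node-c")])
node_for_user = ring.owner("user:1")
```

Because ownership depends only on the key's hash and the token list, any client holding the same topology routes a key to the same node without consulting a central coordinator.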


Replication


Dyno - Java Client

● Connection Pooling
● Load Balancing
● Effective failover
● Pipelining
● Scatter/Gather
● Metrics
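
Two of the listed features, load balancing and failover, can be illustrated together. The sketch below is written in Python for consistency with the other examples and does not reproduce the Dyno API: the `FailoverClient` class, the round-robin policy, and the injected `send` function are all hypothetical.

```python
from typing import Callable, List, Optional


class FailoverClient:
    """Round-robin across hosts, skipping any host whose send raises ConnectionError."""

    def __init__(self, hosts: List[str], send: Callable[[str, str], str]) -> None:
        self._hosts = hosts
        self._send = send
        self._next = 0  # index of the host to try first on the next call

    def execute(self, command: str) -> str:
        last_error: Optional[Exception] = None
        for attempt in range(len(self._hosts)):
            host = self._hosts[(self._next + attempt) % len(self._hosts)]
            try:
                result = self._send(host, command)
            except ConnectionError as error:
                last_error = error  # fail over to the next host
                continue
            # Start after the host that answered, so load spreads across hosts.
            self._next = (self._next + attempt + 1) % len(self._hosts)
            return result
        raise ConnectionError("all hosts failed") from last_error


down = {"host-a"}


def fake_send(host: str, command: str) -> str:
    """Test double standing in for a network call."""
    if host in down:
        raise ConnectionError(host)
    return f"{host}:{command}"


client = FailoverClient(["host-a", "host-b", "host-c"], fake_send)
first_reply = client.execute("GET k")
second_reply = client.execute("GET k")
```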


Moving Across Storage Engines


Data Explorer for Dynomite (UI)


Open Source

● https://github.com/netflix/dynomite: Proxy (C)
● https://github.com/netflix/dyno: Client (Jedis)
● https://github.com/netflix/dynomite-manager: Sidecar (Tomcat Container)
● https://github.com/netflix/dyno-queues: Distributed queue recipe (Java)


Other Datastores
● Source of truth: Hive backed by S3
● Elasticsearch
● MySQL, Postgres, AWS Aurora


We are Hiring! https://jobs.netflix.com/jobs/865007

Twitter: @ipapapa
LinkedIn: https://www.linkedin.com/in/ipapapa/
Github: https://github.com/ipapapa


Thank you.