48
Silicon Valley

Membase Meetup - Silicon Valley

  • Upload
    membase

  • View
    1.728

  • Download
    1

Embed Size (px)

DESCRIPTION

 

Citation preview

Page 1: Membase Meetup - Silicon Valley

Silicon Valley

Page 2: Membase Meetup - Silicon Valley

2

Tonight

• Membase Overview

• Use Cases and Deployment Examples

• Membase Architecture

• Demo!

• Developing with Membase

• A Glimpse into the Future

Page 3: Membase Meetup - Silicon Valley

What is Membase?

Page 4: Membase Meetup - Silicon Valley

Before: Application scales linearly, data hits wall

Application Scales OutJust add more commodity web

servers

Database Scales UpGet a bigger, more complex server

4

Page 5: Membase Meetup - Silicon Valley

Membase is a distributed database

5

Membase Servers

In the data center

Web application server

Application user

On the administrator console

Page 6: Membase Meetup - Silicon Valley

6

Built-in Memcached Caching Layer

Memcached

Membase Database

Memcached

Membase Database

Memcached Mode Membase Mode

Membase development team has contributed over half of the source code to the Memcached project.

Page 7: Membase Meetup - Silicon Valley

Deployment options

7

applicationlogic

OTC memcached

client

data operations

applicationlogic

OTC memcached

client

data operations

cluster operations

11211

serverlist

OTC Memcached Server

11211

Membase Server

serverlist

proxy vbucketmap

applicationlogic

OTC memcached

client

Membase Server

localhost

proxyvbucket

map

applicationlogic

NEWmemcached

client

Membase Server

vbucketmap

Embedded proxy Standalone proxy “vBucket-aware” client

Deployment Option 1 Deployment Option 2 Deployment Option 3

11210

data operations

cluster operations

11211

proxy vbucketmap

11210

data operations

cluster operations

11211

proxy vbucketmap

11210

Page 8: Membase Meetup - Silicon Valley

Secure multitenant support

8

Membase data servers

In the data center

Web application server

Application user

On the administrator console

Bucket 1Bucket 2

Aggregate Cluster Memory and Disk Capacity

Page 9: Membase Meetup - Silicon Valley

Five minutes or less to a working cluster• Downloads for Linux and Windows

• Start with a single node

• One button press joins nodes to a cluster

Easy to develop against• Just SET and GET – no schema required

• Drop it in. 10,000+ existing applications already “speak membase” (via memcached)

• Practically every language and application framework is supported, out of the box

Easy to manage• One-click failover and cluster rebalancing

• Graphical and programmatic interfaces

• Configurable alerting

Membase is Simple, Fast, Elastic

9

Page 10: Membase Meetup - Silicon Valley

Membase is Simple, Fast, Elastic

10

Predictable• “Never keep an application waiting”

• Quasi-deterministic latency and throughput

Low latency• Built-in Memcached technology

• Auto-migration of hot data to lowest latency storage technology (RAM, SSD, Disk)

• Selectable write behavior – asynchronous, synchronous (on replication, persistence)

High throughput• Multi-threaded

• Low lock contention

• Asynchronous wherever possible

• Automatic write de-duplication

Page 11: Membase Meetup - Silicon Valley

Membase is Simple, Fast, Elastic

11

Zero-downtime elasticity• Spread I/O and data across commodity

servers (or VMs)

• Consistent performance with linear cost

• Dynamic rebalancing of a live cluster

All nodes are created equal• No special case nodes

• Clone to grow

Extensible• Filtered TAP interface provides hook points

for external systems (e.g. full-text search, backup, warehouse)

• Data bucket – engine API for specialized container types

• Membase NodeCode [FUTURE]

Page 12: Membase Meetup - Silicon Valley

Leading cloud service (PAAS) provider

Over 65,000 hosted applications

Membase Server supporting over 3,000 Heroku customers

Proven at small, and extra large scale

12

Social game leader – FarmVille, Mafia Wars, Café

WorldOver 230 million monthly

usersZynga is a core contributor to

and large scale user of Membase Server

Page 13: Membase Meetup - Silicon Valley

After: Data layer scales like application logic layerData layer now scales with linear cost and constant performance.

Application Scales OutJust add more commodity web

servers

13

Database Scales OutJust add more commodity data

servers

Scaling out flattens the cost and performance curves.

Membase Servers

Page 14: Membase Meetup - Silicon Valley

Membase - A practical path to “NoSQL” adoption

14

Page 15: Membase Meetup - Silicon Valley

Use Cases

Page 16: Membase Meetup - Silicon Valley

17

Leading cloud service (PAAS) providerOver 65,000 hosted applicationsMembase Server serving over 1,200 Heroku customers (as of June 10, 2010)

Deployments Leading Membase

Social game leader – FarmVille, Mafia Wars, Café WorldOver 230 million monthly users

• Membase Serveris the 500,000 ops-per-second database behind FarmVille and Café World

Page 17: Membase Meetup - Silicon Valley

18

Use case – Ad targeting

eventsprofiles, campaigns

profiles, real time campaign statistics

40 milliseconds to come up with an

answer.

2

3

1

Page 18: Membase Meetup - Silicon Valley

19

Search and Gaming Portal

Database

Page 19: Membase Meetup - Silicon Valley

Targeting atShareThis

Page 20: Membase Meetup - Silicon Valley

Largest integrated sharing network

We make sharing simple, engaging & valuable

Powerful Social Analytics & Audience Monetization

About ShareThis

450/momillion

consumers

~850thousand

sites

50+social

channels

Page 21: Membase Meetup - Silicon Valley

This is how it works

Log FilesSearch Keywords

Page Views

Sharing Behavior

HDFS

Map/Reduce

Content Analysis

Taxonomy

Ad Server

User Membase2

Page 22: Membase Meetup - Silicon Valley

ShareThis Ad Products

Page 23: Membase Meetup - Silicon Valley

Membase Architecture

Page 24: Membase Meetup - Silicon Valley

25

Clustering

• Underlying cluster functionality based on erlang OTP

• Have a custom, vector clock based way of storing and propagating...– Cluster topology

– vBucket mapping

• Collect statistics from many nodes of the cluster– Identify hot keys, resource

utilization

25

Page 25: Membase Meetup - Silicon Valley

26

Page 26: Membase Meetup - Silicon Valley

27

TAP

• A generic, scalable method of streaming mutations from a given server– As data operations arrive, they can be sent to arbitrary TAP

receivers

• Leverages the existing memcached engine interface, and the non-blocking IO interfaces to send data

• Three modes of operation

Page 27: Membase Meetup - Silicon Valley

29

Clients, nodes and other nodes

Page 28: Membase Meetup - Silicon Valley

30

Data buckets are secure membase “slices”

Membase data servers

In the data center

Web application server

Application user

On the administrator console

Bucket 1Bucket 2

Aggregate Cluster Memory and Disk Capacity

Page 29: Membase Meetup - Silicon Valley

32

vBucket mapping

Page 30: Membase Meetup - Silicon Valley

34

Disk > Memory

Dataset may have many items infrequently accessed. However, memcached has different behavior (LRU) than wanted with membase.

Still, traditional (most) RDBMS implementations are not 100% correct for us either. The speed of a miss is very, very important.

Page 31: Membase Meetup - Silicon Valley

Membase Demo

Page 32: Membase Meetup - Silicon Valley

36

Thanks!

Page 33: Membase Meetup - Silicon Valley

Key-Value Patterns

Page 34: Membase Meetup - Silicon Valley

38

Key-Value

(with a replica)

Items have:KeyValueExpirationFlagsCAS (more on this later)

Operations include:Get/SetIncrement/DecrementAppend/Prepend

Page 35: Membase Meetup - Silicon Valley

39

Membase Datatypes

• byte[]– Does your data have 1s

and 0s?

“Any customer can have a car painted any colour that he wants so long as it is black.”

• Items do have flags– Many clients use flags

– Data type options• Google protobuf• Thrift• Avro

Page 36: Membase Meetup - Silicon Valley

40

Transactions

• Lock == slow me down

• CAS operations– Optimistic locking

• Very useful with complex datatypes– Imagine two clients trying to

update a complex item

• You’re likely using CAS already... if you use a CPU

User 1

Fail!

User 2Success

Page 37: Membase Meetup - Silicon Valley

41

Common Use: Sessions

• Web user sessions– Highly read, less writes in many case– Protocol advantage of memcached

• Options already for PHP, Ruby and Java

• Application state– Not necessarily “entity” style things– May be appropriate for a “cache” pool

Page 38: Membase Meetup - Silicon Valley

42

Common Use (cache): Rate Limiting

• Want to provide API calls into the system– Twitter search– Google search services

• Use the atomic increment– Set an item with a unique ID– Upon API request, increment

and check• HTTP 420: go away and come

back later

Your Users

Your App

¡Ouch!

Page 39: Membase Meetup - Silicon Valley

Looking Ahead: NodeCodeFrank Weigel, Membase

Page 40: Membase Meetup - Silicon Valley

44

Beyond key-value • Indexing/Range Queries• Advanced Data Structures• Sub-object direct manipulation

Validation and In-flight transformation• Block mutations failing validation• Enrich or transform objects

Connectors (Integrate easily with other systems)• Solr• Hadoop• MySQL

NodeCode – Motivation

Page 41: Membase Meetup - Silicon Valley

45

NodeCode - What is it?

Method for extending & customizing Membase

Separate code modules

Defined interface to datapath and cluster manager

Notification on events• Synchronous• Asynchronous

Page 42: Membase Meetup - Silicon Valley

46

Simple• Packaged modules for easy install and enable• Library of “off the shelf” modules• Module monitoring• Straight forward development and debugging

Fast• Low latency/high-throughput• Per-bucket process isolation• Don’t break data manager performance/correctness

Elastic• Automatically migrate and instantiate on rebalance• Provide support for migration of internal data• Leverage native Membase engine for internal data

storage

NodeCode – Drivers

Page 43: Membase Meetup - Silicon Valley

47

Block-level architecture

Page 44: Membase Meetup - Silicon Valley

48

Java only– jar format

Must implement minimal module API• Initial module startup• Module removal• Association with bucket

NodeCode library helper functions• Register synchronous & asynchronous listeners/callbacks• Register protocol extension/callbacks • Register rebalance callback• Register cluster manager event callbacks• Membase data access

NodeCode 1.0 Plans

Page 45: Membase Meetup - Silicon Valley
Page 46: Membase Meetup - Silicon Valley

50

Q&A

Page 48: Membase Meetup - Silicon Valley