Couchbase 101 Dipti Borkar Sr. Director | WW Solutions Engineering

Couchbase Live Europe 2015: Couchbase 101


©2014 Couchbase Inc.

Agenda

• Where does Couchbase fit in?
• Key Concepts
• Operations
• Cluster-wide operations
• Look at a Live Cluster

Big Data = Operational + Analytic (NoSQL + Hadoop)

• Operational (NoSQL): online; Web/Mobile/IoT apps; millions of customers/consumers
• Analytic (Hadoop): offline, batch-oriented; analytics apps; hundreds of business analysts

Couchbase meets today’s & tomorrow’s requirements

• Flexible data model
• Consistent performance at scale
• High availability (24x365)
• Easy, affordable scalability

Enterprises use Couchbase to enable key objectives

• 360 Degree Customer View
• Profile Management
• Catalog
• Fraud Detection
• Content Management
• Internet of Things
• Digital Communication
• Real Time Big Data
• Mobile Applications
• Personalization

Key Concepts


Couchbase can act as a

Key-Value Store

2014-06-23-10:15am : 75F
2014-06-23-11:30am : 77F
2014-06-23-02:00pm : 82F

Document Store

0001:
{firstname: “Dipti”,
lastname: “Borkar”,
language: “English”,
time_zone: “PST”,
zip: 94403
}

Key - UTF-8 string up to 250 bytes

Value - can be 0 bytes – 20 MB (best practice < 1 MB)
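The key and value limits above can be checked on the client before a write. The following is a minimal sketch, not part of any Couchbase SDK: the function name `check_item` is hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
# Limits taken from the slide; the MB = 1024 * 1024 bytes reading is an assumption.
MAX_KEY_BYTES = 250
MAX_VALUE_BYTES = 20 * 1024 * 1024
BEST_PRACTICE_VALUE_BYTES = 1024 * 1024


def check_item(key: str, value: bytes) -> bool:
    """Raise ValueError if a hard limit is exceeded.

    Returns True when the value also meets the "< 1 MB" best practice,
    False when it is allowed but larger than recommended.
    """
    key_bytes = len(key.encode("utf-8"))  # the limit is in bytes, not characters
    if key_bytes > MAX_KEY_BYTES:
        raise ValueError(f"key is {key_bytes} bytes, limit is {MAX_KEY_BYTES}")
    if len(value) > MAX_VALUE_BYTES:
        raise ValueError(f"value is {len(value)} bytes, limit is {MAX_VALUE_BYTES}")
    return len(value) < BEST_PRACTICE_VALUE_BYTES
```

Note that a key of 126 two-byte characters already exceeds the limit, because the limit counts encoded bytes.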

Fundamentals

Document ID or Key
• Similar to primary keys in relational databases
• Documents are partitioned based on the document ID
• ID-based document lookup is extremely fast
• Must be unique

Value
• JSON
• Binary - integers, strings, booleans
• Common binary values include serialized objects, compressed XML, compressed text, encrypted values

Metadata
• CAS Value (unique identifier for concurrency)
• TTL
• Flags (optional client library metadata)
• Revision #
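To illustrate how a CAS value supports concurrency, here is a small in-memory sketch. It is not the Couchbase API: `TinyBucket`, `StoredDoc`, and `replace_with_cas` are hypothetical names, and the counter-based CAS is an assumption made for the example. The idea is only that a writer must present the CAS it last read, and the write is rejected if someone else changed the document in between.

```python
import itertools
from dataclasses import dataclass


@dataclass
class StoredDoc:
    value: str
    cas: int        # changes on every mutation
    ttl: int = 0    # 0 means "no expiry" in this sketch
    flags: int = 0  # opaque client library metadata
    rev: int = 1    # revision number, incremented per mutation


class CasMismatch(Exception):
    """Raised when the caller's CAS no longer matches the stored one."""


class TinyBucket:
    """Hypothetical in-memory stand-in for a bucket, for illustration only."""

    def __init__(self):
        self._docs = {}
        self._cas_counter = itertools.count(1)

    def upsert(self, key, value):
        old = self._docs.get(key)
        rev = old.rev + 1 if old else 1
        doc = StoredDoc(value=value, cas=next(self._cas_counter), rev=rev)
        self._docs[key] = doc
        return doc.cas

    def get(self, key):
        return self._docs[key]

    def replace_with_cas(self, key, value, cas):
        # Optimistic concurrency: only write if nobody changed the doc since we read it.
        if self._docs[key].cas != cas:
            raise CasMismatch(key)
        return self.upsert(key, value)
```

A client that receives `CasMismatch` would typically re-read the document and retry its change.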

Benefits of JSON

• Can represent complex objects and data structures
• Very simple notation, lightweight, compact, readable
• The most common API return type for integrations
  - Facebook, Twitter, you name it, return JSON
• Native to JavaScript (can be useful)
• Can be inserted straight into Couchbase (faster development)
• Serialization and deserialization are very fast

Storing and retrieving documents

• Clients read from / write to documents (user/application data)
• Documents live in data buckets
• Data buckets live on server nodes
• Server nodes form a Couchbase cluster
• Dynamically scalable
• Based on hash partitioning

Objects Serialized to JSON and Back

User Object
string uid
string firstname
string lastname
int age
array favorite_colors
string email

add() / get()

u::[email protected]
{ “uid”: 123456,
“firstname”: “John”,
“lastname”: “Smith”,
“age”: 22,
“favorite_colors”: [“blue”, “black”],
“email”: “[email protected]
}
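The object-to-JSON round trip can be sketched in a few lines. This is an illustration under stated assumptions, not SDK code: the `bucket` dict stands in for a real data bucket, `add()` and `get()` only mimic the operation names on the slide, the email address is a placeholder because the one in the source is redacted, and `uid` is typed as an integer to match the sample document.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class User:
    uid: int
    firstname: str
    lastname: str
    age: int
    favorite_colors: list
    email: str


bucket = {}  # hypothetical in-memory stand-in for a data bucket


def add(key: str, user: User) -> None:
    """Serialize the object to JSON and store it; fail if the key already exists."""
    if key in bucket:
        raise KeyError(f"{key} already exists")
    bucket[key] = json.dumps(asdict(user))


def get(key: str) -> User:
    """Fetch the JSON text and deserialize it back into an object."""
    return User(**json.loads(bucket[key]))


john = User(123456, "John", "Smith", 22, ["blue", "black"], "john@example.com")
add("u::john@example.com", john)
restored = get("u::john@example.com")
```

Here `restored` compares equal to `john`, since every field survives the JSON round trip.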

Couchbase provides a complete Data Management solution

• High availability cache
• Key-value store
• Document database
• Embedded database
• Sync management

Multi-purpose capabilities support a broad range of apps and use cases

Enterprises often start with cache, then broaden usage to other apps and use cases

What makes Couchbase unique?

Enterprises choose Couchbase for several key advantages

• Performance & scalability leader: sub-millisecond latency with high throughput; memory-centric architecture
• Multi-purpose: cache, key-value store, document database, and local/mobile database in a single platform
• Simplified administration: easy to deploy & manage; integrated Admin Console, single-click cluster expansion & rebalance
• Always-on availability (24x365): data replication across nodes, clusters, and data centers

Operations


Couchbase Server Architecture

DATA MANAGER
• Query Engine
• Object-managed Cache
• Storage Engine
• 11210 / 11211: Data access ports
• 8092: Query API

CLUSTER MANAGER (Erlang/OTP)
• HTTP: REST management API / Web UI
• Replication, Rebalance, Shard State Manager
• 8091: Admin Console

Single Node Operations - Write

[Diagram: App Server, Doc, Managed Cache, Disk Queue, Disk, Replication Queue; memory-to-memory replication to other node]

Single Node Operations - Read

[Diagram: App Server issues "Get Doc 1"; Doc 1 is returned from the Managed Cache; Disk Queue, Disk, Replication Queue; memory-to-memory replication to other node]

Single Node Operations – Cache Ejection

[Diagram: App Server writes Doc 2 to Doc 6; Doc 1 is in the Managed Cache and on Disk; Disk Queue, Replication Queue; memory-to-memory replication to other node]

Single Node Operations – Cache Miss

[Diagram: App Server issues "Get Doc 1"; the Managed Cache holds Doc 2 to Doc 6; Doc 1 is on Disk; Disk Queue, Replication Queue; memory-to-memory replication to other node]
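The four single-node diagrams (write, read, cache ejection, cache miss) can be tied together in one small simulation. This is a sketch of the flow the diagram labels suggest, not Couchbase's implementation: the class `NodeSketch` is hypothetical, least-recently-used ejection is an assumption (the slides do not name the ejection policy), and the replication queue is only filled, never drained.

```python
from collections import OrderedDict, deque


class NodeSketch:
    """Toy model of one node: managed cache, disk queue, disk, replication queue."""

    def __init__(self, cache_capacity):
        self.cache_capacity = cache_capacity
        self.cache = OrderedDict()        # managed cache, least recently used first
        self.disk = {}                    # persisted documents
        self.disk_queue = deque()         # keys waiting to be persisted
        self.replication_queue = deque()  # keys waiting to be sent to another node
        self.dirty = set()                # keys not yet persisted, never ejected

    def write(self, key, doc):
        # Write: the doc lands in the cache first, then is queued for disk and replication.
        self.cache[key] = doc
        self.cache.move_to_end(key)
        self.dirty.add(key)
        self.disk_queue.append(key)
        self.replication_queue.append(key)

    def flush(self):
        # Drain the disk queue, then eject persisted docs if the cache is over capacity.
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.cache[key]
            self.dirty.discard(key)
        self._eject()

    def _eject(self):
        # Cache ejection: drop the oldest docs that are already safely on disk.
        for key in list(self.cache):
            if len(self.cache) <= self.cache_capacity:
                break
            if key not in self.dirty:
                del self.cache[key]

    def read(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)
            return self.cache[key], "hit"
        # Cache miss: fetch from disk and bring the doc back into the cache.
        doc = self.disk[key]
        self.cache[key] = doc
        self._eject()
        return doc, "miss"
```

With a capacity of two, writing three docs and flushing ejects the oldest one; reading it back is a miss the first time and a hit afterwards.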

Cluster-wide Operations


Auto sharding – Bucket and vBuckets

• Each bucket has active and replica data sets
• Each data set has 1024 virtual buckets (vBuckets)
• Documents get logically mapped to vBuckets
• Document IDs always get hashed to the same virtual bucket
• Virtual buckets do not have a fixed physical server location
• The mapping between the virtual buckets and physical servers is called the cluster map
• Each virtual bucket contains a 1/1024th portion of the data set

[Diagram: data buckets divided into virtual buckets vB 1 ... 1024]
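The mapping from document ID to vBucket can be sketched as a deterministic hash reduced to the range 0 to 1023. The use of CRC32 and the particular bit selection below are assumptions for illustration; the slides only say that document IDs are hashed, and real client libraries may differ in detail.

```python
import zlib

NUM_VBUCKETS = 1024  # from the slide: each data set has 1024 vBuckets


def vbucket_for(key: str) -> int:
    """Map a document ID to a vBucket number in [0, NUM_VBUCKETS)."""
    crc = zlib.crc32(key.encode("utf-8"))
    # Take some of the upper bits of the checksum and fold them into the vBucket range.
    return ((crc >> 16) & 0x7FFF) % NUM_VBUCKETS
```

Because the function depends only on the key, every client computes the same vBucket for the same document ID without contacting a server.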

Cluster Map

[Diagram: documents are read from / written to the cluster; Hash function (KEY) selects one of the logical partitions vB1, vB2, vB3, vB4, vB5 ... vB1024; the cluster map assigns logical partitions to physical servers A, B, C; adding a node to scale out produces a new cluster map]

Multi-Node Operations

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, read/write/update shards on Servers 1 to 3. Server 1: active shards 5, 2, 9 and replica shards 4, 1, 8. Server 2: active shards 4, 7, 8 and replica shards 6, 3, 2. Server 3: active shards 1, 3, 6 and replica shards 7, 9, 5.]

• Docs distributed evenly across servers

• Each server stores both active and replica docs
  - Only one server active at a time

• Client library provides app with simple interface to database

• Cluster map provides map to which server doc is on
  - App never needs to know

• App reads, writes, updates docs

• Multiple app servers can access same document at same time
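How a client library might use the cluster map to route a request can be sketched as follows. The round-robin placement in `build_cluster_map`, the CRC32-based hash, and both function names are assumptions for this example; in a real cluster the map is supplied by the cluster and the client only looks things up.

```python
import zlib

NUM_VBUCKETS = 1024


def build_cluster_map(servers):
    """Return a list indexed by vBucket id holding (active server, replica server).

    Round-robin placement is an assumption made for the sketch.
    """
    n = len(servers)
    return [(servers[vb % n], servers[(vb + 1) % n]) for vb in range(NUM_VBUCKETS)]


def locate(key, cluster_map):
    """Hash the key to a vBucket, then look up which servers hold it."""
    vb = ((zlib.crc32(key.encode("utf-8")) >> 16) & 0x7FFF) % NUM_VBUCKETS
    active, replica = cluster_map[vb]
    return vb, active, replica


cluster_map = build_cluster_map(["Server 1", "Server 2", "Server 3"])
vb, active, replica = locate("u::123456", cluster_map)
```

The application code only passes a key; the lookup that picks `active` happens inside the client library, which is why the app never needs to know where a document lives.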

Adding Nodes

[Diagram: Servers 4 and 5 join Servers 1 to 3; active and replica shards are redistributed across all five servers while App Server 1 and App Server 2 continue to read/write/update through their Couchbase client libraries and cluster maps]

• Two servers added with one-click operation

• Docs automatically rebalance across cluster
  - Even distribution of docs
  - Minimum doc movement

• Cluster map updated

• App database calls now distributed over larger number of servers
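The two rebalance goals above, even distribution and minimum movement, can be illustrated with a small function that computes a new map of active vBuckets after nodes are added. This is a sketch of the idea only; the function name `rebalance` and the greedy strategy are assumptions, not the cluster manager's actual algorithm.

```python
from collections import Counter


def rebalance(active_map, new_servers):
    """Return a new vBucket -> server list after adding servers.

    Only vBuckets on over-loaded servers are moved, so the number of moves
    is the minimum needed to reach an even spread.
    """
    servers = sorted(set(active_map)) + list(new_servers)
    base, extra = divmod(len(active_map), len(servers))
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(servers)}
    load = Counter(active_map)

    # One slot per vBucket that an under-loaded server still needs.
    open_slots = iter(
        s for s in servers for _ in range(max(0, target[s] - load[s]))
    )

    new_map = list(active_map)
    for vb, owner in enumerate(active_map):
        if load[owner] > target[owner]:
            new_map[vb] = next(open_slots)  # move this vBucket to a server with room
            load[owner] -= 1
    return new_map


old_map = [f"Server {vb % 3 + 1}" for vb in range(1024)]
new_map = rebalance(old_map, ["Server 4", "Server 5"])
```

Every vBucket that changes owner in `new_map` goes to one of the two new servers; nothing is shuffled between the original three.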

Managing failures

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, access Servers 1 to 5. Server 1: active shards 5, 2, 9 and replica shards 4, 1, 8. Server 2: active shards 4, 7, 8 and replica shards 6, 3, 2. Server 3: active shards 1, 3, 6 and replica shards 7, 9, 5. Servers 4 and 5 also hold active and replica shards. After Server 3 fails, replicas of its shards (Shard 1, Shard 3, ...) are promoted to active.]

• App servers accessing shards

• Requests to Server 3 fail

• Cluster detects server failed
  o Promotes replicas of shards to active
  o Updates cluster map

• Requests for docs now go to appropriate server

• Typically rebalance would follow
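The failover steps can be sketched against the shard placement shown for Servers 1 to 3 in the diagram. The data structure and the function name `fail_over` are assumptions for illustration; a real cluster would also recreate the lost replicas during the rebalance that typically follows.

```python
# Shard placement as read from the slide diagram: shard -> active and replica server.
cluster_map = {
    1: {"active": "Server 3", "replica": "Server 1"},
    2: {"active": "Server 1", "replica": "Server 2"},
    3: {"active": "Server 3", "replica": "Server 2"},
    4: {"active": "Server 2", "replica": "Server 1"},
    5: {"active": "Server 1", "replica": "Server 3"},
    6: {"active": "Server 3", "replica": "Server 2"},
    7: {"active": "Server 2", "replica": "Server 3"},
    8: {"active": "Server 2", "replica": "Server 1"},
    9: {"active": "Server 1", "replica": "Server 3"},
}


def fail_over(cluster_map, failed):
    """Return an updated cluster map with the failed server removed."""
    updated = {}
    for shard, entry in cluster_map.items():
        active, replica = entry["active"], entry["replica"]
        if active == failed:
            # Promote the replica to active; the shard has no replica until rebalance.
            active, replica = replica, None
        elif replica == failed:
            replica = None
        updated[shard] = {"active": active, "replica": replica}
    return updated


after_failure = fail_over(cluster_map, "Server 3")
```

In `after_failure`, shard 1 is served by Server 1 and shards 3 and 6 by Server 2, so requests for those docs go to a live server once clients receive the updated map.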

A look at a live cluster


Cross Data Center Replication (XDCR)

Market leading memory-to-memory replication

[Diagram: replication between clusters in New York and San Francisco]

XDCR: Cross Data Center Replication

Application can access both clusters (master – master)

Scales out linearly

Different from intra-cluster replication (“CP” versus “AP”)


XDCR: Flexible topologies

One-one, one-many, many-one

Differently sized and resourced clusters supported


XDCR after Write

[Diagram: on a Couchbase Server node, the App Server writes Doc 1 to the Managed Cache; Doc 1 goes to the Disk Queue and Disk, to the Replication Queue (memory-to-memory replication to other node), and to the XDCR Queue ((New in 3.0) memory-to-memory replication to remote cluster)]

Indexing and Querying Features

Index and Query
• Distributed indexing and querying
• Secondary indexes of JSON document content
• Flexible querying of indexes

Incremental Map-Reduce
• Distributed simple real-time analytics
• Only considers changes due to updated data

Full Text Search
• Robust integration with ElasticSearch / Solr cluster
• Flexible full text search and faceted search
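The phrase "only considers changes due to updated data" can be made concrete with a small incremental view. This is a sketch in Python for consistency with the other examples, not how Couchbase views are written (the class `IncrementalCountView` is hypothetical). It remembers what each document emitted last time, so an update only adjusts the affected counts instead of recomputing over all documents.

```python
class IncrementalCountView:
    """Counts documents per emitted key, touching only the doc that changed."""

    def __init__(self, map_fn):
        self.map_fn = map_fn  # doc -> iterable of keys to emit
        self.emitted = {}     # doc id -> keys emitted on the last run
        self.counts = {}      # reduce result: key -> number of docs

    def update(self, doc_id, doc):
        """Apply one changed document; pass doc=None for a deletion."""
        # Undo the contribution of the previous version of this document.
        for key in self.emitted.pop(doc_id, []):
            self.counts[key] -= 1
            if self.counts[key] == 0:
                del self.counts[key]
        # Add the contribution of the new version.
        if doc is not None:
            keys = list(self.map_fn(doc))
            self.emitted[doc_id] = keys
            for key in keys:
                self.counts[key] = self.counts.get(key, 0) + 1


view = IncrementalCountView(lambda doc: doc.get("favorite_colors", []))
view.update("u1", {"favorite_colors": ["blue", "black"]})
view.update("u2", {"favorite_colors": ["blue"]})
```

At this point `view.counts` is `{"blue": 2, "black": 1}`; a later change to `u1` costs work proportional to that one document only.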

View processing after write

[Diagram: on a Couchbase Server node, the App Server writes Doc 1 to the Managed Cache; Doc 1 goes to the Replication Queue (to other node) and to the Disk Queue and Disk; the View engine processes Doc 1]

Couchbase Server Architecture - Views

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and cluster map, query Servers 1 to 3; each server holds active and replica shards]

• Indexing work is distributed amongst nodes

• Large data set possible

• Parallelize the effort

• Each node has index for data stored on it

• Queries combine the results from required nodes
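The scatter-gather pattern in the bullets above can be sketched as follows: each node answers from its own sorted index and the partial results are merged. The function name `query_view` and the list-of-tuples index layout are assumptions for illustration.

```python
import heapq


def query_view(node_indexes, start_key, end_key):
    """Query every node's local index for a key range and merge the results.

    Each index is a list of (key, doc_id) rows sorted by key, covering only
    the documents stored on that node.
    """
    partials = []
    for index in node_indexes:
        # Scatter: each node filters its own rows.
        partials.append([row for row in index if start_key <= row[0] <= end_key])
    # Gather: merge the already-sorted partial results into one sorted list.
    return list(heapq.merge(*partials))


node_a = [("black", "u1"), ("blue", "u1")]
node_b = [("blue", "u2"), ("red", "u3")]
rows = query_view([node_a, node_b], "blue", "red")
```

Because each partial result is already sorted, the merge step does not need to re-sort the combined rows.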


Couchbase Elastic Search Connector

Couchbase Solr Connector

Introduction to N1QL – SQL for Documents

Next generation, NoSQL query language

SQL-like : SELECT * FROM WHERE/LIKE/JOIN/GROUP/etc, CREATE INDEX

Extended for JSON to support nested and hierarchical data structures

Support for views and newly-developed secondary indexes

Query (DQL), Manipulation (DML), Description (DDL)

ODBC/JDBC drivers in development

Built into Couchbase Server:

Single installation package

Multi-threaded, stateless query and indexing components

Leverages high-performance, high-scale Couchbase buckets

Coming in 2015, preview at query.couchbase.com
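As an illustration of "SQL extended for JSON", the sketch below pairs a N1QL-style statement with a plain Python function that computes the same result over in-memory documents. The bucket name `users` and the function `run_equivalent` are hypothetical, the statement is held as a string rather than sent to a server, and the exact syntax accepted by a given N1QL release may differ.

```python
# A N1QL-style statement over the user documents shown earlier.
# ANY ... SATISFIES ... END reaches into the nested favorite_colors array.
STATEMENT = """
SELECT firstname, lastname
FROM users
WHERE age >= 21
  AND ANY c IN favorite_colors SATISFIES c = "blue" END
"""


def run_equivalent(docs):
    """Evaluate the same filter and projection in plain Python."""
    return [
        {"firstname": d.get("firstname"), "lastname": d.get("lastname")}
        for d in docs
        if d.get("age", 0) >= 21 and "blue" in d.get("favorite_colors", [])
    ]


docs = [
    {"firstname": "John", "lastname": "Smith", "age": 22,
     "favorite_colors": ["blue", "black"]},
    {"firstname": "Jane", "lastname": "Doe", "age": 19,
     "favorite_colors": ["blue"]},
    {"firstname": "Ann", "lastname": "Lee", "age": 40,
     "favorite_colors": ["red"]},
]
result = run_equivalent(docs)
```

Only John Smith satisfies both conditions, so `result` holds a single row with his first and last name.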


N1QL Architecture

Single node installation, services defined dynamically

Query service accesses Index and Data to formulate the response

All queries and direct access are topology-aware and dynamically scalable

Q & A

Thank you.

[email protected]
@dborkar