sarath-lakshman
Couchbase view engine internals
Overview
§ Overview of incremental Map/Reduce
§ Architecture of Couchbase views
§ Database Change Protocol (DCP) for Views
§ Index on-disk storage lower layer rewrites
§ Faster index updates and indexing latency
§ Read Your Own Writes (RYOW) for view queries
©2014 Couchbase, Inc. 2
What are Map/Reduce views?
• In Couchbase, Map-Reduce is specifically used to create indexes
• Map functions are applied to JSON documents and they output or "emit" data that is organized in an index form
Sample View
Map function
    function (doc, meta) {
      // Index only beer documents that have a brewery and a name.
      if (doc.type == "beer" && doc.brewery_id && doc.name) {
        emit(doc.name, doc.abv);
      }
    }
Reduce function
    function (keys, values, rereduce) {
      // On rereduce, values are maxima returned by earlier reduce calls;
      // otherwise they are the objects emitted by the map function.
      var maxscore = 0;
      for (var i = 0; i < values.length; i++) {
        var score = rereduce ? values[i] : values[i].score;
        if (score > maxscore) {
          maxscore = score;
        }
      }
      return maxscore;
    }
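To make the map, reduce, and rereduce steps concrete, here is a minimal sketch that runs them locally over a few made-up documents. The sample documents, the `buildIndex` helper, and the explicit `emit` parameter are hypothetical stand-ins for illustration only; this is not how the view engine itself is implemented.

```javascript
// Hypothetical sample documents; only the fields read by the map function matter.
const docs = [
  { id: "b1", doc: { type: "beer", brewery_id: "x", name: "Stout", abv: 7.5 } },
  { id: "b2", doc: { type: "beer", brewery_id: "y", name: "Ale", abv: 5.0 } },
  { id: "w1", doc: { type: "brewery", name: "X Brewing" } },
];

// Same shape as a view map function. emit is passed in explicitly here
// because no view engine is providing it as a global.
function mapBeer(doc, meta, emit) {
  if (doc.type == "beer" && doc.brewery_id && doc.name) {
    emit(doc.name, doc.abv);
  }
}

// Reduce to the maximum value. The inputs are plain numbers both on the
// first pass and on rereduce, so the same loop serves both cases.
function reduceMax(keys, values, rereduce) {
  let max = 0;
  for (const v of values) {
    if (v > max) max = v;
  }
  return max;
}

// Apply the map function to every document and return the emitted rows
// sorted by key (plain string comparison in this sketch).
function buildIndex(documents, mapFn) {
  const rows = [];
  for (const { id, doc } of documents) {
    mapFn(doc, { id }, (key, value) => rows.push({ id, key, value }));
  }
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

const rows = buildIndex(docs, mapBeer);
const values = rows.map((r) => r.value);

// Reduce two slices independently, then combine the partial results with a
// rereduce call, mimicking how partial reductions get merged.
const partials = [
  reduceMax(null, values.slice(0, 1), false),
  reduceMax(null, values.slice(1), false),
];
const maxAbv = reduceMax(null, partials, true);
```

The brewery document produces no row because the map function emits only for beer documents.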
Design documents
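§ A Couchbase bucket holds one or more design documents (in the diagram, Design Document 1 with two views and Design Document 2 with three views)
§ Indexers are allocated per design document
§ All views in a design document are updated at the same time
§ Views can only access data in the bucket namespace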
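As a rough illustration of how views are grouped, the sketch below models a design document as a plain object with a `views` map, where each view has a `map` function stored as a string and an optional `reduce`. The design document name `beer` and the view name `brewery_beers` come from the query URL in this deck; the second view `beer_count`, the map bodies, and the `viewNames` helper are assumptions made up for the example.

```javascript
// Sketch of a design document body. Map functions are stored as strings.
const beerDesignDoc = {
  views: {
    // View name taken from the query URL in this deck; the map body is the
    // sample map function from the slides, not the real beer-sample view.
    brewery_beers: {
      map: `function (doc, meta) {
        if (doc.type == "beer" && doc.brewery_id && doc.name) {
          emit(doc.name, doc.abv);
        }
      }`,
    },
    // Hypothetical second view using the built-in _count reduce.
    beer_count: {
      map: `function (doc, meta) {
        if (doc.type == "beer") {
          emit(doc.brewery_id, null);
        }
      }`,
      reduce: "_count",
    },
  },
};

// Hypothetical helper: list the views that share this design document and
// are therefore indexed together.
function viewNames(designDoc) {
  return Object.keys(designDoc.views);
}

const names = viewNames(beerDesignDoc);
```

Because all views in one design document are updated together, grouping rarely queried views with frequently queried ones means they are all rebuilt on the same schedule.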
Querying view indexes
http://localhost:8092/beer-sample/_design/beer/_view/brewery_beers?limit=10
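The URL above follows the pattern host, port 8092, bucket, design document, view, then query parameters. A minimal sketch of assembling such a URL is shown here; `buildViewUrl` is a hypothetical helper written for this example, not part of any Couchbase SDK, and it only builds the string without sending a request.

```javascript
// Hypothetical helper: build a view query URL from its parts.
function buildViewUrl(host, bucket, designDoc, view, params = {}) {
  const url = new URL(
    `http://${host}:8092/${encodeURIComponent(bucket)}` +
      `/_design/${encodeURIComponent(designDoc)}` +
      `/_view/${encodeURIComponent(view)}`
  );
  // Append each query option, for example limit=10.
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, String(value));
  }
  return url.toString();
}

const viewUrl = buildViewUrl("localhost", "beer-sample", "beer", "brewery_beers", {
  limit: 10,
});
```

With these arguments the helper reproduces the URL shown on the slide.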
Database Change Protocol and Views
§ Document updates are delivered to the view engine in near real time
§ Disk reads from the document storage engine are not required
§ Faster index creation
§ Lower indexing latencies
§ The view engine can roll back during node failures without a full index rebuild
Rewrite of index engine in storage layer
§ Lock contention and bottlenecks in the Erlang VM
§ The view engine needs to be faster in order to consume and process database changes through DCP
§ Rewritten index builder, index updater and index compactor
§ Further work will improve rebalance duration with views
Benchmark on indexing latency
Time taken (ms) for an updated document to get indexed in a view index
4 nodes, 1 bucket, 20M docs of size 2KB, 250 mutations/sec
§ Couchbase 2.5.1: 34916 ms
§ Couchbase 3.0: 359 ms
Query options for staleness / consistency
§ stale=ok (or true)
  • Lowest query latency
  • Returns the query results directly from index storage
  • By default, the index is incrementally updated every 5 seconds
§ stale=update_after (default staleness setting)
  • Similar to stale=ok
  • Forces the indexer to start an index update immediately as part of the query, ignoring the index update interval
§ stale=false
  • Higher query latency
  • Triggers an index update and waits until the index is updated at least up to the current state of the document store
  • Query results are returned only after the index is updated
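The three staleness settings trade query latency against freshness. The sketch below encodes that trade-off as a small lookup table and a chooser function. The table fields, `chooseStale`, and `staleQueryString` are hypothetical names invented for this example; only the three `stale` values themselves come from the slide.

```javascript
// What each stale value implies, as described on the slide.
const STALE_OPTIONS = {
  ok: { waitsForIndexUpdate: false, triggersIndexUpdate: false },
  update_after: { waitsForIndexUpdate: false, triggersIndexUpdate: true },
  "false": { waitsForIndexUpdate: true, triggersIndexUpdate: true },
};

// Hypothetical chooser: pick a stale value from two application needs.
function chooseStale({ needLatestData, refreshAfterQuery }) {
  if (needLatestData) return "false"; // wait for the index, higher latency
  return refreshAfterQuery ? "update_after" : "ok";
}

// Hypothetical helper: validate a stale value and render it as a query option.
function staleQueryString(stale) {
  if (!Object.prototype.hasOwnProperty.call(STALE_OPTIONS, stale)) {
    throw new RangeError(`unknown stale value: ${stale}`);
  }
  return `stale=${stale}`;
}

const forDashboard = chooseStale({ needLatestData: false, refreshAfterQuery: true });
const forReadYourWrites = chooseStale({ needLatestData: true, refreshAfterQuery: false });
```

A query that must see a write just made by the same client would use `stale=false`; a dashboard that tolerates slightly old results can use `update_after` or `ok`.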
Read Your Own Writes (RYOW)
§ Performing RYOW in Couchbase 2.5
  • stale=false only ensured that persisted documents were available in the view query results
  • The user had to ensure a document had persisted, using the Observe command, before querying
§ RYOW using Couchbase 3.0
  • stale=false ensures that query results are at least up to date as of the time of the request
§ How the view engine ensures at-least point-in-time consistency
  • The document dataset is partitioned into smaller sets across the cluster
  • Each partition has a sequence number that is incremented on every update operation on a document
  • During a stale=false query, the view engine notes the current sequence numbers of all partitions and waits until the index is updated at least up to those sequence numbers
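The sequence-number wait can be sketched as a snapshot plus a comparison. In the toy model below, partitions and their sequence numbers live in plain `Map` objects; `snapshotSeqnos`, `indexCaughtUp`, and the sample numbers are all hypothetical and stand in for whatever the view engine does internally.

```javascript
// Copy the current per-partition sequence numbers at query time.
// Later writes change the live map but not this snapshot.
function snapshotSeqnos(partitionSeqnos) {
  return new Map(partitionSeqnos);
}

// The index has caught up when every partition has been indexed at least
// up to the sequence number recorded in the snapshot.
function indexCaughtUp(targetSeqnos, indexedSeqnos) {
  for (const [partition, seqno] of targetSeqnos) {
    const indexed = indexedSeqnos.has(partition) ? indexedSeqnos.get(partition) : 0;
    if (indexed < seqno) return false;
  }
  return true;
}

// Toy state: partition id -> sequence number.
const dataSeqnos = new Map([[0, 12], [1, 7]]);
const indexedSeqnos = new Map([[0, 10], [1, 7]]);

// A stale=false query arrives: record the target sequence numbers.
const target = snapshotSeqnos(dataSeqnos);

// A write that lands after the query started is not part of the target.
dataSeqnos.set(1, 9);

const readyBefore = indexCaughtUp(target, indexedSeqnos); // partition 0 is behind

// The indexer processes partition 0 up to sequence number 12.
indexedSeqnos.set(0, 12);
const readyAfter = indexCaughtUp(target, indexedSeqnos);
```

Taking the snapshot first is what bounds the wait: the query only waits for writes that existed when it arrived, not for writes that keep arriving afterwards.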