Nibiru: Building your own NoSQL store


Building a NoSQL from scratch
Let them know what they are missing!

#ddtx16 @edwardcapriolo @HuffPostCode

If you are looking for

A battle tested NoSQL data store
That scales up to 1 million transactions a second
Allows you to query data from your IoT sensors in real time
You are at the wrong talk!

This is a presentation about Nibiru, an open source database I work on in my spare time. But you should stay anyway...

Motivations

Why do this? How did it get started? What did it morph into?

Many NoSQL databases came out of an industry specific use case, and as a result they have baked-in assumptions. With clean interfaces and good abstractions we can build a better general tool with fewer forced choices, potentially supporting a majority of the use cases in one tool.


A friend asked

Won't this make Nibiru have all the bugs of all the systems?


My response

Jerk!


You might want to follow along with a local copy

There are a lot of slides that have a fair amount of code:
https://github.com/edwardcapriolo/nibiru/blob/master/hexagons.ppt
http://bit.ly/1NcAoEO

Basics


Terminology

Keyspace: a logical grouping of store(s)
Store: a structure that holds data (avoided: Column Family, Table, Collection, etc.)
Node: a system
Cluster: a group of nodes

Assumptions & Design notes

A store is of a specific type: Key Value, Column Family, etc.
The API of the store is dictated by its type
Ample gotchas from a one-man, after-work project
Wire components together, not into a large context
Using String (for now) instead of byte[] to ease debugging

Server ID

We need to uniquely identify each node. Hostname/IP is not a good solution:

Systems have multiple
They can change
We should be able to run N copies on a single node

Implementation

On first init(), create a GUID and persist it; reuse it on every subsequent start
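A minimal sketch of that idea in Java (the ServerId class and id-file handling here are illustrative, not Nibiru's actual code):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.UUID;

    // Illustrative sketch: generate a node id on first start, persist it,
    // and reuse it on every restart.
    public class ServerId {
        private final Path idFile;

        public ServerId(Path idFile) {
            this.idFile = idFile;
        }

        public String getOrCreate() throws IOException {
            if (Files.exists(idFile)) {
                return new String(Files.readAllBytes(idFile)).trim();
            }
            String id = UUID.randomUUID().toString();
            Files.write(idFile, id.getBytes()); // persist so restarts keep the same id
            return id;
        }
    }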


Cluster Membership

What is the list of nodes in the cluster? What is the up/down state of each node?

Static Membership
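This was presumably a code slide in the deck; as a stand-in, here is a minimal sketch of a static, configuration-driven membership. The ClusterMembership interface and class names are assumptions for illustration, not Nibiru's actual API:

    import java.util.List;

    // What the rest of the system asks of any membership implementation.
    interface ClusterMembership {
        List<String> getAllMembers();   // every known node id
        List<String> getLiveMembers();  // nodes currently considered up
    }

    // Static membership: the node list comes from configuration, never changes,
    // and every configured node is assumed to be up.
    class StaticClusterMembership implements ClusterMembership {
        private final List<String> nodes;

        StaticClusterMembership(List<String> configuredNodes) {
            this.nodes = List.copyOf(configuredNodes);
        }

        @Override public List<String> getAllMembers() { return nodes; }
        @Override public List<String> getLiveMembers() { return nodes; }
    }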


Different cluster membership models

Consensus/Gossip: Cassandra, Elasticsearch
Master node/Someone else's problem: HBase (ZooKeeper)

Gossip

Image: http://www.joshclemm.com/projects/


Teknek Gossip

Licensed Apache v2
Forked from a Google Code project
Available from Maven: groupId io.teknek, artifactId gossip
Great tool for building a peer-to-peer service

Cluster Membership using Gossip


Get Live Members
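A hedged sketch of the idea behind this code slide, assuming the gossip layer tracks a last-heartbeat timestamp per member. This heartbeat map is hypothetical; it is not the teknek-gossip API:

    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    // Gossip-backed membership: a node is "live" if we heard from it recently.
    class GossipMembership implements ClusterMembership {
        private final Map<String, Long> lastHeartbeat; // node id -> last heartbeat millis
        private final long deadAfterMs;

        GossipMembership(Map<String, Long> lastHeartbeat, long deadAfterMs) {
            this.lastHeartbeat = lastHeartbeat;
            this.deadAfterMs = deadAfterMs;
        }

        @Override public List<String> getAllMembers() {
            return List.copyOf(lastHeartbeat.keySet());
        }

        @Override public List<String> getLiveMembers() {
            long now = System.currentTimeMillis();
            return lastHeartbeat.entrySet().stream()
                    .filter(e -> now - e.getValue() < deadAfterMs)
                    .map(Map.Entry::getKey)
                    .collect(Collectors.toList());
        }
    }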


Gut check

Did clean abstractions hurt the design here? Does it seem possible we could add a ZooKeeper/etcd backend implementation? Any takers? :)

Request Routing


Some options

So you have a bunch of nodes in a cluster, but where the heck does the data go?

Client dictated: like a sharded memcache|mysql|whatever
HBase: sharding with a leader election
Dynamo style: ring topology with token ownership

Router & Partitioners
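Another code slide; a sketch of how routing could be split into two small abstractions (interface names illustrative): a Partitioner that turns a row key into a token, and a Router that turns a key into destination node(s):

    import java.util.List;

    // Maps a row key onto a token in the ring.
    interface Partitioner {
        long token(String rowKey);
    }

    // Decides which node(s) a request for a key should be sent to.
    interface Router {
        List<String> route(String rowKey, ClusterMembership membership, Partitioner partitioner);
    }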


Pick your poison: no hot spots or key locality :)


Quick example LocalPartitioner
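A sketch of the local case: everything routes to the local node, with no hashing and no ring, which is all a single-node store needs (the exact Nibiru classes may differ):

    import java.util.Collections;
    import java.util.List;

    // Degenerate router: all data lives on this node.
    class LocalRouter implements Router {
        private final String localId;

        LocalRouter(String localId) {
            this.localId = localId;
        }

        @Override
        public List<String> route(String rowKey, ClusterMembership membership, Partitioner partitioner) {
            return Collections.singletonList(localId);
        }
    }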


Scenario: using a Dynamo-ish router

Construct a three node topology
Give each an id
Give them each a token
Test that requests route properly

Cluster and Token information


Unit Test


Token Router
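A hedged sketch of the Dynamo-style idea behind this code slide: each node owns a token, and a key routes to the first node whose token is greater than or equal to the key's token, wrapping around the ring. Class names and the example tokens are illustrative:

    import java.util.Collections;
    import java.util.List;
    import java.util.SortedMap;
    import java.util.TreeMap;

    // Ring of token -> node id; a key goes to the first token >= its own.
    class TokenRouter implements Router {
        private final TreeMap<Long, String> ring = new TreeMap<>();

        void addNode(long token, String nodeId) {
            ring.put(token, nodeId);
        }

        @Override
        public List<String> route(String rowKey, ClusterMembership membership, Partitioner partitioner) {
            SortedMap<Long, String> tail = ring.tailMap(partitioner.token(rowKey));
            // Past the highest token? Wrap around to the first node on the ring.
            Long owner = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
            return Collections.singletonList(ring.get(owner));
        }
    }

    // Mirroring the three node scenario above (illustrative tokens):
    // router.addNode(10, "node-a"); router.addNode(20, "node-b"); router.addNode(30, "node-c");
    // A key with token 15 routes to node-b; token 35 wraps around to node-a.

For the replication variant on the next slide, the natural extension is to return the owner plus the next N-1 distinct nodes walking clockwise around the ring.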


Do the Damn Thing!


Do the Damn Thing! With Replication


Storage Layer


Basic Data Storage: SSTables

SS = Sorted String
{ 'a', $PAYLOAD$ }, { 'b', $PAYLOAD$ }

LevelDB SSTable payload

Key Value implementation
SortedMap<byte[], byte[]>

{ 'a', '1' }, { 'b', '2' }

Cassandra SSTable Implementation

Key Value in which the value is a map with last-update-wins versioning
SortedMap<byte[], SortedMap<byte[], Val<byte[], long>>>

{ 'a', { 'col': { 'val', 1 } } },
{ 'b', { 'col1': { 'val', 1 },
         'col2': { 'val2', 2 } } }

HBase SSTable Implementation

Key Value in which the value is a map with multi-versioning
SortedMap<byte[], SortedMap<byte[], Val<byte[], long>>>

{ 'a', { 'col': { 'val', 1 } } },
{ 'b', { 'col1': { 'val', 1 },
         'col1': { 'valb', 2 },
         'col2': { 'val2', 2 } } }

Column Family Store high level


Operations to support


One possible memtable implementation

Holy generics, Batman! Isn't it just a map of maps?

Unfortunately no!

Imagine two requests arrive in this order:
set people [edward] [age]='34' (Time 2)
set people [edward] [age]='35' (Time 1)

What should the final value be? We need to deal with events landing out of order. Deletes are also writes, recorded as markers known as tombstones.
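A sketch of a last-write-wins cell, in the spirit of the Val value-plus-timestamp pairs from the SSTable slides (all names illustrative):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Memtable with last-write-wins resolution: rowKey -> (column -> value + time).
    class Memtable {
        static class Val {
            final String value; // null means tombstone (delete marker)
            final long time;
            Val(String value, long time) { this.value = value; this.time = time; }
        }

        private final Map<String, Map<String, Val>> rows = new ConcurrentHashMap<>();

        void set(String rowKey, String column, String value, long time) {
            Map<String, Val> row = rows.computeIfAbsent(rowKey, k -> new ConcurrentHashMap<>());
            // Keep the write with the highest timestamp, so events landing
            // out of order still resolve to Time 2's value.
            row.merge(column, new Val(value, time),
                    (oldV, newV) -> newV.time > oldV.time ? newV : oldV);
        }

        void delete(String rowKey, String column, long time) {
            set(rowKey, column, null, time); // tombstone, resolved like any write
        }
    }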

And then, there is concurrency

Multiple threads manipulate the memtable at the same time. Proposed solution (which I think is correct): do not compare-and-swap the value; instead, append to a queue and take a second pass to optimize.
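A sketch of that proposal as I read it: writers only ever append, and resolution happens on read plus a second optimizing pass (all names illustrative):

    import java.util.Comparator;
    import java.util.concurrent.ConcurrentLinkedQueue;

    // Writers never contend on one value slot: they append versions to a queue.
    class VersionedCell {
        private final ConcurrentLinkedQueue<Memtable.Val> versions = new ConcurrentLinkedQueue<>();

        void write(Memtable.Val v) {
            versions.add(v); // no compare-and-swap loop, just an append
        }

        Memtable.Val read() {
            // Resolve on read: highest timestamp wins.
            return versions.stream()
                    .max(Comparator.comparingLong(v -> v.time))
                    .orElse(null);
        }

        // Second pass: drop versions that can no longer win.
        void optimize() {
            Memtable.Val winner = read();
            if (winner != null) {
                versions.removeIf(v -> v.time < winner.time);
            }
        }
    }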


Optimization 1: BloomFilters

Use Guava. Smart!
Audience: make a disappointed 'aww' sound because Ed did not write it himself
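Minimal Guava usage for the SSTable case: ask "is this key definitely absent?" before touching disk. The expected-insertion count and false-positive rate below are made-up illustrative values:

    import com.google.common.hash.BloomFilter;
    import com.google.common.hash.Funnels;
    import java.nio.charset.StandardCharsets;

    public class BloomExample {
        public static void main(String[] args) {
            // Sized for 10,000 keys with a 1% false-positive rate (illustrative).
            BloomFilter<CharSequence> filter =
                    BloomFilter.create(Funnels.stringFunnel(StandardCharsets.UTF_8), 10_000, 0.01);

            filter.put("edward");

            filter.mightContain("edward"); // true
            filter.mightContain("nadia");  // false means definitely absent: skip the disk read
        }
    }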

Optimization 2: IndexWriter

Seeking on disk is not like seeking in memory: we need an index so reads do not scan the whole file.
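One common approach, sketched here as an assumption about what an IndexWriter does (not necessarily Nibiru's exact design): record the file offset of every Nth key while writing the SSTable, then seek to the nearest indexed key and scan a short stretch:

    import java.util.Map;
    import java.util.TreeMap;

    // Sparse index: every Nth key -> byte offset in the SSTable file.
    class IndexWriter {
        private final TreeMap<String, Long> index = new TreeMap<>();
        private final int interval;
        private long count = 0;

        IndexWriter(int interval) {
            this.interval = interval;
        }

        void onKeyWritten(String key, long fileOffset) {
            if (count++ % interval == 0) {
                index.put(key, fileOffset);
            }
        }

        // Start scanning from the nearest indexed key at or before the target.
        long seekOffsetFor(String key) {
            Map.Entry<String, Long> entry = index.floorEntry(key);
            return entry == null ? 0L : entry.getValue();
        }
    }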

Consistency


Multinode Consistency

Replication: the number of places data lives
Active/Active
Master/Slave (with takeover)
Resolving conflicted data

Quorum Consistency: an Active/Active Implementation

Message dispatched


Asynchronous Responses T1


Asynchronous Responses T2


Logic to merge results


Breakdown of components

Start & deadline: the maximum time to wait for responses
Message: the read/write request sent to each destination
Merger: turns multiple responses into a single result
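Pulling those pieces together, a hedged sketch of a quorum coordinator: dispatch the message, collect asynchronous responses until enough arrive or the deadline passes, then hand them to the merger. The Merger interface and timing logic are illustrative, not Nibiru's actual classes:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    // Turn multiple responses into a single result.
    interface Merger<R> {
        R merge(List<R> responses);
    }

    class QuorumCoordinator<R> {
        private final LinkedBlockingQueue<R> responses = new LinkedBlockingQueue<>();

        // Called by the messaging layer as each destination answers (T1, T2, ...).
        void onResponse(R response) {
            responses.add(response);
        }

        // Block until `needed` responses arrive or the deadline passes.
        R await(int needed, long deadlineMs, Merger<R> merger)
                throws InterruptedException, TimeoutException {
            long start = System.currentTimeMillis();
            List<R> collected = new ArrayList<>();
            while (collected.size() < needed) {
                long remaining = (start + deadlineMs) - System.currentTimeMillis();
                if (remaining <= 0) {
                    throw new TimeoutException("quorum not reached before deadline");
                }
                R r = responses.poll(remaining, TimeUnit.MILLISECONDS);
                if (r != null) {
                    collected.add(r);
                }
            }
            return merger.merge(collected);
        }
    }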

Testing


Challenges of timing in testing

Target goal is ~80% unit, 20% integration (e2e) testing
Performance varies locally vs on travis-ci
Hard to test something that typically happens in milliseconds but in the worst case can take seconds
Lazy half-solution: Thread.sleep() statements sized for the worst case
Definitely a slippery slope
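The alternative to sleeping for the worst case is to poll until the condition holds or a deadline passes, which is the itch TUnit (next slide) scratches. A generic sketch of the pattern, not TUnit's actual API:

    import java.util.function.Supplier;

    // Poll until the condition is true or the deadline passes. Fast machines
    // finish in milliseconds; slow CI machines get the full worst-case budget.
    class Await {
        static void until(Supplier<Boolean> condition, long timeoutMs) throws InterruptedException {
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (!condition.get()) {
                if (System.currentTimeMillis() > deadline) {
                    throw new AssertionError("condition not met within " + timeoutMs + " ms");
                }
                Thread.sleep(10); // brief backoff between checks
            }
        }
    }

    // Usage in a test:
    // Await.until(() -> membership.getLiveMembers().size() == 3, 10_000);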

Introducing TUnit

https://github.com/edwardcapriolo/tunit


The End
