Introduction to new high performance storage engines in MongoDB 2.8 (released as 3.0)

Henrik Ingo, Solutions Architect, MongoDB

Hi, I am Henrik Ingo

@h_ingo

Agenda:

- MongoDB and NoSQL
- Storage Engine API
- WiredTiger configuration + performance

Most popular NoSQL database

5 NoSQL categories

• Key Value: Redis, Riak
• Wide Column: Cassandra
• Document: MongoDB
• Graph: Neo4j
• Map Reduce: Hadoop

MongoDB is a Document Database

MongoDB document:

    {
      first_name: 'Paul',
      surname: 'Miller',
      city: 'London',
      location: [45.123, 47.232],
      cars: [
        { model: 'Bentley', year: 1973, value: 100000, … },
        { model: 'Rolls Royce', year: 1965, value: 330000, … }
      ]
    }

Rich Queries
• Find Paul’s cars
• Find everybody in London with a car built between 1970 and 1980

Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.

Text Search
• Find all the cars described as having leather seats

Aggregation
• Calculate the average value of Paul’s car collection

Map Reduce
• What is the ownership pattern of colors by geography over time? (is purple trending up in China?)
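The query types on this slide could be written roughly as follows. This is a sketch only: the collection name `owners`, the `cars.description` field and the Trafalgar Square coordinates are assumptions for illustration. The filters are plain objects, so the snippet also runs outside the shell (Node.js or a current mongosh).

```javascript
// Hypothetical collection "owners" holding documents shaped like Paul's.

// Rich query: Paul's cars (the projection returns only the cars array).
const paulFilter = { first_name: "Paul" };
const carsOnly = { cars: 1, _id: 0 };

// Rich query: everybody in London with a car built between 1970 and 1980.
const londonSeventies = {
  city: "London",
  cars: { $elemMatch: { year: { $gte: 1970, $lte: 1980 } } },
};

// Geospatial: owners within 5 km of Trafalgar Sq.
// Approximate [longitude, latitude]; $centerSphere expects a radius in radians.
const EARTH_RADIUS_KM = 6378.1;
const nearTrafalgar = {
  location: {
    $geoWithin: { $centerSphere: [[-0.128, 51.508], 5 / EARTH_RADIUS_KM] },
  },
};

// Text search: assumes a text index on the hypothetical field "cars.description".
const leatherSeats = { $text: { $search: "\"leather seats\"" } };

// Aggregation: average value of Paul's car collection.
const avgValuePipeline = [
  { $match: { first_name: "Paul", surname: "Miller" } },
  { $unwind: "$cars" },
  { $group: { _id: "$_id", avgValue: { $avg: "$cars.value" } } },
];

// Runs only inside a connected shell; elsewhere the objects are just built.
if (typeof db !== "undefined") {
  db.owners.find(paulFilter, carsOnly).forEach(printjson);
  db.owners.find(londonSeventies).forEach(printjson);
  db.owners.find(nearTrafalgar).forEach(printjson);
  db.owners.find(leatherSeats).forEach(printjson);
  db.owners.aggregate(avgValuePipeline).forEach(printjson);
}
```

Map Reduce is left out here; the trend question on the slide needs a time field that the sample document does not show.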

Operational Database Landscape

MongoDB 3.0 & storage engines

Current state in MongoDB 2.6

Read-heavy apps
• Great performance
  – B-tree
  – Low overhead
• Good scale-out perf
  – Secondary reads
  – Sharding

Write-heavy apps
• Good scale-out perf
  – Sharding
• Per-node efficiency wish-list:
  – Doc level locking
  – Write-optimized data structures (LSM)
  – Compression

Other
• Complex transactions
• In-memory engine
• SSD optimized engine
• etc...

How to get all of the above?

MongoDB 3.0 Storage Engine API

• MMAP: Read-heavy app
• WiredTiger: Write-heavy app
• 3rd party: Special app

• One at a time:
  – Many engines built into mongod
  – Choose 1 at startup
  – All data stored by the same engine
  – Incompatible on-disk data formats (obviously)
  – Compatible client API
• Compatible Oplog & Replication
  – Same replica set can mix different engines
  – No-downtime migration possible
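Because a replica set can mix engines, a no-downtime migration can be done one member at a time. A minimal sketch of the ordering, assuming the usual rolling approach (secondaries first, primary last); the helper name and host names are hypothetical, and the member objects only mimic the `name` and `stateStr` fields of `rs.status().members`.

```javascript
// Hypothetical helper: order in which to convert replica set members.
// `members` mirrors the shape of rs.status().members (name, stateStr).
function migrationOrder(members) {
  const secondaries = members.filter((m) => m.stateStr === "SECONDARY");
  const primaries = members.filter((m) => m.stateStr === "PRIMARY");
  return secondaries.concat(primaries).map((m) => m.name);
}

// Hypothetical host names.
const sampleMembers = [
  { name: "db1:27017", stateStr: "PRIMARY" },
  { name: "db2:27017", stateStr: "SECONDARY" },
  { name: "db3:27017", stateStr: "SECONDARY" },
];

for (const host of migrationOrder(sampleMembers)) {
  // Per member: stop mongod, start it with --storageEngine wiredTiger on an
  // empty dbpath, and wait for initial sync to finish before moving on.
  // Step the primary down (rs.stepDown()) before converting it.
  console.log(host);
}
```

The on-disk formats are incompatible, which is why each member resyncs its data from the rest of the set instead of reusing its old files.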

Some existing engines

• MMAPv1
  – Improved MMAP (collection-level locking)
• WiredTiger
  – Discussed next
• RocksDB
  – LSM style engine developed by Facebook
  – Based on LevelDB
• TokuMXse
  – Fractal Tree indexing engine from Tokutek

Some rumored engines

• Heap
  – In-memory engine
• Devnull
  – Write all data to /dev/null
  – Based on idea from famous flash animation...
  – Oplog stored as normal
• SSD optimized engine (e.g. Fusion-IO)
• KV simple key-value engine

https://github.com/mongodb/mongo/tree/master/src/mongo/db/storage

WiredTiger

What is WiredTiger

• Modern NoSQL database engine
  – flexible schema
• Advanced database engine
  – Secondary indexes, MVCC, non-locking algorithms
  – Multi-statement transactions (not in MongoDB 3.0)
• Very modular, tunable
  – Btree, LSM and columnar indexes
  – Snappy, Zlib, 3rd-party compression
  – Index prefix compression, etc...
• Built by creators of BerkeleyDB
• Acquired by MongoDB in 2014
• source.wiredtiger.com

Choosing WiredTiger at server startup

mongod --storageEngine wiredTiger

http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine

Main tunables exposed as MongoDB options

mongod --storageEngine wiredTiger \
  --wiredTigerCacheSizeGB 8 \
  --wiredTigerDirectoryForIndexes /data/indexes \
  --wiredTigerCollectionBlockCompressor zlib \
  --syncDelay 30

http://docs.mongodb.org/master/reference/program/mongod/#cmdoption--storageEngine

All WiredTiger options via configString (hidden)

mongod --storageEngine wiredTiger \
  --wiredTigerEngineConfigString "cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)" \
  --wiredTigerCollectionConfigString "block_compressor=zlib" \
  --wiredTigerIndexConfigString "type=lsm,block_compressor=zlib" \
  --wiredTigerDirectoryForIndexes /data/indexes

See docs for wiredtiger_open() & WT_SESSION::create():
http://source.wiredtiger.com/2.5.0/group__wt.html#ga9e6adae3fc6964ef837a62795c7840ed
http://source.wiredtiger.com/2.5.0/struct_w_t___s_e_s_s_i_o_n.html#a358ca4141d59c345f401c58501276bbb
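WiredTiger config strings are comma-separated `key=value` pairs, with nested groups in parentheses. A small sketch of a serializer for that syntax; the helper `toConfigString` is hypothetical and not part of MongoDB or WiredTiger, and it does no validation of option names.

```javascript
// Hypothetical helper: turn a nested options object into WiredTiger's
// "key=value,group=(key=value,...)" configuration syntax.
function toConfigString(options) {
  return Object.entries(options)
    .map(([key, value]) =>
      value !== null && typeof value === "object"
        ? `${key}=(${toConfigString(value)})`
        : `${key}=${value}`
    )
    .join(",");
}

// Same engine settings as on the slide.
const engineConfig = toConfigString({
  cache_size: "8GB",
  eviction: { threads_min: 4, threads_max: 8 },
  checkpoint: { wait: 30 },
});

console.log(engineConfig);
// prints cache_size=8GB,eviction=(threads_min=4,threads_max=8),checkpoint=(wait=30)
```

The resulting string is what would be passed to --wiredTigerEngineConfigString.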

Also via createCollection(), createIndex()

db.createCollection( "users", { storageEngine: { wiredTiger: { configString: "block_compressor=none" } } } )

http://docs.mongodb.org/master/reference/method/db.createCollection/#db.createCollection
http://docs.mongodb.org/master/reference/method/db.collection.createIndex/#db.collection.createIndex
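The createIndex() variant takes the same `storageEngine` document as its options argument. A minimal sketch, assuming an index on `surname` in the `users` collection and `prefix_compression=true` as the per-index setting; the helper `wiredTigerOptions` is hypothetical.

```javascript
// Hypothetical helper: wrap a WiredTiger config string in the options
// document that createCollection() and createIndex() accept.
function wiredTigerOptions(configString) {
  return { storageEngine: { wiredTiger: { configString: configString } } };
}

const collectionOptions = wiredTigerOptions("block_compressor=none");
const indexOptions = wiredTigerOptions("prefix_compression=true");

// Runs only inside a connected shell; elsewhere the objects are just built.
if (typeof db !== "undefined") {
  db.createCollection("users", collectionOptions);
  db.users.createIndex({ surname: 1 }, indexOptions);
}
```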

More...

• db.serverStatus()
• db.collection.stats()
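One thing to read from db.serverStatus() is how full the WiredTiger cache is. A minimal sketch using two counters from the `wiredTiger.cache` section; the helper name and the sample numbers are made up for illustration.

```javascript
// Hypothetical helper: fraction of the configured WiredTiger cache in use.
function cacheFillRatio(status) {
  const cache = status.wiredTiger.cache;
  return cache["bytes currently in the cache"] / cache["maximum bytes configured"];
}

// Made-up sample shaped like the relevant part of db.serverStatus().
const GB = 1024 * 1024 * 1024;
const sampleStatus = {
  wiredTiger: {
    cache: {
      "bytes currently in the cache": 6 * GB,
      "maximum bytes configured": 8 * GB,
    },
  },
};

console.log(cacheFillRatio(sampleStatus)); // prints 0.75

// Runs only inside a connected shell with the WiredTiger engine.
if (typeof db !== "undefined") {
  print(cacheFillRatio(db.serverStatus()));
}
```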

Understanding and Optimizing WiredTiger

Understanding WiredTiger architecture

WiredTiger SE
  • Btree, LSM, Columnar
  • Cache (default: 50%)
  • None, Snappy, Zlib
OS Disk Cache (Default: 50%)
Physical disk

Covering 90% of your optimization needs

WiredTiger SE
  • Btree, LSM, Columnar
  • Cache (default: 50%)
  • None, Snappy, Zlib
OS Disk Cache (Default: 50%)
Physical disk

• Decompression time
• Disk seek time

Strategy 1: fit working set in Cache

• cache_size = 80%

Strategy 2: fit working set in OS Disk Cache

• cache_size = 10%
• OS Disk Cache (Remaining: 90%)
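Strategies 1 and 2 differ only in how RAM is split between the WiredTiger cache and the OS disk cache. A small sketch of the arithmetic, assuming the percentages refer to total RAM and that --wiredTigerCacheSizeGB takes a whole number of GB; the helper names and the 64 GB machine are hypothetical.

```javascript
// Hypothetical helper: split RAM between the WiredTiger cache and the rest
// (OS disk cache plus everything else running on the machine).
function splitRam(totalRamGB, cacheFraction) {
  const wiredTigerCacheGB = Math.max(1, Math.floor(totalRamGB * cacheFraction));
  return { wiredTigerCacheGB, remainingGB: totalRamGB - wiredTigerCacheGB };
}

// Hypothetical helper: the matching mongod command line.
function mongodCommand(totalRamGB, cacheFraction) {
  const { wiredTigerCacheGB } = splitRam(totalRamGB, cacheFraction);
  return `mongod --storageEngine wiredTiger --wiredTigerCacheSizeGB ${wiredTigerCacheGB}`;
}

// Strategy 1 on a 64 GB machine: most RAM goes to the WiredTiger cache.
console.log(mongodCommand(64, 0.8));
// Strategy 2 on the same machine: most RAM is left to the OS disk cache.
console.log(mongodCommand(64, 0.1));
```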

Strategy 3: SSD disk + compression to save €

• Physical disk: SSD

Strategy 4: SSD disk (no compression)

• Physical disk: SSD

What problem is solved by LSM indexes?

Performance
• Fast reads. Easy: Add indexes
• Fast writes. Easy: No indexes
• Both. Hard: Smart schema design (hire a consultant), LSM index structures (or columnar)
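To show why an LSM structure keeps writes cheap while still serving reads, here is a toy log-structured merge tree. It is an illustration only and not how WiredTiger implements LSM; the class name, the flush threshold and the in-memory "runs" are all simplifications chosen for this sketch.

```javascript
// Toy LSM tree: writes go to a small in-memory table that is periodically
// flushed into immutable sorted runs; reads consult newest data first.
const byKey = (a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0);

// Binary search in a sorted run of [key, value] pairs.
function searchRun(run, key) {
  let lo = 0;
  let hi = run.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (run[mid][0] === key) return run[mid][1];
    if (run[mid][0] < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return undefined;
}

class ToyLsm {
  constructor(memtableLimit = 4) {
    this.memtableLimit = memtableLimit; // hypothetical flush threshold
    this.memtable = new Map(); // recent writes
    this.runs = []; // immutable sorted runs, newest first
  }

  // A write only touches the memtable; no existing run is rewritten.
  put(key, value) {
    this.memtable.set(key, value);
    if (this.memtable.size >= this.memtableLimit) this.flush();
  }

  // Freeze the memtable into one sorted run.
  flush() {
    if (this.memtable.size === 0) return;
    this.runs.unshift([...this.memtable.entries()].sort(byKey));
    this.memtable = new Map();
  }

  // A read checks the memtable, then each run from newest to oldest.
  get(key) {
    if (this.memtable.has(key)) return this.memtable.get(key);
    for (const run of this.runs) {
      const hit = searchRun(run, key);
      if (hit !== undefined) return hit;
    }
    return undefined;
  }

  // Merge all runs into one so reads have fewer places to look.
  compact() {
    const merged = new Map();
    for (let i = this.runs.length - 1; i >= 0; i--) {
      for (const [k, v] of this.runs[i]) merged.set(k, v); // newer runs overwrite
    }
    this.runs = merged.size ? [[...merged.entries()].sort(byKey)] : [];
  }
}
```

The trade-off is visible in `get()`: a read may have to look in several runs, which is why compaction merges them in the background in real engines.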

2B inserts (with 3 secondary indexes)

http://smalldatum.blogspot.fi/2014/12/read-modify-write-optimized.html