.NET User Group Bern Roger Rudin bbv Software Services AG [email protected]



Agenda

– What is NoSQL

– Understanding the Motivation behind NoSQL

– MongoDB: A Document Oriented Database

– NoSQL Use Cases


What is NoSQL?

NoSQL = Not only SQL

NoSQL Definition http://nosql-database.org/

NoSQL DEFINITION: Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable. The original intention has been modern web-scale databases. The movement began early 2009 and is growing rapidly. Often more characteristics apply, such as: schema-free, easy replication support, simple API, eventually consistent / BASE (not ACID), a huge data amount, and more. So the misleading term "nosql" (the community now translates it mostly with "not only sql") should be seen as an alias to something like the definition above.

Who Uses NoSQL?

• Twitter uses FlockDB/MySQL and Cassandra
• Cassandra is an open source project that originated at Facebook
• Digg and Reddit use Cassandra
• bit.ly, foursquare, SourceForge, and the New York Times use MongoDB
• Adobe, Alibaba, and eBay use Hadoop

UNDERSTANDING THE MOTIVATION BEHIND NOSQL

Why SQL sucks..

• O/R mapping (also known as impedance mismatch)
• Data-model changes are hard and expensive
• SQL databases are designed for high throughput, not low latency
• SQL databases do not scale out well
• Microsoft, Oracle, and IBM charge big bucks for databases
  – And then you need to hire a database admin
• Take it from the context of Google, Twitter, Facebook and Amazon.
  – Your databases are among the biggest in the world and nobody pays you for that feature
  – Wasting profit!!!

What has NoSQL done?

• Implemented the most common use cases as a piece of software
• Designed for scalability and performance


Visual Guide To NoSQL http://blog.nahurst.com/visual-guide-to-nosql-systems


NoSQL Data Models

• Key-Value

• Document-Oriented

• Column Oriented/Tabular

MONGODB: A DOCUMENT ORIENTED DATABASE

NoSQL Data Model: Document Oriented

• Data is stored as "documents"
  • We are not talking about Word documents
  • Comparable to Aggregates in DDD
• It means mostly schema-free structured data
  • Can be queried
  • Is easily mapped to OO systems (Domain Model, DDD)
  • No joins: relations between documents have to be resolved in application code
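To make the aggregate idea concrete, here is a minimal sketch in plain JavaScript with no database involved. It models a blog post that embeds its comments, so reading the post needs no join. The field names and the `findByTag` helper are invented for illustration and are not part of any MongoDB API.

```javascript
// A blog post stored as one document: the comments are embedded,
// so reading the post needs no join. All field names are illustrative.
const posts = [
  {
    _id: 1,
    title: "Hello NoSQL",
    tags: ["nosql", "mongodb"],
    comments: [
      { author: "anna", text: "Nice intro" },
      { author: "ben", text: "What about joins?" },
    ],
  },
  // Schema free: this document has no comments field, but an extra draft flag.
  { _id: 2, title: "Capped collections", tags: ["mongodb"], draft: true },
];

// Hypothetical helper: return every post that carries the given tag.
function findByTag(collection, tag) {
  return collection.filter((doc) => (doc.tags || []).includes(tag));
}

const nosqlPosts = findByTag(posts, "nosql");
console.log(nosqlPosts.map((doc) => doc.title));
console.log(nosqlPosts[0].comments.length);
```

Both documents live in the same collection although their fields differ, which is what "mostly schema free" means in practice.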

Network Communications

• REST/JSON
• TCP/BSON (ClientDriver)

BSON [bee · sahn], short for Binary JSON, is a binary-encoded serialization of JSON-like documents. Like JSON, BSON supports the embedding of documents and arrays within other documents and arrays. BSON also contains extensions that allow representation of data types that are not part of the JSON spec. For example, BSON has a Date type and a BinData type.
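Why a dedicated Date type matters can be seen without any driver. The sketch below uses only the built-in `JSON` object and shows that a date does not survive a JSON round trip as a date. The document and its field names are made up for the example.

```javascript
// JSON has no date type: a Date is written out as an ISO string
// and comes back as a plain string after parsing.
const original = { name: "log entry", createdAt: new Date(Date.UTC(2011, 0, 15)) };
const roundTripped = JSON.parse(JSON.stringify(original));

console.log(original.createdAt instanceof Date); // true
console.log(typeof roundTripped.createdAt);      // "string"
console.log(roundTripped.createdAt);             // 2011-01-15T00:00:00.000Z
```

A binary format with its own date type can hand the value back as a date instead of leaving the conversion to the application.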

Client Drivers (Apache License)

MongoDB currently has client support for the following programming languages:

• C
• C++
• Erlang
• Haskell
• Java
• JavaScript
• .NET (C#, F#, PowerShell, etc.)
• Perl
• PHP
• Python
• Ruby
• Scala

Collections vs. Capped Collection (Table in SQL)

• Collections
  • blog.posts
  • blog.comments
  • forum.users
  • etc.
• Capped collections (ring buffer)
  • Logging
  • Caching
  • Archiving

db.createCollection("log", {capped: true, size: <bytes>, max: <docs>});
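The ring-buffer behaviour can be sketched in a few lines of plain JavaScript. This is an in-memory stand-in, not MongoDB code: the `CappedLog` class is hypothetical and only models the document-count limit (`max`), not the byte limit (`size`).

```javascript
// In-memory stand-in for a capped collection limited by document count.
// (A real capped collection is also limited by its size in bytes.)
class CappedLog {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    // Once full, the oldest document is dropped to make room.
    if (this.docs.length > this.maxDocs) this.docs.shift();
  }

  find() {
    return [...this.docs]; // insertion order, oldest first
  }
}

const log = new CappedLog(3);
for (let i = 1; i <= 5; i++) log.insert({ msg: `event ${i}` });
console.log(log.find().map((d) => d.msg)); // [ 'event 3', 'event 4', 'event 5' ]
```

Because old entries age out on their own, no cleanup job is needed, which is why logging and caching are the typical uses.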

Indexes

• Every field in the document can be indexed
• Simple indexes:
db.cities.ensureIndex({city: 1});
• Compound indexes:
db.cities.ensureIndex({city: 1, zip: 1});
• Unique indexes:
db.cities.ensureIndex({city: 1, zip: 1}, {unique: true});
• Sort order: 1 = ascending, -1 = descending
• Note: newer MongoDB shells name this helper createIndex
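In MongoDB key specifications, 1 means ascending and -1 means descending. How a compound key spec orders documents can be sketched with an ordinary comparator. The `comparatorFor` helper below is hypothetical and works on in-memory arrays; it only mimics the ordering an index with the same key spec would define.

```javascript
// Build a comparator from an index-style key spec such as {city: 1, zip: -1}.
// 1 sorts the field ascending, -1 descending. Hypothetical helper.
function comparatorFor(keySpec) {
  const fields = Object.entries(keySpec);
  return (a, b) => {
    for (const [field, direction] of fields) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every indexed field
  };
}

const cities = [
  { city: "Bern", zip: 3011 },
  { city: "Atlanta", zip: 30301 },
  { city: "Bern", zip: 3005 },
];

const sorted = [...cities].sort(comparatorFor({ city: 1, zip: -1 }));
console.log(sorted.map((c) => `${c.city} ${c.zip}`));
// [ 'Atlanta 30301', 'Bern 3011', 'Bern 3005' ]
```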

Relations

• ObjectId
// manual reference: store the _id of the referenced car
db.users.insert(
    {name: "Umbert", car_id: ObjectId("<GUID>")});

• DBRef
// DBRef: store the collection name together with the _id
db.users.insert(
    {name: "Umbert", car: new DBRef("cars", ObjectId("<GUID>"))});
// follow the reference and read a field of the referenced car
db.users.findOne(
    {name: "Umbert"}).car.fetch().name;


Queries (1)

Queries (Regular Expressions)

{field: /regular.*expression/i}

// get all cities that start with "atl" and end with "a" (e.g. atlanta)
db.cities.count({city: /^atl.*a$/i});
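Whether a pattern is anchored changes which cities match. The sketch below runs two patterns over a small in-memory list using JavaScript's own regular expressions, whose literal syntax the shell shares. The city names are sample data chosen for the example.

```javascript
const names = ["Atlanta", "Atlantic City", "Matlacha", "Bern"];

// Unanchored: "atl" followed later by an "a", anywhere in the string.
const loose = names.filter((n) => /atl.*a/i.test(n));

// Anchored: the string must start with "atl" and end with "a".
const strict = names.filter((n) => /^atl.*a$/i.test(n));

console.log(loose);  // [ 'Atlanta', 'Atlantic City', 'Matlacha' ]
console.log(strict); // [ 'Atlanta' ]
```

Only the anchored form expresses "starts with atl and ends with a"; the unanchored form also accepts names that merely contain the pattern.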

Queries (2): LINQ https://github.com/craiggwilson/fluent-mongo

Equals:
x => x.Age == 21 will translate to {"Age": 21}
Greater Than, $gt:
x => x.Age > 18 will translate to {"Age": {$gt: 18}}
Greater Than Or Equal, $gte:
x => x.Age >= 18 will translate to {"Age": {$gte: 18}}
Less Than, $lt:
x => x.Age < 18 will translate to {"Age": {$lt: 18}}
Less Than Or Equal, $lte:
x => x.Age <= 18 will translate to {"Age": {$lte: 18}}
Not Equal, $ne:
x => x.Age != 18 will translate to {"Age": {$ne: 18}}
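What these query documents mean can be illustrated with a tiny in-memory evaluator. This is a sketch under strong assumptions: the `matches` function is hypothetical, handles only a single level of `{field: value}` or `{field: {$op: value}}`, and is not how the server or any driver evaluates queries.

```javascript
// Tiny in-memory evaluator for the comparison operators listed above.
const operators = {
  $gt: (actual, expected) => actual > expected,
  $gte: (actual, expected) => actual >= expected,
  $lt: (actual, expected) => actual < expected,
  $lte: (actual, expected) => actual <= expected,
  $ne: (actual, expected) => actual !== expected,
};

function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    if (condition !== null && typeof condition === "object") {
      // Operator form, e.g. {"Age": {$gte: 18}}: every operator must hold.
      return Object.entries(condition).every(([op, expected]) =>
        operators[op](doc[field], expected)
      );
    }
    return doc[field] === condition; // plain equality, e.g. {"Age": 21}
  });
}

const people = [
  { Name: "Ann", Age: 17 },
  { Name: "Bob", Age: 18 },
  { Name: "Cid", Age: 21 },
];

const adults = people.filter((p) => matches(p, { Age: { $gte: 18 } }));
const exactly21 = people.filter((p) => matches(p, { Age: 21 }));
console.log(adults.map((p) => p.Name));    // [ 'Bob', 'Cid' ]
console.log(exactly21.map((p) => p.Name)); // [ 'Cid' ]
```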

Atomic Operations (Optimistic Locking)

• Update if current:
  • Fetch the object.
  • Modify the object locally.
  • Send an update request that says "update the object to this new value if it still matches its old value".
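The three steps can be sketched against an in-memory map standing in for a collection. The `updateIfCurrent` helper is hypothetical. It compares documents by their JSON text, which is adequate here because the field order never changes; a real system would compare a version field or the relevant values in the query part of the update.

```javascript
// In-memory stand-in for a collection with one document.
const store = new Map([["abc", { sku: "abc", qty: 5 }]]);

// Hypothetical helper: write newDoc only if the stored document
// still equals the version the caller read earlier.
function updateIfCurrent(key, expectedOld, newDoc) {
  const current = store.get(key);
  if (JSON.stringify(current) !== JSON.stringify(expectedOld)) {
    return false; // someone else changed it: re-read and retry
  }
  store.set(key, newDoc);
  return true;
}

// 1. Fetch the object.
const fetched = store.get("abc");
// 2. Modify a local copy.
const modified = { ...fetched, qty: fetched.qty - 1 };
// 3. The conditional update succeeds because nothing changed in between.
const first = updateIfCurrent("abc", fetched, modified);
// A second writer still holding the stale copy is rejected.
const second = updateIfCurrent("abc", fetched, { ...fetched, qty: 0 });

console.log(first, second, store.get("abc").qty); // true false 4
```

No lock is held between the read and the write; a conflict is detected only at update time, which is what makes the scheme optimistic.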

Atomic Operations: Sample

> t = db.inventory
> s = t.findOne({sku: 'abc'})
{"_id" : "49df4d3c9664d32c73ea865a" , "sku" : "abc" , "qty" : 1}
> t.update({sku: "abc", qty: {$gt: 0}}, {$inc: {qty: -1}});
> db.$cmd.findOne({getlasterror: 1})
{"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1} // it has worked
> t.update({sku: "abcz", qty: {$gt: 0}}, {$inc: {qty: -1}});
> db.$cmd.findOne({getlasterror: 1})
{"err" : null , "updatedExisting" : false , "n" : 0 , "ok" : 1} // did not work: no document matched

Atomic Operations: multiple items

db.products.update(
    {cat: "boots", $atomic: 1},
    {$inc: {price: 10.0}},
    false, // no upsert
    true   // update multiple
);

Replica set (1)

• Automatic failover
• Automatic recovery of servers that were offline
• Distribution over more than one datacenter
• Automatic nomination of a new master server in case of a failure
• Up to 7 voting servers in one replica set

Replica set (2)

[Diagram: replica set with members labelled PRIMARY, DOWN, and RECOVERING]

Mongo Sharding

• Partitioning data across multiple physical servers to provide application scale-out
• Can distribute databases, collections or objects in a collection
• Choose how you partition data (shard key)
• Balancing, migrations, management all automatic
• Range based
• Can convert from single master to sharded system with 0 downtime
• Often works in conjunction with object replication (failover)
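Range-based partitioning can be sketched as a lookup table from shard key ranges to shards. The shard names, the ranges, and the `shardFor` function below are invented for illustration; in a real cluster the routing table is maintained and rebalanced automatically.

```javascript
// Each chunk owns a half-open range [min, max) of shard key values.
// Shard names and ranges are invented for illustration.
const chunks = [
  { min: "", max: "h", shard: "shard0" },
  { min: "h", max: "q", shard: "shard1" },
  { min: "q", max: "\uffff", shard: "shard2" },
];

// Route a document to the shard whose range contains its shard key value.
function shardFor(shardKeyValue) {
  const chunk = chunks.find(
    (c) => shardKeyValue >= c.min && shardKeyValue < c.max
  );
  return chunk ? chunk.shard : undefined;
}

console.log(shardFor("bern"));   // shard0
console.log(shardFor("meyer"));  // shard1
console.log(shardFor("zurich")); // shard2
```

Because neighbouring key values land on the same shard, range queries on the shard key touch few servers, while a poorly chosen key can concentrate writes on one shard.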


Sharding-Cluster

Map Reduce http://www.joelonsoftware.com/items/2006/08/01.html

• It is a two step calculation where one step is used to simplify the data, and the second step is used to summarize the data
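The two steps can be shown with JavaScript's built-in array functions. This is an in-memory sketch with made-up order data, not a MongoDB mapReduce call: the map step reduces each document to a key and a value, and the reduce step sums the values per key.

```javascript
const orders = [
  { customer: "anna", amount: 30 },
  { customer: "ben", amount: 20 },
  { customer: "anna", amount: 15 },
];

// Map step: simplify each document to a [key, value] pair.
const mapped = orders.map((o) => [o.customer, o.amount]);

// Reduce step: summarize all values that share a key.
const totals = mapped.reduce((acc, [key, value]) => {
  acc[key] = (acc[key] || 0) + value;
  return acc;
}, {});

console.log(totals); // { anna: 45, ben: 20 }
```

Since each map call looks at one document only, the map step can run on many servers in parallel before the partial results are combined.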


Map Reduce Sample

Map Reduce using LINQ https://github.com/craiggwilson/fluent-mongo/wiki/Map-Reduce

• LINQ is by far an easier way to compose map-reduce functions.

// Compose a map reduce to get the sum of everyone's ages.
var sum = collection.AsQueryable().Sum(x => x.Age);

// Compose a map reduce to get the age range of everyone grouped by the first letter of their last name.
var ageRanges = from p in collection.AsQueryable()
                group p by p.LastName[0] into g
                select new
                {
                    FirstLetter = g.Key,
                    AverageAge = g.Average(x => x.Age),
                    MinAge = g.Min(x => x.Age),
                    MaxAge = g.Max(x => x.Age)
                };

Store large Files: GridFS

• The database supports native storage of binary data within BSON objects (limited to 4 MB or 16 MB per document, depending on the MongoDB version).
• GridFS is a specification for storing large files in MongoDB
• Comparable to the Amazon S3 online storage service when used in combination with replication and sharding
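GridFS gets around the document size limit by splitting a file into chunk documents plus one metadata document. The sketch below mimics that layout in memory. The `toChunks` function is hypothetical, the chunk size is deliberately tiny, and only the field names `files_id`, `n`, `data`, `filename`, `length`, and `chunkSize` follow the GridFS convention.

```javascript
// Split a file into numbered chunks plus one metadata document,
// loosely following the GridFS layout (files document + chunk documents).
function toChunks(fileId, filename, bytes, chunkSize) {
  const chunkDocs = [];
  for (let offset = 0, n = 0; offset < bytes.length; offset += chunkSize, n++) {
    chunkDocs.push({
      files_id: fileId, // points back to the metadata document
      n,                // position of this chunk within the file
      data: bytes.subarray(offset, offset + chunkSize),
    });
  }
  const fileDoc = { _id: fileId, filename, length: bytes.length, chunkSize };
  return { fileDoc, chunkDocs };
}

// A 10-byte "file" cut into chunks of at most 4 bytes.
const bytes = Uint8Array.from({ length: 10 }, (_, i) => i);
const { fileDoc, chunkDocs } = toChunks(1, "demo.bin", bytes, 4);

console.log(fileDoc.length, chunkDocs.length);    // 10 3
console.log(chunkDocs.map((c) => c.data.length)); // [ 4, 4, 2 ]
```

Since every chunk is an ordinary document, the chunks can be replicated and sharded like any other collection.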

Performance

On MySQL, SourceForge was reaching the limits of its performance at its current user load. Using some of the easy scale-out options in MongoDB, they fully replaced MySQL and found MongoDB could handle the current user load easily. In fact, after some testing, they found their site can now handle 100 times the number of users it currently supports.

It means you can charge a lot less per user of your application and get the same revenue. Think about it.

Performance http://www.michaelckennedy.net/blog/2010/04/29/MongoDBVsSQLServer2008PerformanceShowdown.aspx

• It's the inserts where the differences are most obvious between MongoDB and SQL Server (about 30x-50x faster than SQL Server)

Administration: MongoVUE (Windows)

Administration: Monitoring

• MongoDB Monitoring Service


NOSQL USE CASES


Use Cases: Well suited

• Archiving and event logging

• Document and Content Management Systems

• E-Commerce

• Gaming. High performance small read/writes, geospatial indexes

• High volume problems

• Mobile. Specifically, the server-side infrastructure of mobile systems

• Projects using iterative/agile development methodologies

• Real-time stats/analytics

Use Cases: Less Well Suited

• Systems with a heavy emphasis on complex transactions such as banking systems and accounting (multi-object transactions)
• Traditional non-realtime data warehousing
• Problems requiring SQL


Questions?

[email protected]