Mongo 101 - Basics

2018-04-25

Who are we?

Adamo Tonete

● Senior Support Engineer
● Joined Percona in 2015
● 10+ years as a DBA
● 5+ years working with NoSQL products

Rick Golba

● Product Marketing Manager
● Joined Percona in 2014 as a Solutions Engineer
● 20+ years as a SQL trainer
● 3+ years working with NoSQL products

Agenda

● What is MongoDB?
● How different is MongoDB from MySQL?
● Common MongoDB topologies
● CRUD: data management
● Aggregations, Import/Export, and Backups
● Schema design patterns
● Replica-sets and Upgrades
● Securing your setup - Demo
● Common issues: how to detect, verify, and address them using logs, Percona Toolkit, and Percona Monitoring and Management (PMM)

Requirements

● To install the software, you should have a laptop with at least:
○ 2 cores
○ 4 GB RAM
○ 500 MB disk
○ Internet connection
○ Git client installed

We strongly suggest using Linux, but Windows machines will work as well.

Installing MongoDB

Now that we already know what MongoDB is, let's install the database

For MacOS and Linux

Go to mongodb.com/downloads

git clone https://github.com/adamobr/MongoDB3XLabs

cd MongoDB3XLabs

./run_single.sh

Installing MongoDB - Single instance

First commands:

show databases

use percona

db.collection.insert({today : new Date()})

db.collection.find()

Installing MongoDB - Replica-set

First, let's clean up the previous instance:

./reset_lab

./run_replicaset

./3.6/bin/mongo # to connect to the replica-set.

Let's run some commands to describe your environment:

rs.status()

rs.config()

db.serverStatus()

What is MongoDB?

● NoSQL
● Document-oriented database
● Built for fast delivery and development
● Easy-to-scale database
● ...and a LOT more!

How different is MongoDB from MySQL/RDBMS

● Some features we will compare

○ Normalization
○ Transactions
○ Query language
○ How data is stored
○ Special indexes
○ How to distribute and scale

● NoSQL and SQL are not enemies; they are made to complement each other
● While MongoDB is a young NoSQL database, MySQL has been on the market for more than two decades as a mature relational database system
● In some cases, using MongoDB as the main database is not the best choice
● However, MongoDB can support a fast-growing environment without too much effort

● Comparing data distribution

○ MongoDB expects data to grow beyond the limits of a single machine
○ MySQL does have a few add-ons that allow data distribution among instances, but they were created later by a 3rd-party company
○ MySQL expects to work on a single machine with full ACID guarantees, while MongoDB doesn't assume full ACID; as a distributed system, MongoDB is bound by the CAP theorem

What is ACID?

● Atomicity: single-document level, and no snapshotting for reads
● Consistency: primary = strong | secondaries = your choice
● Isolation: not really, although $isolated can help
● Durability: configurable with w:majority and/or j:true

● The CAP theorem was proposed by Eric Brewer in 2000

● A distributed system can't have the 3 guarantees at the same time. One of them must be sacrificed.

● Consistency: every client gets the same response; data is consistent among instances
● Availability: the system always responds to requests; no downtime
● Partition Tolerance: the system can handle errors (network, hardware failure)

[Figure: CAP triangle (Consistency, Availability, Partition Tolerance) positioning relational databases (MySQL, PostgreSQL), Cassandra, Riak, MongoDB, and Redis]

In a relational table, each intersection of a row and a column holds a single scalar value. A MongoDB document can also hold arrays and nested documents:

{
  "_id" : ObjectId("507f1f77bcf86cd799439011"),
  "studentID" : 100,
  "firstName" : "Jonathan",
  "middleName" : "Eli",
  "lastName" : "Tobin",
  "classes" : [
    { "courseID" : "PHY101",
      "grade" : "B",
      "courseName" : "Physics 101",
      "credits" : 3 },
    { "courseID" : "BUS101",
      "grade" : "B+",
      "courseName" : "Business 101",
      "credits" : 3 }
  ]
}

● Unlike MySQL, MongoDB doesn't have a predefined schema
● Documents can have different fields with different data types, for example

{x : 1, y : ['test']}

and

{x : 'percona', y : ISODate('2018-01-01')}

are both valid MongoDB documents

● No joins
● Rich Geo Indexing
● Schema-free

● MongoDB doesn't follow third normal form (3NF)
● Each document should carry all the information it needs

○ Linked documents are acceptable but not recommended

● High availability by default in MongoDB
● A replica set is the minimum suggested topology for production
● Shards can be used to increase read/write throughput - we will discuss that in Topology

● Machine costs
● If we want to scale MongoDB, we can simply add more machines

○ This is not always true for MySQL

How Similar is MongoDB to MySQL?

● … but these databases are not completely different

● They share

○ Security
○ Indexing
○ Multi-user access - 3.6 (session)
○ Multi-table
○ Concurrency
○ Several other database concepts

Database terms and concepts

MongoDB            MySQL
Database           Database
Collection         Table
Document           Row
Key : value pair   Field

● Indexes are the fast way to find a specific row or document; they work very similarly in both databases

● Multi-user
● Concurrent operations

MongoDB Topologies

It is possible to deploy MongoDB using

● Single instance
● Replica set
● Sharded cluster

MongoDB Topologies - Single Instance

● Commonly used for testing purposes
● Percona doesn't recommend using single instances for production

MongoDB Topologies - Replica Set

● Very common for small/medium environments
● Asynchronous replication
● Easy-to-scale reads
● Doesn't scale writes
● Can have delayed members
● Relies on the oplog

MongoDB Topologies - Sharded Cluster

● Acts very similarly to a replica set, but it is used to scale reads/writes among shards
● Data is divided into shards
● Data can migrate among shards
● If the shard key is poorly chosen, the cluster doesn't scale well
● Relies on oplogs + config servers


MongoDB Operations

● Creating and using a database
● CRUD Operations - Create, Read, Update, Delete
● Create - insert
  • Collections
  • Documents
● Read - find
  • Using operators
● Update - update vs upsert
  • Write concern considerations
● Delete - remove

Connecting to the Mongo Shell

● When connecting to a mongo instance, you connect to the test database by default
● You will likely use the --port option to connect to a specific mongod instance. In this case we are connecting via localhost.
● If connecting remotely, use the --host option to specify where your mongod is running

Creating a Database

➔ This command shows which databases exist
➔ This command creates and connects to the percona database
➔ This command shows which db you are currently connected to
➔ Why is the database we created not showing?

● The database isn’t actually created until we write data to it
● If you disconnect from the shell after using a database without writing any data, it will be gone
● Let’s write some data so it stays

Inserting Documents

● db.collection_name.insert() is the basic syntax
● In this case, we inserted a simple key : value pair
● We used JavaScript syntax to create a random number between 0 and 1 with the Math.random() function
● To see our insert, we use the find() function
- Because this collection only has 1 document inserted, we only see one result

● It is important to note that we did not explicitly create a collection with the previous command. The collection was created when we inserted a document into it.
● We can see the collections in the current database using the show collections command
● The system.indexes collection contains information about indexes in the database (percona)

Reading

● Before we explore find commands in the mongo shell, let’s insert some test data into our new collection
● MongoDB allows us to write a short script to generate test data directly in the shell
● This script generates 25 random numbers between 0 and 1 and inserts them into our new_collection collection
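The test-data script described above might look roughly like the sketch below. It is plain JavaScript so that it runs in Node.js without a server; the makeTestDocs helper and the docs array are illustrative names that do not come from the slides. In the mongo shell, each generated document would be passed to db.new_collection.insert() instead of being collected in an array.

```javascript
// Build `count` documents, each holding one random number in [0, 1).
// makeTestDocs is a hypothetical helper name used only for this sketch.
function makeTestDocs(count) {
  const docs = [];
  for (let i = 0; i < count; i++) {
    docs.push({ random_number: Math.random() });
  }
  return docs;
}

const docs = makeTestDocs(25);
console.log(docs.length + " documents generated");
// In the mongo shell the loop body would instead be:
//   db.new_collection.insert({ random_number: Math.random() })
```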

● Our results show 26 documents in our new_collection
● The mongo shell shows only the first 20 results by default
- You can iterate through more by typing "it"
● The _id field is auto-generated if not specified. It is unique, like a primary key in SQL.

● This is a search specifically by _id
● An index on _id is created by default when the collection is created

Reading with Operators

● There are many operators to use when querying data in MongoDB, but we will focus on these

● How would we find numbers greater than 0.9 in the collection that we filled with random numbers?

A. db.new_collection.find( "random_number" : $gt : .9 )
B. db.new_collection.find( { "random_number" : { $gt : .9 } } )
C. db.new_collection.find( { "random_number" > .9 } )
D. db.new_collection.find( { "random_number" : { $gt : .9 } )

● B is the correct answer: the query is a document, and the $gt operator goes in a sub-document under the field name. A has no braces, C uses > instead of an operator, and D is missing a closing brace.
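To make the shape of the query document in answer B more concrete, here is a minimal sketch of how such a filter could be evaluated against in-memory documents. It is plain JavaScript for Node.js, not MongoDB's query engine; the matches function, the comparators table, and the sample data are assumptions made for illustration, and only a handful of comparison operators are handled.

```javascript
// Minimal matcher for filters shaped like { field: { $op: value } }.
// Only $gt, $gte, $lt and $lte are handled; this is not MongoDB's query engine.
const comparators = {
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
};

function matches(doc, filter) {
  return Object.entries(filter).every(([field, conditions]) =>
    Object.entries(conditions).every(([op, value]) =>
      comparators[op](doc[field], value)
    )
  );
}

// Hypothetical sample documents.
const sample = [
  { random_number: 0.95 },
  { random_number: 0.42 },
  { random_number: 0.91 },
];

// Same filter document as answer B.
const found = sample.filter((doc) => matches(doc, { random_number: { $gt: 0.9 } }));
console.log(found);
```

The filter is ordinary data (a field name mapped to an operator document), which is why the braces in answer B matter.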

● Here’s what it looks like in practice
● Now that we know how to find documents, let’s update some

Updating

● The general syntax for an update is:
  • db.collection_name.update( query , update , options )
● By default, update() modifies only a single document. Setting the multi option to true modifies all documents that match the query.
● Let’s set random_number to 1 in every document where it is greater than .9
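The effect of the multi option can be sketched in plain JavaScript. The updateGreaterThan helper below is hypothetical and only mimics the single-versus-multi behaviour on in-memory arrays; in the mongo shell the corresponding call would have the form db.new_collection.update({random_number : {$gt : .9}}, {$set : {random_number : 1}}, {multi : true}).

```javascript
// Set fields on documents whose `field` is greater than `threshold`.
// updateGreaterThan is a hypothetical helper; it mimics only the `multi` behaviour.
function updateGreaterThan(docs, field, threshold, setFields, options = {}) {
  let modified = 0;
  for (const doc of docs) {
    if (doc[field] > threshold) {
      Object.assign(doc, setFields);
      modified++;
      if (!options.multi) break; // default: stop after the first match
    }
  }
  return modified;
}

const single = [{ random_number: 0.95 }, { random_number: 0.3 }, { random_number: 0.99 }];
const many = [{ random_number: 0.95 }, { random_number: 0.3 }, { random_number: 0.99 }];

const changedSingle = updateGreaterThan(single, "random_number", 0.9, { random_number: 1 });
const changedMany = updateGreaterThan(many, "random_number", 0.9, { random_number: 1 }, { multi: true });
console.log(changedSingle, changedMany);
```

Without multi, the third document keeps its original value even though it matches the query.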

Updating - Write Concern

● Write concerns can be customized per operation. If none is specified, the default behavior is used.
● w:0 = Fire and forget, no acknowledgement
● w:1 = Acknowledgement from primary only
● w: "majority" = Acknowledgement from a majority of the data-bearing nodes

[Figure: an application writing to a primary that replicates to two secondaries]
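As a rough illustration of the three write concern levels, the sketch below computes how many acknowledgements a write would wait for. The requiredAcks function is a hypothetical helper, and it treats "majority" as a simple majority of the member count passed in; the real calculation depends on the replica-set configuration, so take this as a simplification.

```javascript
// How many acknowledgements a write waits for, given a write concern `w`
// and the number of data-bearing members (simplified assumption).
function requiredAcks(w, members) {
  if (w === "majority") return Math.floor(members / 2) + 1;
  if (Number.isInteger(w) && w >= 0) return w;
  throw new Error("unsupported write concern: " + w);
}

console.log(requiredAcks(0, 3));          // 0
console.log(requiredAcks(1, 3));          // 1
console.log(requiredAcks("majority", 3)); // 2
console.log(requiredAcks("majority", 5)); // 3
```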

Deleting

● Remove all documents in a collection: db.collection.remove({})
● For a partial removal, you must specify your criteria as a query document

What is the correct syntax for deleting all records in our collection with a value less than .2?

A. db.new_collection.remove( { "random_number" : { $lt : .2 } } )
B. db.remove( { "random_number" : { $lt : .2 } } )
C. db.new_collection.remove( { $lt : .2 } )
D. db.new_collection.remove( { "random_number" < .2 } )

A is the correct answer: B omits the collection name, C omits the field name, and D uses < instead of the $lt operator.

● Dropping an entire collection can also be done with the drop command
● Databases can also be dropped with the dropDatabase command

Aggregation, Export/Import and Backups

MongoDB Aggregations

MongoDB provides aggregations to help us write complex queries, since some calculations and groupings cannot be done with standard querying

So what is the aggregation framework?

It is similar to OLAP and is mostly used for analytics

The aggregation framework works as a pipeline, and the most common sequence is

Match > Project > Sort/Group > Final

Each stage takes the output of the previous stage as its input

The output can be a cursor or a collection

An aggregation example:

db.new_collection.aggregate([{$match : {random_number : {$gte : 0.5} }}, {$project : {_id : '$random_number'}}])

This returns the values that are $gte 0.5
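The pipeline idea in the example above can be imitated in plain JavaScript, which may help show how each stage feeds the next. This is only a sketch: the input array, matchStage, and projectStage are made-up names, and the two functions reproduce just the $match and $project stages of the example rather than the aggregation framework itself.

```javascript
// Hypothetical input documents.
const input = [
  { _id: 1, random_number: 0.25 },
  { _id: 2, random_number: 0.5 },
  { _id: 3, random_number: 0.75 },
];

// Stage 1, like {$match : {random_number : {$gte : 0.5}}}
const matchStage = (docs) => docs.filter((d) => d.random_number >= 0.5);

// Stage 2, like {$project : {_id : '$random_number'}}
const projectStage = (docs) => docs.map((d) => ({ _id: d.random_number }));

// Each stage consumes the output of the previous one.
const pipeline = [matchStage, projectStage];
const result = pipeline.reduce((docs, stage) => stage(docs), input);
console.log(result);
```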

MongoDB import and export

Like other databases, MongoDB ships with tools to export and import data: mongoexport and mongoimport

It is possible to export a single collection, with or without a filter

With the right flags (for example --type=csv together with a field list), it is possible to generate a CSV file to load into a different database

MongoDB backup and restore

There are a couple of methods to back up and restore a MongoDB instance

● mongodump, the most common method to save a copy of the data
● Disk snapshot
● Hot backup

MongoDB mongodump

./mongodump -d percona -c test -o perconabackup

2018-04-22T20:26:59.665-0300 writing percona.test to
2018-04-22T20:26:59.666-0300 done dumping percona.test (1 document)

MongoDB mongorestore

MacPro13:bin adamo$ ./mongorestore -d perconabackup perconabackup/percona/

2018-04-22T20:28:12.109-0300 the --db and --collection args should only be used when restoring from a BSON file. Other uses are deprecated and will not exist in the future; use --nsInclude instead
2018-04-22T20:28:12.110-0300 building a list of collections to restore from perconabackup/percona dir
2018-04-22T20:28:12.111-0300 reading metadata for perconabackup.test from perconabackup/percona/test.metadata.json
2018-04-22T20:28:12.168-0300 restoring perconabackup.test from perconabackup/percona/test.bson
2018-04-22T20:28:12.261-0300 no indexes to restore
2018-04-22T20:28:12.261-0300 finished restoring perconabackup.test (1 document)
2018-04-22T20:28:12.261-0300 done

Schema design

Although MongoDB is a schema-free database, there are good practices that should be followed

● Use descriptive collection names
● Avoid "joins"
● Keep documents simple
● Keep field names short (this matters most with MMAPv1, since field names are stored in every document)

● One to some (< 16MB)

{
  _id : ObjectId('8123ad324723ds9fd83453'),
  text : 'this is a really simple blog post',
  comments : [
    {_id : 1, comment : "I really liked your post"},
    {_id : 2, comment : "too simple!"}
  ]
}

● One to Thousands (maybe > 16MB)

{
  _id : ObjectId('8123ad324723ds9fd83453'),
  brand : 'Lemon',
  parts : [
    ObjectId(), ObjectId(), ObjectId(), ObjectId(), ObjectId()... ObjectId() …
  ]
}
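The choice between the two patterns above (embedding the related items versus storing only their ids) can be sketched as a size check against the 16MB document limit. This is an illustration under loud assumptions: embedOrReference and approxBytes are hypothetical helpers, and the JSON string length is used as a stand-in for the real BSON size, which will differ.

```javascript
const MAX_DOC_BYTES = 16 * 1024 * 1024; // the 16MB document limit

// Rough size estimate: JSON length stands in for BSON size (an approximation).
function approxBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Embed children while the parent stays under the limit; otherwise keep only their ids.
function embedOrReference(parent, field, children) {
  const embedded = { ...parent, [field]: children };
  if (approxBytes(embedded) < MAX_DOC_BYTES) return embedded;
  return { ...parent, [field]: children.map((c) => c._id) };
}

// "One to some": two small comments are embedded in the post.
const post = embedOrReference(
  { _id: "post1", text: "this is a really simple blog post" },
  "comments",
  [{ _id: 1, comment: "I really liked your post" }, { _id: 2, comment: "too simple!" }]
);
console.log(post.comments.length, approxBytes(post));
```

In practice the decision usually also depends on how the data is read and updated, not only on size.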

Think about denormalization: how to transform a couple of tables into a single document


Replica-sets and Upgrades

● Upgrades can be done without downtime
● Drivers help; it doesn't depend only on the instances
● New versions can usually be a member of an old replica-set

In order to upgrade a replica-set, we will take advantage of high availability:

● Remove a secondary or set the instance as hidden
● Drivers will then see this configuration
● Repeat the process on the remaining secondaries
● Step the primary down and replace the remaining old instance
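The order of operations described above (secondaries first, primary last) can be written down as a small planning function. This is only a sketch in plain JavaScript: upgradePlan, the member objects, and their host and state fields are illustrative assumptions, and the function only produces a list of steps; it does not talk to a replica set.

```javascript
// Order the members of a replica set for a rolling upgrade:
// every secondary first, the primary last (after stepping it down).
function upgradePlan(members) {
  const secondaries = members.filter((m) => m.state === "SECONDARY");
  const primaries = members.filter((m) => m.state === "PRIMARY");
  const steps = secondaries.map((m) => "upgrade secondary " + m.host);
  for (const p of primaries) {
    steps.push("step down primary " + p.host);
    steps.push("upgrade former primary " + p.host);
  }
  return steps;
}

// Hypothetical three-member replica set.
const plan = upgradePlan([
  { host: "localhost:27017", state: "PRIMARY" },
  { host: "localhost:27018", state: "SECONDARY" },
  { host: "localhost:27019", state: "SECONDARY" },
]);
plan.forEach((step, i) => console.log(i + 1 + ". " + step));
```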

Let's talk about security

MongoDB Operations

● Default roles

● Enabling authentication and creating a root and a standard user

● Starting a replica-set environment with authentication

Default Roles

● All the roles listed below come by default in the MongoDB database server
https://docs.mongodb.com/manual/reference/built-in-roles/

read readWrite dbAdmin dbOwner userAdmin

clusterAdmin clusterManager clusterMonitor hostManager backup

restore readAnyDatabase readWriteAnyDatabase userAdminAnyDatabase

dbAdminAnyDatabase root __system

Enabling authentication

● Creating a root user and restarting the mongod process

mongo

use admin

> db.createUser({user : 'administrator', pwd : '123321', roles : ['root']})

Successfully added user: { "user" : "administrator", "roles" : [ "root" ] }

-- mongod.conf --

#security
security:
  authorization: enabled

-- service restart ---

./mongod --auth

Enabling authentication

● Checking access

use admin

db.auth('administrator','123321')

1

mongo -u administrator -p --authenticationDatabase admin

password:

> show dbs

local

percona

Creating a standard user

$ mongo

db.createUser({user: 'percona_user', pwd: '123', roles : [{ role :'read', db: 'percona'}]})

Successfully added user: {

"user" : "percona_user",

"roles" : [

{

"role" : "read",

"db" : "percona"

}

]

}

Starting a replica-set with keyfile

● Pre-existing data instance with users in the admin database

[Figure: an empty secondary presents the shared key to the primary ("I know your secret..."); the primary accepts it ("Ok, I can trust you.") and the secondary syncs the data as the internal __system user, while a client connects to the primary]

Demo

./reset_lab.sh

./get_mongod_downloads.sh

cd 3.6/bin

mkdir data1 data2 data3

./mongod --dbpath data1 --logpath data1/log.log --auth --fork --replSet myrs

./mongod --dbpath data2 --logpath data2/log.log --auth --fork --replSet myrs --port 27018

./mongod --dbpath data3 --logpath data3/log.log --auth --fork --replSet myrs --port 27019

./mongo

rs.initiate()

use admin

db.createUser({user : 'admin',pwd : '123', roles : ["root"]})

use admin

db.auth('admin','123')

rs.add({_id : 2, host : 'localhost:27018', hidden : true, priority : 0, votes : 0})

Demo

./reset_lab.sh

./get_mongod_downloads.sh

cd 3.6/bin

mkdir data1 data2 data3

openssl rand -base64 756 > mykey.key && chmod 600 mykey.key

./mongod --dbpath data1 --logpath data1/log.log --auth --fork --replSet myrs --keyFile mykey.key

./mongod --dbpath data2 --logpath data2/log.log --auth --fork --replSet myrs --port 27018 --keyFile mykey.key

./mongod --dbpath data3 --logpath data3/log.log --auth --fork --replSet myrs --port 27019 --keyFile mykey.key

./mongo

rs.initiate()

use admin

db.createUser({user : 'admin',pwd : '123', roles : ["root"]})

use admin

db.auth('admin','123')

rs.add({_id : 2, host : 'localhost:27018'})

Common issues

As with any other database, we need to work proactively to keep the environment running as expected

The most commonly used commands to investigate database health are

db.serverStatus()

db.currentOp()

rs.status()

rs.printSlaveReplicationInfo()


Thank You!
