MongoDB
First Light
MongoDB Basics
• Mongo is a document-based NoSQL database.
  – A document is just a JSON object.
  – A collection is just a (large) set of documents.
  – A database is just a set of collections.
• JSON = JavaScript Object Notation
  – Actually BSON = binary-encoded JSON
• The Mongo shell is a JavaScript interpreter!
  – (And I have never coded JavaScript, ahem...)
CS@AU Henrik B Christensen 2
An In-between NoSQL
• The ends of the spectrum
  – Key-value stores
    • Know the key to access an opaque blob of anything
    • Fire-and-forget (write-and-forget)
  – RDB
    • Elaborate ad-hoc queries over highly structured data (schema)
    • Normalized, meaning ‘lots’ of tables
    • Transactions
• MongoDB sits somewhere in the middle
  – Documents have elaborate (OO) structure (but not fixed!)
  – Rather powerful query language (no joins though)
  – From fire-and-forget to ‘acknowledge write on all replicas’
JSON
• Get used to key/value pairs!
• { course: "SAiP", semester: "E12", teacher: "hbc" }
• Basically close to fields of OO languages
– The architectural mismatch between programming language and DB concepts is lessened!
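As a small sketch of the key/value idea, the course document above can be written as a plain JavaScript object and turned into JSON text. The `topics` field is a hypothetical addition, used only to show that values can be nested.

```javascript
// The course document from the slide, as a plain JavaScript object.
const course = { course: "SAiP", semester: "E12", teacher: "hbc" };

// Fields are read much like the fields of an object in an OO language.
const teacher = course.teacher;

// Hypothetical extra field (not in the slides): values may be arrays
// or nested objects, not just strings.
const richer = { ...course, topics: ["NoSQL", "MongoDB"] };

// JSON is the textual form; MongoDB stores a binary encoding (BSON).
const asText = JSON.stringify(richer);
const roundTrip = JSON.parse(asText);

console.log(teacher);
console.log(asText);
```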
Basic commands…
• MongoDB creates objects and collections on the fly…
No schema enforced...
Schema: Pro and Con
• A schema can provide a lot of data safety
  – Validating data, avoiding hard-to-find bugs in clients, ...
• However, schemas are also costly to migrate
• MongoDB is pretty handy in agile and early development when the ‘schema’ changes often...
find()
• You can formulate simple queries using ‘find()’ on a collection. Of course, the parameter of find is
  – A JSON object!
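The idea that a query is itself a document can be sketched in plain JavaScript without a server. The `find` function below is an in-memory stand-in, not the Mongo shell API (where the call has the form `db.<collection>.find({ teacher: "hbc" })`), and the documents other than the SAiP one are made up.

```javascript
// In-memory stand-in for a collection. Only the first document comes from
// the slides; the other two are hypothetical.
const courses = [
  { course: "SAiP", semester: "E12", teacher: "hbc" },
  { course: "CourseB", semester: "F13", teacher: "hbc" },
  { course: "CourseC", semester: "E12", teacher: "xyz" },
];

// Returns the documents whose fields equal every key/value pair in `query`.
// An empty query object matches all documents.
function find(collection, query) {
  return collection.filter((doc) =>
    Object.entries(query).every(([key, value]) => doc[key] === value)
  );
}

const byTeacher = find(courses, { teacher: "hbc" });
const byBoth = find(courses, { teacher: "hbc", semester: "E12" });
const all = find(courses, {});

console.log(byTeacher.map((d) => d.course));
console.log(byBoth.map((d) => d.course));
```

Several fields in one query object act as an implicit "and", which is why `byBoth` is narrower than `byTeacher`.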
More complex queries
• Regular expressions, and, or...
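A rough in-memory sketch of how such queries are evaluated: a regular expression as a field value matches string fields, and `$or` / `$and` take a list of sub-queries. The `matches` helper and the sample documents are hypothetical; only the query shapes mirror what the Mongo shell accepts.

```javascript
// Hypothetical sample documents (only SAiP appears in the slides).
const courses = [
  { course: "SAiP", semester: "E12", teacher: "hbc" },
  { course: "CourseB", semester: "F13", teacher: "hbc" },
  { course: "CourseC", semester: "E12", teacher: "xyz" },
];

// True if `doc` satisfies `query`. Supports exact values, RegExp values,
// and the $or / $and operators with a list of sub-queries.
function matches(doc, query) {
  return Object.entries(query).every(([key, value]) => {
    if (key === "$or") return value.some((sub) => matches(doc, sub));
    if (key === "$and") return value.every((sub) => matches(doc, sub));
    if (value instanceof RegExp) {
      return typeof doc[key] === "string" && value.test(doc[key]);
    }
    return doc[key] === value;
  });
}

// Course name starts with "SA".
const startsWithSA = courses.filter((d) => matches(d, { course: /^SA/ }));

// Taught by xyz OR given in semester F13.
const xyzOrF13 = courses.filter((d) =>
  matches(d, { $or: [{ teacher: "xyz" }, { semester: "F13" }] })
);

// Implicit "and": taught by hbc AND name starts with "Course".
const hbcCourse = courses.filter((d) =>
  matches(d, { teacher: "hbc", course: /^Course/ })
);

console.log(startsWithSA.length, xyzOrF13.length, hbcCourse.length);
```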
Hey – what about updates?
• Update
  – 1st argument: the document to find
  – 2nd argument: the values to add/set/update
Mongo 3 has updated the API a bit!
Adding more structure
• Now, after I go home you decide to give my talk grades.
  – No new tables, schema, etc.
  – We just add more structure, similar to OO
• Ahh – one late grade arrives – just $push it
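The two-argument update and the `$push` idea can be sketched in memory as follows. This mimics the shell form `db.<collection>.update(<query>, <update>)` for the `$set` and `$push` operators only; the `update` helper, the `talks` collection, its fields, and the grade values are all hypothetical.

```javascript
// Hypothetical collection holding one talk document.
const talks = [{ title: "MongoDB First Light", speaker: "hbc" }];

// Applies `change` to the first document matching `selector`.
// Returns the number of documents modified (0 or 1).
function update(collection, selector, change) {
  const doc = collection.find((d) =>
    Object.entries(selector).every(([key, value]) => d[key] === value)
  );
  if (!doc) return 0;

  // $set: create or overwrite fields.
  for (const [field, value] of Object.entries(change.$set ?? {})) {
    doc[field] = value;
  }
  // $push: append to an array field, creating the array if it is missing.
  for (const [field, value] of Object.entries(change.$push ?? {})) {
    if (doc[field] === undefined) doc[field] = [];
    if (!Array.isArray(doc[field])) {
      throw new Error(`cannot $push to non-array field ${field}`);
    }
    doc[field].push(value);
  }
  return 1;
}

// Add more structure: a grades array that did not exist before.
update(talks, { speaker: "hbc" }, { $set: { grades: [7, 10] } });

// One late grade arrives.
update(talks, { speaker: "hbc" }, { $push: { grades: 12 } });

const missed = update(talks, { speaker: "nobody" }, { $set: { grades: [] } });

console.log(JSON.stringify(talks[0]));
```

No schema change was needed to introduce `grades`; the document simply grew a new field.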
Or - using SkyCave
RoomRecord like stuff
pretty() is pretty nice
RegExps
Sorting on fields
Bounded result: ‘limit’
Wall exercise?
Adding msg
Players
Now…
• How do we compose the ‘getShortRoomDesc()’?
• SELECT r.desc FROM room r, player p WHERE p.name = 'Mikkel' AND p.pos = r.pos
• ???
The NoSQL answer
• The NoSQL answer: Manual references!
  – It is a client-side responsibility to join
• Find p.pos using query 1; next find r.desc using query 2
  – (§4.4.2 in MongoDB manual 3.0.6)
• Exercise
  – Why is this the right answer in a NoSQL world?
    • Hint: Think 10,000 clients, think CPU cycles – where?
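A minimal sketch of such a client-side join for ‘getShortRoomDesc()’, using in-memory arrays in place of the two collections. The field names `name`, `pos`, and `desc` follow the SQL on the previous slide; the `findOne` helper, the position strings, and the descriptions are hypothetical.

```javascript
// In-memory stand-ins for the player and room collections (values made up).
const players = [
  { name: "Mikkel", pos: "(0,1,0)" },
  { name: "Mathilde", pos: "(0,0,0)" },
];
const rooms = [
  { pos: "(0,0,0)", desc: "You are standing at the end of a road." },
  { pos: "(0,1,0)", desc: "You are in open forest." },
];

// Returns the first document matching every key/value pair, or null.
function findOne(collection, query) {
  const hit = collection.find((doc) =>
    Object.entries(query).every(([key, value]) => doc[key] === value)
  );
  return hit ?? null;
}

// The "join" is done by the client: two simple lookups in sequence.
function getShortRoomDesc(playerName) {
  const player = findOne(players, { name: playerName }); // query 1: p.pos
  if (player === null) return null;
  const room = findOne(rooms, { pos: player.pos }); // query 2: r.desc
  return room === null ? null : room.desc;
}

console.log(getShortRoomDesc("Mikkel"));
```

The work of combining the two results runs on each client, so the database only ever serves simple lookups.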
Alternatives
• Solution 2:
  – Denormalize / Embedded documents
    • But not always possible for complex data structures
    • But may actually slow queries down depending on search patterns
  – Searching inside documents is more tedious
• Solution 3:
  – DBRefs
    • Special MongoDB feature to make it even more SQL-like
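A rough sketch of Solution 2, assuming the room is embedded in each player document. All field names and values are hypothetical. It also shows why searching inside documents takes a path rather than a plain field name, and what a DBRef-shaped value looks like.

```javascript
// Hypothetical denormalized player documents: the room is embedded.
const players = [
  { name: "Mikkel", room: { pos: "(0,1,0)", desc: "You are in open forest." } },
  { name: "Mathilde", room: { pos: "(0,0,0)", desc: "You are standing at the end of a road." } },
];

// Reads a nested field given a dotted path such as "room.pos".
// Returns undefined if any step along the path is missing.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// One lookup answers the question; no second query is needed.
const mikkel = players.find((p) => p.name === "Mikkel");
const desc = mikkel.room.desc;

// Searching on an embedded field needs a path into the document.
const inForest = players.filter((p) => getPath(p, "room.pos") === "(0,1,0)");

// Shape of a DBRef (Solution 3): a small document naming a collection and
// an id. The collection name and id value here are hypothetical.
const roomRef = { $ref: "room", $id: "(0,1,0)" };

console.log(desc, inForest.length, roomRef.$ref);
```

The cost of embedding is duplication: if two players stand in the same room, its description is stored twice and must be updated in both places.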
MongoDB modeling
Comparing Documents to Tables
Entry on social network site: Schema
As RDB Schema
• The RDB version
Discussion
• Thus Mongo has less need for joins because the data model is richer
  – Arrays of complex objects
  – Sub-objects
• Avoids the RDB idioms for modeling OneToMany relations
• ManyToMany handled by manual references
  – Two ‘find()’ instead of one ‘SELECT’
• And
  – Replaces many random reads with fewer sequential ones
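To make the OneToMany point concrete, here is a small sketch contrasting one document holding an array of sub-objects with the same data flattened into two relational-style row sets. The entry, its fields, and the `toRows` helper are hypothetical and not taken from the schema slides.

```javascript
// Hypothetical social-network entry with its comments embedded as an array.
const entry = {
  author: "hbc",
  text: "First light with MongoDB",
  comments: [
    { by: "anna", text: "Nice" },
    { by: "bo", text: "Agreed" },
  ],
};

// Document model: one read returns the entry together with its comments.
const commentCount = entry.comments.length;

// Relational model: the same data as rows in two tables, linked by a
// foreign key (entry_id) that a query must join on.
function toRows(doc, entryId) {
  const entryRow = { id: entryId, author: doc.author, text: doc.text };
  const commentRows = doc.comments.map((comment, index) => ({
    id: index + 1,
    entry_id: entryId,
    by: comment.by,
    text: comment.text,
  }));
  return { entryRow, commentRows };
}

const { entryRow, commentRows } = toRows(entry, 1);

console.log(commentCount, entryRow, commentRows);
```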
Going Large
Durability, Scaling, Replication and Sharding
Durability
• RDBs guarantee Durability
  – Once a data update is acknowledged, data is stored
• MongoDB is configurable (write concern)
  – Unacknowledged: fire-and-forget
  – Acknowledged: the server acknowledges the write operation
  – Journaled: at least one node has stored the data in its journal
  – Replica acknowledged: at least N replicas have received the write operation
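As a sketch, the levels above correspond to write concern documents built from the `w` and `j` options. In the Mongo shell such a document can be passed along with a write, for example as `{ writeConcern: { w: 1 } }`. The `writeConcerns` table and its key names below are only an illustrative mapping, not part of any API.

```javascript
// Illustrative mapping from the levels on the slide to write concern
// documents. The key names on the left are made up for this sketch.
const writeConcerns = {
  // Fire-and-forget: do not wait for any acknowledgement.
  unacknowledged: { w: 0 },
  // Wait until the primary has acknowledged the write.
  acknowledged: { w: 1 },
  // Additionally wait until the write has reached the on-disk journal.
  journaled: { w: 1, j: true },
  // Wait until n members of the replica set have the write.
  replicaAcknowledged: (n) => ({ w: n }),
  // Wait for a majority of the replica set.
  majority: { w: "majority" },
};

// Hypothetical options object as it could accompany a write operation.
const options = { writeConcern: writeConcerns.replicaAcknowledged(2) };

console.log(JSON.stringify(options));
```

The stricter the write concern, the longer each write waits, so the choice trades latency against durability.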
Scaling out
• To get more power/space – just add more...
Replication
• Replica sets
  – Primary (handles writes/reads)
  – N secondaries (only reads)
  – Eventual consistency!
• Failover is automatic
  – Secondary votes
  – New primary selected
• Experience: Easy!
Sharding
• Key goals
  – No change in the client side API!
    • When our EcoSense data grows out of its boxes we do not have to change our client programs!
  – Auto sharding
    • You configure your shard key as ranges on your document keys
  – Shard balancing
    • Migrates data automatically if one shard grows too large
• Experience: Nope
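Range-based sharding can be sketched as a routing table from key ranges to shards. This is a toy model and not how MongoDB is configured: the shard names, the range boundaries, and the `sensorId` shard key are hypothetical.

```javascript
// Hypothetical shard layout: each shard owns a half-open key range [min, max).
const shards = [
  { name: "shard0", min: -Infinity, max: 1000 },
  { name: "shard1", min: 1000, max: 2000 },
  { name: "shard2", min: 2000, max: Infinity },
];

// Storage per shard, keyed by shard name.
const storage = { shard0: [], shard1: [], shard2: [] };

// Finds the shard whose range contains the given shard key value.
function shardFor(key) {
  const shard = shards.find((s) => key >= s.min && key < s.max);
  if (!shard) throw new Error(`no shard owns key ${key}`);
  return shard.name;
}

// The client-facing call stays the same no matter how many shards exist;
// routing happens behind it.
function insert(doc) {
  storage[shardFor(doc.sensorId)].push(doc);
}

insert({ sensorId: 5, value: 21.5 });
insert({ sensorId: 1000, value: 19.0 });
insert({ sensorId: 2500, value: 22.1 });

console.log(Object.entries(storage).map(([name, docs]) => [name, docs.length]));
```

Rebalancing would amount to changing the range boundaries and moving the affected documents, while `insert` keeps its signature.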