MongoDB and Node.js

Mongo and Node

I was going for a super butch shot of Mongolian warriors but she is just so adorable!

Node is non-blocking

Instead of waiting for a long action to return, you provide a callback that executes when the action is finished

You can send several instructions to Mongo without waiting on the result

https://github.com/bignomanatee/workstack

Node is a runtime built in the style of the Mongol warrior. When Mongol warriors went through a village chopping off heads, they didn't wait for the heads to hit the ground before moving on to the next one. Similarly in Node, you can simply pass along a callback to be executed at the end of a long action, and it's more than likely that the long action is repository related. Even though Mongo itself is relatively non-blocking, there are many times when there is no real reason to wait around for a specific action to execute before getting on with your life.

Sometimes, though, you might want aggregate blocking. For example, I want to do a series of inserts, possibly to more than one collection. I don't care which order they execute in or which one ends first, but I do want to know when ALL of them are done.

I wrote a special library for aggregate action tracking. It is really just a decoration of the onTimeout with a tracker that increments when you add actions and decrements as each action finishes. When the tracking index is zero (or less?), a callback is fired off.
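The counting idea described above can be sketched in a few lines. This is a minimal illustration, not the workstack API: `createTracker`, `add`, and `pending` are hypothetical names, and the `setTimeout` calls stand in for Mongo inserts that finish in an unknown order.

```javascript
// Hypothetical helper (not the workstack API): counts pending actions and
// fires onDone once every registered action has reported completion.
function createTracker(onDone) {
  let pending = 0;
  let fired = false;
  return {
    // Register one action. Returns the callback that action calls when it finishes.
    add() {
      pending += 1;
      let called = false;
      return function done() {
        if (called || fired) return; // ignore duplicate completions
        called = true;
        pending -= 1;
        if (pending <= 0) {
          fired = true;
          onDone();
        }
      };
    },
    // Number of actions still outstanding.
    pending() {
      return pending;
    },
  };
}

// Usage: register every action first, then let them finish in any order.
const tracker = createTracker(() => console.log('all inserts are done'));
const doneUsers = tracker.add();
const doneLogs = tracker.add();
setTimeout(doneLogs, 10);  // stand-in for an insert into one collection
setTimeout(doneUsers, 20); // stand-in for an insert into another collection
```

One design caveat in this sketch: all actions should be registered before any of them can complete, otherwise the counter can reach zero early and fire the callback too soon.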

Christkof Native

Gives you complete low level access to all MongoDB commands

That means, of course, Map/Reduce

Comes in two flavors: pure JavaScript and native C-library driven

github.com/christkv/node-mongodb-native

Christkof Native is the low-level gateway that almost all Node users rely on to access Mongo data. It is also the gateway that Mongoose uses. Anything you can do in Mongo you can do in Native, including commands, map/reduce, etc. Native provides C-based bindings to the Mongo libraries; however, it also provides pure-JavaScript access to Mongo that does not rely on the C libraries. Why would you not want to use the C bindings? If for some reason the C libraries don't work in a given environment, you are still guaranteed to have a way to get at your database.

Sometimes it is nice in testing to be able to log activity through Native

Going Native

Basic CRUD in Mongo Native is NOT THAT HARD

http://wonderlandlabs.com/wll_drupal/node/mongo/coll.html

And there are benefits, Map/Reduce among them, to using the native libraries directly.

If you understand how Node works (and that alone is a hump), using Native is not that tough. I wrote a model wrapper for Native that gives you get, put, find, etc. Note that neither Native nor my custom library is Active Record in the least: each operates on raw JavaScript (JSON) objects and doesn't do any checking against schema, content, or whatever. This means the code that calls it is responsible for ensuring that the data you send to the database is good, consistent with your app, etc. My custom hack lacks a lot of the features that make Mongoose awesome (chiefly, a native schema), but that also means I have full access to Native, which is great if you care about things like Map/Reduce.

Mongoose: Schemas and More

http://mongoosejs.com/

Mongo by its nature is schemaless

Mongoose gives you the ability (that is, FORCES you) to create schemas for your collections

Mongo is effectively transactionless.

Mongoose uses inference to enforce transactions. (I think.)

Node, like Native, is non-blocking.

Mongoose blocks when blocking is a good thing.

Mongoose is all about the schema. It adds to Mongo what people most miss when they start using Mongo: a defined, type-aware schema that allows (by which I mean forces) you to submit data in a specific format. While you can put un-schema'd data in a field by defining the field's schema as an empty object {}, at least in the root of the document every field must be defined in the schema or it is not submitted to the database. This is a fine way to operate, and it gives you yet another way to tell the snarky SQL people to go fuck themselves, but it is definitely an editorial decision by the Mongoose people.

You also get implicit transactions: when you attempt to write, you get blocked if fields that you haven't changed have been changed by someone else since you read the record. Once again, blow me, SQL, you got nothing we don't.

The one area where Mongoose violates the non-blocking rule is the one area in which it should. Mongoose blocks until it connects to your database, preventing you from issuing commands to a dead connection.
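To make the "undefined root fields are not submitted" behavior concrete, here is a plain JavaScript sketch of the idea. It is not Mongoose's API: the schema format, `userSchema`, and `applySchema` are hypothetical and only imitate the filtering and simple type casting described above.

```javascript
// Hypothetical schema format (not Mongoose's): field name -> constructor used to cast the value.
const userSchema = { name: String, age: Number, meta: Object };

// Keep only the root-level fields named in the schema, casting simple types.
function applySchema(schema, doc) {
  const out = {};
  for (const [field, type] of Object.entries(schema)) {
    if (!(field in doc)) continue; // field absent from the document: nothing to copy
    // Object fields pass through untouched, like a free-form {} field.
    out[field] = type === Object ? doc[field] : type(doc[field]);
  }
  return out;
}

const saved = applySchema(userSchema, {
  name: 'Temujin',
  age: '44',                  // cast from string to number
  meta: { horde: 'golden' },  // free-form, kept as is
  nickname: 'Khan',           // not in the schema, so it is dropped
});
console.log(saved);
```

Here `nickname` never reaches the output because the schema does not name it, while `meta` accepts whatever shape it is given.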

Mongolian Deadbeef

https://github.com/marcello3d/node-mongolian

While I haven't played with Mongolian Deadbeef, it looks interesting and is worth noting, as it is a very fast gateway and has some pretty impressive metrics in the repo. It is another illustration of how easy it is to work with and/or around Mongo Native in Node. It is interesting to me because the author very closely maps the Mongo shell syntax to Node, where Native and especially Mongoose (and my own code) take the modelling approach and attempt to force Mongo to express itself like a classical or ActiveRecord-based ORM.

Noogling the Node.js channel

I used Mongo to index and digest the Node.js channel logs.

Noogle indexes every word of every line of months of conversations on Node.js

This means you can do AND searches across the words property of the collection and get a listing of lines of conversation relevant to said keywords.
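An AND search over an array field can be expressed with MongoDB's `$all` operator. The sketch below assumes each line document stores its words as a lowercased array in the `words` property; the helper names `buildAndQuery` and `matchesAll` are hypothetical, and `matchesAll` only mimics the filter in memory so the idea can be tried without a database.

```javascript
// Build a MongoDB filter matching documents whose `words` array contains every keyword.
// Assumption: words were lowercased when the lines were indexed.
function buildAndQuery(keywords) {
  const terms = keywords.map((k) => k.toLowerCase().trim()).filter(Boolean);
  return { words: { $all: terms } };
}

// In-memory equivalent of that filter, for illustration only.
function matchesAll(line, keywords) {
  const terms = buildAndQuery(keywords).words.$all;
  return terms.length > 0 && terms.every((t) => line.words.includes(t));
}

const lines = [
  { text: 'pass a callback to mongo', words: ['pass', 'a', 'callback', 'to', 'mongo'] },
  { text: 'callback hell', words: ['callback', 'hell'] },
];
console.log(JSON.stringify(buildAndQuery(['Callback', 'mongo'])));
console.log(lines.filter((l) => matchesAll(l, ['Callback', 'mongo'])).map((l) => l.text));
```

With a driver collection in hand, the same filter object would be passed to the collection's `find` call.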

Noogle is actually a web spider optimized to poll the chat logs of the Node.js channel. The Noogle front page uses a map-reduced database of word frequencies. This is based on EVERY WORD of every conversation in the last year on the channel, with stopwords removed. Honestly, the best search engine for this sort of application is probably still Solr, but the fact that you can take this kind of a survey across the conversations in Mongo is pretty bitchin. The full map/reduce cycle to produce this report takes anywhere from several minutes to half an hour. I am working on a way to make that process incremental so that going forward I don't have to re-poll the entire history to produce this report.
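A word-frequency map/reduce of this kind might look roughly like the sketch below. The map and reduce functions follow the shape MongoDB expects (map runs with `this` bound to a document and calls `emit`; reduce sums the values for a key). Everything else is an assumption for illustration: the `words` field, the function names, and the local `emit` and `runLocally` stand-ins that let the pair run in memory instead of on the server. Stopwords are assumed to have been removed at index time.

```javascript
// Local stand-in for the emit() that MongoDB provides inside a map function.
const buckets = new Map();
function emit(key, value) {
  if (!buckets.has(key)) buckets.set(key, []);
  buckets.get(key).push(value);
}

// Map: called once per line document, with `this` bound to that document.
function mapWords() {
  this.words.forEach(function (word) {
    emit(word, 1);
  });
}

// Reduce: sums the counts for one word. Summing stays correct when the
// function is applied again to its own partial results.
function reduceCounts(key, values) {
  let total = 0;
  for (let i = 0; i < values.length; i += 1) total += values[i];
  return total;
}

// Hypothetical in-memory runner that imitates the map/reduce cycle.
function runLocally(docs) {
  buckets.clear();
  docs.forEach((doc) => mapWords.call(doc));
  const frequencies = {};
  for (const [word, counts] of buckets) frequencies[word] = reduceCounts(word, counts);
  return frequencies;
}

console.log(runLocally([
  { words: ['node', 'callback'] },
  { words: ['callback', 'mongo'] },
]));
```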

Here are the results for a search on "callback". The full list is 3,319 items long. (Pagination is to come.)

Here is a single conversation. There isn't any intelligent polling to remove the elements of the conversation that don't relate to the search terms; in practice I just didn't see the payoff in this particular application when there are other areas, including personalization, that I want to develop. I grab around 300 lines of conversation around the line clicked on from the previous screen and display them here.
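Grabbing a window of lines around the clicked one can be expressed as a simple range filter. This sketch assumes each line document carries a running line number in a field called `seq`; that field name and the `contextQuery` helper are hypothetical.

```javascript
// Build a range filter for roughly `total` lines centered on the clicked line.
// Assumption: `seq` is a running line number stored on every line document.
function contextQuery(clickedSeq, total = 300) {
  const half = Math.floor(total / 2);
  return {
    seq: {
      $gte: Math.max(0, clickedSeq - half), // clamp at the start of the log
      $lte: clickedSeq + half,
    },
  };
}

console.log(JSON.stringify(contextQuery(1000)));
```

Sorting the matching lines by `seq` would then restore conversation order for display.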

Thanks

Do everything in your power to make MySQL users feel stupid and out of touch with modern software trends. "So, have you sharded your database yet?" "Why not?" "And how many replicas are you operating from?"