Happiness is MongoDB. Monday, October 3rd, 2011. Daniel Doubrovkine (dB.) [email protected] @dblockdotorg http://code.dblock.org 902 Broadway, 4th Fl., New York, NY

Using MongoDB for the Art Genome Project (Mongo Boston 2011)

DESCRIPTION

Using MongoDB for the Art Genome Project, presented 10/3/2011 at Mongo Boston.

Page 1: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

Happiness is MongoDB

Monday, October 3rd, 2011

Daniel Doubrovkine (dB.)

[email protected]

@dblockdotorg

http://code.dblock.org

902 Broadway, 4th Fl.

New York, NY

Page 2: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» Claude Monet 

» Mark Grotjahn

Demo

Page 3: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

Kasimir Malevich – “Self Portrait”

- vs. -

William Beckman – “Self Portrait”

Art Genome Project

Page 4: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

= 42

Portrait Contemporary Realist Conceptual

100 100 0 75

Portrait Contemporary Realist Conceptual

100 50 70 20

Euclidean Distance
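
The distance named on this slide can be sketched in a few lines of Ruby. This is a minimal illustration, not the project's actual code: the function name `euclidean` is borrowed from the search slide, and the two vectors are the gene scores shown above (Portrait, Contemporary, Realist, Conceptual). The slide's "= 42" is left as-is; the sketch makes no claim about how that figure was obtained.

```ruby
# Euclidean distance between two equal-length gene score vectors.
def euclidean(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  # Sum the squared per-gene differences, then take the square root.
  Math.sqrt(a.zip(b).sum { |x, y| (x - y)**2 })
end

first  = [100, 100, 0, 75]   # scores from the first row on the slide
second = [100, 50, 70, 20]   # scores from the second row on the slide

distance = euclidean(first, second) # about 102.1 for these raw scores
```

Identical genomes give a distance of 0, and the value grows as the per-gene scores diverge.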

Page 5: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

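def similar(a1)
  # Pair every artwork with its distance to a1 (map, not each, so the pairs are kept),
  # then sort by distance and keep the 10 nearest.
  artworks.map { |a2|
    [a2, euclidean(a1, a2)]
  }.sort_by { |a, d|
    d
  }.take(10)
end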

Fast Search in Ruby

Page 6: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

MySQL Prototype Schema

Page 7: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

Need a sorted sparse vector on boot.

[ 100, 0, 20, … 60 ]

10K artworks: 5 minutes to startup

5 minutes to accomplish … nothing.

MySQL Prototype Schema

Page 8: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» Genome.genes – it’s a hash!

{ “Portrait” => 100, …, “Conceptual” => 20 }

» Genome, Embedded in Artwork

MongoDB “Schema”
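
To make the "it's a hash" point concrete, here is a minimal plain-Ruby sketch of an artwork document with its genome embedded, plus a helper that turns the genes hash into an ordered vector for distance calculations. The document layout, the `GENE_NAMES` ordering, the `gene_vector` helper, and the choice to score missing genes as 0 are all assumptions made for illustration; the slides do not show the real model.

```ruby
# Hypothetical artwork document: the genome is embedded in the artwork,
# and its genes are stored as a plain hash of name => score.
artwork = {
  "title"  => "Self Portrait",
  "genome" => { "genes" => { "Portrait" => 100, "Conceptual" => 20 } }
}

# Hypothetical fixed gene ordering, so every artwork maps to a comparable vector.
GENE_NAMES = ["Portrait", "Contemporary", "Realist", "Conceptual"].freeze

# Build an ordered score vector; genes absent from the hash count as 0
# (an assumption of this sketch).
def gene_vector(artwork, gene_names = GENE_NAMES)
  genes = artwork.fetch("genome").fetch("genes")
  gene_names.map { |name| genes.fetch(name, 0) }
end

vector = gene_vector(artwork) # => [100, 0, 0, 20]
```

Because the genes travel with the artwork document, loading one artwork is enough to rebuild its vector, with no join step.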

Page 9: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» Something new? Got (far too) many years of experience with *SQL / DW

» @harryh uses it @ 4sq

» @eliothorowitz looks pretty smart

» db.startups.find({ location : { $near : GA }, category : 'nosql db vendor' }).first = 10gen

» install … ? … profit

» available on Heroku from MongoHQ

» continuous deployment friendly

Choosing MongoDB

Page 10: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» MongoDB retrieval by ID is fast, maybe even faster than a Ruby Hash

» Using Rails + Rake and Mongo is safer than mongo shell db.collection.update({x: y})

» Shared Hosting is not Rubber, You Can’t Stretch It

» Map/reduce for live queries really doesn’t work, no really; see mongoid_fulltext

» Read-secondary + Map/Reduce can be fun

read_secondary: <%= $rails_rake_task.nil? or !$rails_rake_task %>

» Collection names are limited in length if you use mongodump

https://jira.mongodb.org/browse/SERVER-2973, fixed in 2.0.0

» copyDatabase requires administrative privileges

https://jira.mongodb.org/browse/SERVER-2846

Using MongoDB
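
The `read_secondary` line on this slide is an ERB expression, and its effect is easy to check in isolation. The sketch below renders that exact expression with Ruby's standard `erb` library. It assumes `$rails_rake_task` is truthy only inside Rake tasks, which is what the expression implies; the `render_read_secondary` helper and the `TEMPLATE` constant are hypothetical names used only here.

```ruby
require "erb"

# The ERB fragment from the slide, as it might appear in a YAML config file.
TEMPLATE = "read_secondary: <%= $rails_rake_task.nil? or !$rails_rake_task %>"

# Render the fragment for a given value of the global flag.
def render_read_secondary(in_rake_task)
  $rails_rake_task = in_rake_task
  ERB.new(TEMPLATE).result
end

web_setting  = render_read_secondary(nil)  # "read_secondary: true"
rake_setting = render_read_secondary(true) # "read_secondary: false"
```

In other words, ordinary processes read from secondaries, while Rake tasks turn that off, presumably so that jobs such as Map/Reduce read from the primary.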

Page 11: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» Mongo cursors aren’t snapshotted by default

Processing 5183 of 4012 …

http://www.mongodb.org/display/DOCS/How+to+do+Snapshotted+Queries+in+the+Mongo+Database

» Mongo Interest is growing, RoR + Mongoid = GTD

http://code.dblock.org/ror-win-getting-things-done-with-mongodb-mongoid

» Mongoid Keeps Things Entertaining, Living on the Edge

Using MongoDB (continued …)
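
The "Processing 5183 of 4012" symptom comes from a cursor returning the same document more than once while documents are being modified during iteration. Apart from the snapshotted queries described at the link above, one client-side guard is to skip any `_id` already seen. The sketch below is a swapped-in workaround, not MongoDB's snapshot feature: `each_once` is a hypothetical helper that works on any enumerable of hash-like documents, and it keeps every seen `_id` in memory.

```ruby
require "set"

# Yield each document at most once, keyed by its "_id".
# Works on any Enumerable of hash-like documents; no driver call is assumed.
def each_once(docs)
  seen = Set.new
  docs.each do |doc|
    # Set#add? returns nil when the id is already present, so repeats are skipped.
    next unless seen.add?(doc["_id"])

    yield doc
  end
end

# Simulated cursor output in which document 1 shows up twice.
cursor_output = [{ "_id" => 1 }, { "_id" => 2 }, { "_id" => 1 }]

processed = []
each_once(cursor_output) { |doc| processed << doc["_id"] }
```

With this guard the processed count can never exceed the number of distinct documents, at the cost of holding the id set for the duration of the run.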

Page 12: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

» MongoHQ Extensions via Heroku

» Production Directly w/MongoHQ

» A Few Hundred Bucks / mo.

» Mongo 1.8.1 w/ replica sets, 2 DBs and 1 arbiter

» Different Availability Zones

» Dedicated RAM, separate EBS, shared CPU

» Early Issues, Now Very Stable

» Jason McCay + other folks @ MongoHQ = Awesome

» Mongoid 2.0.2

» mongoid_slug, mongoid_fulltext, mongoid_history, delayed_job_mongoid

Deploying MongoDB

Page 13: Using MongoDB for the Art Genome Project (Mongo Boston 2011)

name: Daniel Doubrovkine (aka. dB.)

company: http://art.sy ^ work here

twitter: @dblockdotorg

blog: http://code.dblock.org ^ link to slides here

email: [email protected]

Thank you.