1
Spring 2012
Open source, high performance database
How to get started with your MongoDB Pilot Project
Jared Rosoff (@forjared)
2
Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production
3
NEW ARCHITECTURES
• Systems scaling horizontally, not vertically
• Commodity servers
• Cloud Computing
VOLUME AND TYPE OF DATA
• Trillions of records
• Tens of millions of queries per second
• Volume of data
• Semi-structured and unstructured data
AGILE DEVELOPMENT
• Iterative & continuous
• New and emerging Apps
4
DEVELOPER PRODUCTIVITY DECREASES
• Needed to add new software layers of ORM, Caching, Sharding, and Message Queue
• Polymorphic, semi-structured and unstructured data not well supported
COST OF DATABASE INCREASES
• Increased database licensing cost
• Vertical, not horizontal, scaling
• High cost of SAN
[Figure: project timeline (PROJECT START, LAUNCH, +30 DAYS, +90 DAYS, +6 MONTHS, +1 YEAR) annotated with the workarounds DENORMALIZE DATA MODEL, STOP USING JOINS, CUSTOM CACHING LAYER, CUSTOM SHARDING. Caption: INCREASES COMPLEXITY, LOWERING PRODUCTIVITY; axis label: COSTS]
5
• Document-oriented Storage
– Based on JSON Documents
– Schema-less
• Scalable Architecture
– Auto-sharding
– Replication & high availability
• Open source, written in C++
• Key Features Include:
– Full featured indexes
– Query language
– Map/Reduce & aggregation
6
Google Searches [chart]
451 Research: “MongoDB increasing its dominance”
#2 on Indeed’s Fastest Growing Jobs
Jaspersoft BigData Index: “Demand for MongoDB, the document-oriented NoSQL database, saw the biggest spike with over 200% growth in 2011.”
7
Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production
8
• User Data Management
• High Volume Data Feeds
• Content Management
• Operational Intelligence
• Product Data Management
9
Characteristic | Challenges | MongoDB Solution
High throughput | Lots of reads; lots of writes | Sharding + Replication
Data variability | Variable fields in objects; object fields change over time; hard to model in relational | Document Data Model
High availability | Automatic failover; multi-data center deployments | Replica Sets + Tagging
Low latency | Fast response time; working set larger than RAM | Memory Mapped Storage
Large volumes of data | Spread data over lots of disks; tolerance of partial failures | Sharding + Replication
10
• Look for non-customer facing use cases
– Log aggregation
– Counters & statistics
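Counters and statistics suit a pilot because each event maps to one in-place increment on one document. The following is a minimal Python sketch, not the presenter's code. The field names (`page`, `day`, `hits`) and both helper functions are hypothetical. `apply_inc` only imitates the effect of an upsert with MongoDB's `$inc` operator on an in-memory dict, so the example runs without a server.

```python
def build_counter_update(page, day, amount=1):
    """Return (filter, update) dicts in the shape a MongoDB upsert would take."""
    query = {"page": page, "day": day}
    update = {"$inc": {"hits": amount}}
    return query, update


def apply_inc(store, query, update):
    """In-memory stand-in for an upsert with $inc (illustration only)."""
    key = tuple(sorted(query.items()))
    # Upsert behaviour: create the document from the filter if it is missing.
    doc = store.setdefault(key, dict(query))
    for field, amount in update["$inc"].items():
        doc[field] = doc.get(field, 0) + amount
    return doc


store = {}
for _ in range(3):
    q, u = build_counter_update("/home", "2012-05-01")
    doc = apply_inc(store, q, u)

print(doc["hits"])  # three increments were applied to one counter document
```

With a real driver, the same two dicts would be handed to the driver's update call with upsert enabled, for example `update_one(query, update, upsert=True)` in current PyMongo.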
11
{
  _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "roger",
  date : "Sat Jul 24 2010 19:47:11",
  text : "Spirited Away",
  tags : [ "Tezuka", "Manga" ],
  comments : [
    { author : "Fred",
      date : "Sat Jul 24 2010 20:51:03",
      text : "Best Movie Ever" },
    { author : "Bill",
      date : "Sat Jul 24 2010 21:13:23",
      text : "No Way !!" }
  ]
}
12
• Can I express them as atomic operations?
• Do they make sense with my data model?
• Do I need strong consistency?
Blog Platform use cases
– Blogger: Publish a Blog Post, Moderate Comments
– Reader: Read a Blog Post, Submit a comment
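As one way to test the "atomic operations" question against the blog use cases, "Submit a comment" can be expressed as a single update that appends to the post's embedded `comments` array. MongoDB applies an update to a single document atomically. The Python sketch below is illustrative only: the helper names are hypothetical, the `_id` is a plain string rather than an ObjectId, and `apply_push` merely imitates the `$push` operator in memory so the example runs without a server.

```python
import copy

post = {
    "_id": "4c4ba5c0672c685e5e8aabf3",  # plain string here; MongoDB would store an ObjectId
    "author": "roger",
    "text": "Spirited Away",
    "comments": [],
}


def build_comment_update(post_id, author, text, date):
    """Return (filter, update) dicts that append one comment to one post."""
    query = {"_id": post_id}
    update = {"$push": {"comments": {"author": author, "date": date, "text": text}}}
    return query, update


def apply_push(doc, query, update):
    """In-memory stand-in for $push: returns a new document, or the original if the filter misses."""
    if any(doc.get(k) != v for k, v in query.items()):
        return doc
    new_doc = copy.deepcopy(doc)
    for field, item in update["$push"].items():
        new_doc.setdefault(field, []).append(item)
    return new_doc


query, update = build_comment_update(
    post["_id"], "Fred", "Best Movie Ever", "Sat Jul 24 2010 20:51:03"
)
updated = apply_push(post, query, update)
print(len(updated["comments"]))
```

Because the comment lives inside the post document, no second collection or transaction is involved. A use case that had to change several documents together would need a different answer to the same question.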
13
• Can I quantify my requirements?
• Can I benchmark my solution?
• Do I have anything to compare it to?
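One way to make these questions concrete is to state a latency target and measure against it. The Python sketch below is a hedged illustration, not a benchmarking tool from the talk. `fake_insert` is a placeholder workload and `target_p95_ms` is a hypothetical requirement; in a real pilot the callable would wrap the actual database operation.

```python
import statistics
import time


def benchmark(operation, runs=1000):
    """Call `operation` `runs` times and return latency statistics in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "runs": runs,
        "median_ms": statistics.median(samples),
        "p95_ms": samples[int(0.95 * (runs - 1))],  # simple index-based 95th percentile
        "max_ms": samples[-1],
    }


def fake_insert():
    # Placeholder workload standing in for a real database call.
    {"author": "roger", "tags": ["a", "b"]}.copy()


result = benchmark(fake_insert, runs=200)
target_p95_ms = 50.0  # hypothetical requirement, chosen only for illustration
meets_target = result["p95_ms"] <= target_p95_ms
print(result, meets_target)
```

Running the same harness against the existing system gives the comparison point the last question asks for.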
14
Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production
15
May 2012
May 1-2 - Data Innovations Analyst Briefing - Atlanta, GA
May 1 - Cloud Foundry Open - London, UK
May 3 - Big Panel on Big Data - Atlanta, GA
May 3 - MongoSF Workshops - San Francisco, CA
May 4 - MongoSF - San Francisco, CA
May 7-10 - DISA Federal Event - Tampa, FL
May 8 - Insight Partners Technology Forum - New York, NY
May 9 - Progressive NoSQL - London, UK
May 9 - Emerging Business Tech - Boston, MA
May 10 - Webinar : MongoDB's New Aggregation Framework
May 14 - MongoDB Oslo (Free Evening Meetup) - Oslo, Norway
May 15-16 - flatMap Oslo - Oslo, Norway
May 15 - Carahsoft Webinar: Building your first MongoDB Application
May 15 - VLAB NoSQL Panel - Palo Alto, CA
May 15 - MongoDB Pittsburgh (Free Evening Meetup) - Pittsburgh, PA
May 16 - Grails Meetup - London, UK
May 16 - Open Analytics Meetup - New York, NY
May 16 - London Java User Group - London, UK
May 17 - Webinar: MongoDB for Content Management
May 18 - Walkabout NYC - New York, NY
May 18-19 - PHP Day - London, UK
May 19-20 - JSConf.ar - Buenos Aires, Argentina
May 22 - MongoNYC Workshops - New York, NY
May 23 - MongoNYC - New York, NY
May 24 - Glue Conference - Denver, CO *Max Keynoting
May 24 - Webinar: Building Web Services with MongoDB, Node.JS, and Openshift
May 24-25 - GOTO Conference - Amsterdam, NL
May 25-26 - FLOSS Conf - London, UK
May 29 - NoSQL Matters - London, UK
May 31 - Seedhack - London, UK
June 2012
June 1-2 - Euruko 2012 - Amsterdam, NL (Pending talk acceptance)
June 3-6 - International PHP Conference - Berlin, DE
June 4-5 - Berlin Buzzwords - Berlin, DE
June 4 - Berlin MUG - Berlin, DE
June 4 - Django Con EU (Community Member Attending) - Italy
June 6-8 - NDC - Oslo, Norway
June 6 - Prague MUG - Prague, CZE
June 7-8 - Dutch PHP Conference - Amsterdam, NL
June 7 - PyCon Asia Pacific - Singapore
June 8-9 - PyGotham - New York, NY *Eliot Keynoting
June 8-10 - South East Linux Fest - Charlotte, NC
June 9-10 - PHP Conference - Moscow, RUS
June 12 - Dataversity Webinar - Topic TBD
June 13 - MongoDB Paris Workshops - Paris, FR
June 13 - Rightscale Conference - New York, NY
June 13-14 - Hadoop Summit - San Jose, CA
June 14 - MongoDB Paris - Paris, FR
June 14-15 - WindyCityDB - Chicago, IL
June 18-20 - QCon - New York, NY
June 19 - MongoDB UK Workshops - London, UK
June 20 - MongoDB UK - London, UK
June 20-21 - Gigaom Structure - San Francisco, CA
June 21 - Webinar: MongoDB + Hadoop: Taming the Elephant in the Room
June 23 - GoRuCo - New York, NY (Crowdtap Speaking)
June 23 - TestFest - Amsterdam, NL
June 25 - MongoDC Workshops - Washington, DC
June 26 - MongoDC - Washington, DC
June 26 - MongoDB at Big Data - Houston, TX
June 26 - Red Hat Developer Day - Boston, MA
June 26-29 - Open Source Bridge - Portland, OR
June 27 - Jazoon - Zurich, CH
June 27 - SVforum Software Architecture & Platform SIG - Mountain View, CA
June 29-30 - Lone Star PHP Conference - Dallas, TX
July 2012
July 1 - SPA Conference (London)
July 2 - PyCon Italia (Italy)
July 3 - MongoDB Essentials Training (London)
July 10 - Dataversity Webinar (Topic TBD)
July 11 - MongoDB Essentials Training (China)
July 11 - Online Conference
July 12 - Carahsoft Webinar
July 13 - MongoDB Sao Paulo (Brazil)
July 14 - Gotham.js (NYC)
July 16 - MongoDB Essentials Training (Japan)
July 16 - OSCON (Portland, OR)
July 17 - MongoDB Essentials Training (Palo Alto)
July 19 - C# Webinar
July 24 - NYC MUG
July 24 - SF MUG
July 25 - 578 Broadway Startup Tour (NYC)
July 25 - MongoDB Essentials Training (Sydney, AUS)
July 25 - MongoDB San Diego (CA)
July 30 - MongoDB Essentials Training (Melbourne, AUS)
July 31 - MongoDB Essentials Training (NYC)
TBA Last Week of Month - MongoDB Israel
16
17
New York | Wednesdays | 4pm-6:30pm | 578 Broadway
San Francisco | Every other Thursday | 5pm-7pm | Epicenter Café, 764 Harrison St
Palo Alto | Thursdays | 4pm-6pm | 555 University Ave
Atlanta | 2nd Tuesday of the month | 4pm-6pm | 1736 Defoor Pl NW
18
19
Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production
20
21
22
23
24
Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production
25
26
RAM
Hard Disk
27
28
29
Commercial Support
TRAINING for developers and administrators
CONSULTING expertise on a project basis
SUBSCRIPTIONS developer and production support, commercial license and MongoDB Subscriber Edition
“MediaMath is growing fast and our data volume throughput requirements are going up very quickly. MongoDB and 10gen have been extremely helpful partners for us in scaling our data infrastructure.”
Vince Li
30
MongoDB Monitoring Service
• SaaS solution providing instrumentation and visibility into MongoDB systems
• Included in the 10gen commercial subscriptions
• Deployed to most customers
• Free version released
• 6,500+ customers using service
“After adding MMS to our cluster, 10gen’s engineers detected an anomaly in our production deployment and proactively reached out to us to fix the problem before it became a production incident.”
Ray Howell, Vice President of Architecture
31
Agenda
1. Why use MongoDB?
2. Finding a first project
3. Getting good at MongoDB
4. Making the business case
5. Into Production
32