CP344 – Databases

Open Notes Chapter 10: Document-Based Databases

Tech News!

● Medical error third leading cause of death
● Human embryos can be grown in lab for longer than 14 days

Hackers' Tip of the Day: Create indexes in MySQL

CREATE INDEX color_index ON Jeep(color);

Table/Indexing Updates?
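
The indexing tip and the question about updates can be tried out in a small Python sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL (an assumption, so the example runs without a server). The `Jeep` columns other than `color`, the sample rows, and the helper `red_models` are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Jeep (id INTEGER PRIMARY KEY, model TEXT, color TEXT)")
conn.executemany(
    "INSERT INTO Jeep (model, color) VALUES (?, ?)",
    [("Wrangler", "red"), ("Cherokee", "blue"), ("Renegade", "red")],
)

# Same statement as in the tip.
conn.execute("CREATE INDEX color_index ON Jeep(color)")

# Rows added after the index exists are indexed automatically,
# at the cost of a little extra work on every insert or update.
conn.execute("INSERT INTO Jeep (model, color) VALUES (?, ?)", ("Gladiator", "red"))


def red_models(connection):
    """Return the models of all red Jeeps, sorted by name (hypothetical helper)."""
    rows = connection.execute(
        "SELECT model FROM Jeep WHERE color = ? ORDER BY model", ("red",)
    )
    return [model for (model,) in rows]


# Ask SQLite how it plans to run the lookup; the plan text names any index it uses.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT model FROM Jeep WHERE color = ?", ("red",)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
```

Printing `plan_text` shows whether the lookup goes through `color_index`; MySQL offers a comparable `EXPLAIN` statement.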

Main Weakness of Key/Value Stores

● Data only has a manually defined structure
● Redis stores data structures (sets, lists, etc.)
● These data structures cannot easily be queried
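
A minimal sketch of this weakness, assuming a plain Python dict in place of a real key/value store. The `name|password|role` layout, the sample users, and both helper functions are hypothetical. The point is that the store sees only opaque values, so any search by attribute has to be written by hand.

```python
# A plain dict stands in for a key/value store such as Redis (assumption).
# The store sees each value as an opaque string; only our code knows the layout.
store = {
    "user:1": "Benny|Pancake|admin",
    "user:2": "Ada|Lovelace|guest",
    "user:3": "Grace|Hopper|admin",
}


def decode_user(raw):
    """Apply the manually defined structure: name|password|role."""
    name, password, role = raw.split("|")
    return {"name": name, "password": password, "role": role}


def keys_with_role(kv, role):
    """Find users by role. The store cannot do this itself, so scan every key."""
    return sorted(
        key for key, raw in kv.items() if decode_user(raw)["role"] == role
    )
```

Looking up `store["user:1"]` is fast, but `keys_with_role(store, "admin")` has to decode every value in the store.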

Document Store

Key   Value
id1   doc1
id2   doc2
id3   doc3
id4   doc4
id5   doc5
id6   doc6
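
The key/value layout above can be sketched with a dict of dicts, assuming an in-memory stand-in rather than a real document database. The sample documents and the helpers `get_doc` and `find_ids` are invented for illustration.

```python
# A dict of dicts stands in for a document store (assumption): keys are ids,
# values are self-describing documents rather than opaque strings.
documents = {
    "id1": {"dogName": "Mr. Paws", "breed": "Golden-pointer"},
    "id2": {"dogName": "Rex", "breed": "Beagle", "favPastime": "Digging"},
    "id3": {"dogName": "Biscuit", "breed": "Beagle"},
}


def get_doc(doc_id):
    """Key lookup, exactly as in a key/value store."""
    return documents.get(doc_id)


def find_ids(field, value):
    """Return the ids of documents whose field equals value."""
    return sorted(
        doc_id for doc_id, doc in documents.items() if doc.get(field) == value
    )
```

Because each value carries its own field names, one generic `find_ids` works for any attribute.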

What's a document?

XML Document

● Human and machine readable.
● Built-in schema.
● User-defined tags.
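
As an illustration, here is a small XML document parsed with Python's standard-library ElementTree. The `jeep` record, its tags, and its values are hypothetical; they only show what user-defined tags look like.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; every tag name here is chosen by the document's author.
xml_text = """
<jeep id="id1">
  <model>Wrangler</model>
  <color>red</color>
  <year>2015</year>
</jeep>
""".strip()

root = ET.fromstring(xml_text)

# Each child element becomes one field; XML text values arrive as strings.
record = {child.tag: child.text for child in root}
```

Here `root.tag` is the outer tag name, `root.get("id")` reads the attribute, and `record` holds the three child fields.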

JSON Document

● Human and machine readable.
● Flexible schema.
● Subset of JavaScript.
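
A short sketch of what "flexible schema" means in practice, using Python's standard json module. The two sample documents and the `describe` helper are invented for illustration. The documents sit in the same list but do not share the same fields.

```python
import json

# Two documents with different fields (hypothetical sample data).
json_text = """
[
  {"dogName": "Mr. Paws", "breed": "Golden-pointer", "favPastime": "Barking"},
  {"dogName": "Rex", "tricks": ["sit", "roll over"], "age": 4}
]
"""

dogs = json.loads(json_text)


def describe(dog):
    """Summarize a document, tolerating fields that are missing."""
    name = dog["dogName"]
    breed = dog.get("breed", "unknown breed")
    tricks = dog.get("tricks", [])
    return f"{name} ({breed}), knows {len(tricks)} trick(s)"


summaries = [describe(dog) for dog in dogs]
```

Nothing in the JSON itself declares which fields are required, so the reading code supplies defaults with `dict.get`.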

YAML Document

● Human and machine readable.
● Easy to read with whitespace.
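
For comparison, a flat YAML document can be read with a few lines of Python. This is only a sketch: `parse_flat_yaml` is a hypothetical helper that handles plain `key: value` lines and nothing else, and the sample record is invented. Real code would normally use a full parser such as PyYAML's `yaml.safe_load`.

```python
# Hypothetical record written as a flat YAML mapping: no braces, no quotes.
yaml_text = """\
dogName: Mr. Paws
breed: Golden-pointer
favBed: Paw Palace
favPastime: Barking
"""


def parse_flat_yaml(text):
    """Parse only flat `key: value` lines. This is not a real YAML parser."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        result[key.strip()] = value.strip()
    return result


dog = parse_flat_yaml(yaml_text)
```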

RESTful APIs (Representational State Transfer)

Normal HTTP GET request:

http://www.dog.com/search?q="dog"

JSON response:

{
  "dogName": "Mr. Paws",
  "breed": "Golden-pointer",
  "favBed": "Paw Palace",
  "favPastime": "Barking"
}
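
The request and response can be sketched in Python without touching the network. This assumes the search term is sent without the surrounding quotation marks; `build_search_url` is a hypothetical helper, and the response body is a canned copy of the JSON shown in the slide rather than something fetched from a server.

```python
import json
from urllib.parse import urlencode


def build_search_url(base, term):
    """Build a GET URL with the search term as the q parameter."""
    return base + "?" + urlencode({"q": term})


url = build_search_url("http://www.dog.com/search", "dog")

# Canned response body, copied from the slide instead of fetched over HTTP.
response_body = (
    '{"dogName": "Mr. Paws", "breed": "Golden-pointer", '
    '"favBed": "Paw Palace", "favPastime": "Barking"}'
)
dog = json.loads(response_body)
```

`urlencode` also escapes characters that are not safe in a URL, so a term such as `golden pointer` is sent as `golden+pointer`.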

Exercise: Access Freebase

Example request:

https://www.googleapis.com/freebase/v1/search?query=dog

Python pseudocode:

import json
import urllib.request

Download text from a query
Parse text using json library
Print out result number 3
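
One possible shape for the three steps, written for Python 3. It is a sketch under stated assumptions: the response is assumed to be a JSON object with a `"result"` list, and the `opener` parameter is a hypothetical hook that lets the download step be swapped out for testing. The function names are invented.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.googleapis.com/freebase/v1/search"


def fetch_text(query, opener=urlopen):
    """Step 1: download the response text for a query."""
    url = BASE_URL + "?" + urlencode({"query": query})
    with opener(url) as response:
        return response.read().decode("utf-8")


def nth_result(text, n):
    """Steps 2 and 3: parse the text and pick result number n (1-based).

    Assumes the parsed response has a "result" list (assumption).
    """
    data = json.loads(text)
    return data["result"][n - 1]
```

With a reachable endpoint, `print(nth_result(fetch_text("dog"), 3))` would carry out the exercise. Result number 3 is at list index 2.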

BSON Documents

● Binary JSON

● Values are stored in binary instead of plain text
● Save space
● Faster to read
● Not human-readable
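
The trade-off can be seen with a single number. This sketch uses Python's standard struct module to show the general idea of binary versus text encoding; it does not build a complete BSON document. The sample value is arbitrary. BSON does store 32-bit integers as 4 little-endian bytes, which is what the `"<i"` format produces.

```python
import struct

value = 1234567890

# Plain text: one byte per decimal digit.
as_text = str(value).encode("ascii")

# Binary: a fixed 4-byte little-endian 32-bit integer.
as_binary = struct.pack("<i", value)

# Reading the binary form back needs no digit-by-digit parsing.
decoded = struct.unpack("<i", as_binary)[0]
```

Printing `as_text` shows the readable digits, while `as_binary` prints as raw bytes that mean little to a human reader.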

Document Stores

● Pros
  ● Documents have built-in schema
  ● Fast key/value lookup
  ● Easy to split up across machines

● Cons
  ● Code must keep track of schema of each doc.
  ● No overall database structure
  ● Some queries are hard to write
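
Splitting across machines is easy because each document is found by its key alone. A minimal sketch of one common scheme, hashing the key to pick a shard, follows. It is not a description of how any particular database does it, and `shard_for`, `distribute`, and `lookup` are hypothetical names.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard number using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def distribute(docs, num_shards):
    """Place each document on the shard chosen by its key."""
    shards = [dict() for _ in range(num_shards)]
    for key, doc in docs.items():
        shards[shard_for(key, num_shards)][key] = doc
    return shards


def lookup(shards, key):
    """Find a document by asking only the one shard that can hold its key."""
    return shards[shard_for(key, len(shards))].get(key)
```

A key lookup touches a single shard. A query on some other attribute would have to visit every shard, which is one reason some queries are hard to write.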

Document Stores

● MongoDB

● CouchDB

● Terrastore

MongoDB Examples

Exercise: Insert Freebase Articles to MongoDB

Pseudocode:

Download JSON from Freebase

Insert each result as a separate MongoDB document

Run an example MongoDB query that searches documents based on one attribute

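pymongo example:

import pymongo

# Connect to a MongoDB server on the default local port
conn = pymongo.MongoClient('localhost', 27017)
db = conn['test_database']
coll = db['test_collection']

# Insert one document and keep its generated _id
doc = {"Name": "Benny", "Password": "Pancake"}
docID1 = coll.insert_one(doc).inserted_id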
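
The last two exercise steps might look like the sketch below. `store_results` and `find_by_attribute` are hypothetical helpers that rely only on pymongo's `Collection.insert_many` and `Collection.find`. `FakeCollection` is an in-memory stand-in, added as an assumption so the sketch runs without a MongoDB server; it copies just those two methods and is not part of pymongo.

```python
from types import SimpleNamespace


def store_results(coll, results):
    """Insert each result as its own document and return the new ids."""
    return coll.insert_many(results).inserted_ids


def find_by_attribute(coll, field, value):
    """Return all documents whose field equals value."""
    # find() takes a filter document; {field: value} matches on equality.
    return list(coll.find({field: value}))


class FakeCollection:
    """In-memory stand-in for a pymongo collection (assumption, for testing)."""

    def __init__(self):
        self.docs = []

    def insert_many(self, documents):
        start = len(self.docs)
        for offset, doc in enumerate(documents):
            self.docs.append(dict(doc, _id=start + offset))
        ids = list(range(start, len(self.docs)))
        return SimpleNamespace(inserted_ids=ids)

    def find(self, query):
        return [
            doc for doc in self.docs
            if all(doc.get(k) == v for k, v in query.items())
        ]
```

With a real server, passing a pymongo collection object to `store_results` and `find_by_attribute` in place of `FakeCollection()` should work the same way.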

Final Project
