"Comics" Is Hard: Alternative Databases

Given at Developer Day Boston on August 15th

Page 1: "Comics" Is Hard: Alternative Databases

“Comics” Is Hard: Alternative Databases

Ben Scofield, Viget Labs

1

Page 2: "Comics" Is Hard: Alternative Databases

Modeling

flickr: bunchofpants

2

Page 3: "Comics" Is Hard: Alternative Databases

Biology

3

Page 4: "Comics" Is Hard: Alternative Databases

Linnean taxonomy

4

Page 5: "Comics" Is Hard: Alternative Databases

Kingdom: Animalia
Phylum: Chordata
Class: Mammalia
Order: Carnivora
Family: Felidae
Genus: Panthera
Species: tigris

flickr: pandiyan

5

Page 7: "Comics" Is Hard: Alternative Databases

kingdom → phylum → class → order → family → genus → species → organism

6

Page 8: "Comics" Is Hard: Alternative Databases

Problem: The levels are imaginary

7

Page 9: "Comics" Is Hard: Alternative Databases

kingdom → phylum → class → order → family → genus → species → organism

Additional ranks: subphylum, superclass, subclass, superorder, suborder, superfamily, subfamily, subgenus, subspecies, variety

8

Page 10: "Comics" Is Hard: Alternative Databases

kingdom → phylum → class → order → family → genus → species → organism

Additional ranks: subphylum, superclass, subclass, superorder, suborder, superfamily, subfamily, subgenus, subspecies, variety

?

8

Page 11: "Comics" Is Hard: Alternative Databases

taxon → taxon → taxon → species → organism

variety

subspecies

9
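The generic "taxon" chain above can be sketched in plain Ruby as one self-referential class. This is only an illustration under assumptions: the names `Taxon`, `rank`, and `lineage` are hypothetical and do not come from the talk, and the rank is stored as a free-form label so that extra levels (subfamily, subgenus, and so on) need no new model.

```ruby
# Hypothetical sketch: a single self-referential Taxon class instead of
# one model per rank. Class and method names are illustrative only.
class Taxon
  attr_reader :name, :rank, :parent

  def initialize(name, rank, parent = nil)
    @name = name
    @rank = rank      # free-form label, so new ranks need no schema change
    @parent = parent  # nil for the topmost taxon we know about
  end

  # Walk parent links upward and return names from the top down to self.
  def lineage
    node = self
    names = []
    while node
      names.unshift(node.name)
      node = node.parent
    end
    names
  end
end

felidae     = Taxon.new("Felidae", "family")
pantherinae = Taxon.new("Pantherinae", "subfamily", felidae)
panthera    = Taxon.new("Panthera", "genus", pantherinae)
tigris      = Taxon.new("tigris", "species", panthera)

p tigris.lineage
```

Inserting the subfamily between family and genus only required one more object, not a new class.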

Page 12: "Comics" Is Hard: Alternative Databases

Species

flickr: cpurrin1

10

Page 13: "Comics" Is Hard: Alternative Databases

Reproductive Isolation

flickr: superciliousness

11

Page 15: "Comics" Is Hard: Alternative Databases

12

Page 16: "Comics" Is Hard: Alternative Databases

13

Page 17: "Comics" Is Hard: Alternative Databases

flickr: niznoz

14

Page 18: "Comics" Is Hard: Alternative Databases

Numerical taxonomy

15

Page 19: "Comics" Is Hard: Alternative Databases

Cladistics

16

Page 20: "Comics" Is Hard: Alternative Databases

taxon taxon organism

clade

17

Page 21: "Comics" Is Hard: Alternative Databases

Problem: Cladistics is historical and counter-intuitive

18

Page 22: "Comics" Is Hard: Alternative Databases

flickr: goellnitz

19

Page 23: "Comics" Is Hard: Alternative Databases

flickr: goellnitz

flickr: pcoin

19

Page 24: "Comics" Is Hard: Alternative Databases

The Challenge: Unclear, imprecise domain

20

Page 25: "Comics" Is Hard: Alternative Databases

comics

21

Page 26: "Comics" Is Hard: Alternative Databases

publisher title issue

22

Page 27: "Comics" Is Hard: Alternative Databases

23

Page 28: "Comics" Is Hard: Alternative Databases

publisher imprint title

issue

24

Page 29: "Comics" Is Hard: Alternative Databases

25

Page 30: "Comics" Is Hard: Alternative Databases

publisher imprint title

issue volume

26

Page 31: "Comics" Is Hard: Alternative Databases

27

Page 32: "Comics" Is Hard: Alternative Databases

27

Page 33: "Comics" Is Hard: Alternative Databases

28

Page 34: "Comics" Is Hard: Alternative Databases

29

Page 35: "Comics" Is Hard: Alternative Databases

publisher imprint title

issue volume

trade

30

Page 36: "Comics" Is Hard: Alternative Databases

31

Page 37: "Comics" Is Hard: Alternative Databases

31

Page 38: "Comics" Is Hard: Alternative Databases

32

Page 39: "Comics" Is Hard: Alternative Databases

32

Page 40: "Comics" Is Hard: Alternative Databases

33

Page 41: "Comics" Is Hard: Alternative Databases

33

Page 42: "Comics" Is Hard: Alternative Databases

publisher imprint title

issue volume

trade variant

34

Page 43: "Comics" Is Hard: Alternative Databases

35

Page 44: "Comics" Is Hard: Alternative Databases

35

Page 45: "Comics" Is Hard: Alternative Databases

36

Page 46: "Comics" Is Hard: Alternative Databases

36

Page 47: "Comics" Is Hard: Alternative Databases

publisher imprint title

issue volume

trade variant

name

37

Page 48: "Comics" Is Hard: Alternative Databases

38

Page 49: "Comics" Is Hard: Alternative Databases

38

Page 50: "Comics" Is Hard: Alternative Databases

38

Page 51: "Comics" Is Hard: Alternative Databases

39

Page 52: "Comics" Is Hard: Alternative Databases

39

Page 53: "Comics" Is Hard: Alternative Databases

publisher imprint title

issue

nested set?

volume

trade variant

name

?

40
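The "nested set?" question above refers to storing a tree as left/right bounds so that a whole subtree can be fetched with one range comparison. A minimal plain-Ruby sketch follows; the tree contents reuse names that appear elsewhere in this deck, while the helper names `number` and `descendants` are hypothetical. Note that the technique assumes a strict tree, so an item that belongs under two parents would not fit without duplication.

```ruby
# Hypothetical sketch of nested-set numbering over a strict tree.
tree = {
  "DC" => {
    "DC Universe" => {
      "Green Lantern" => {
        "Volume 1" => { "Issue #1" => {}, "Issue #2" => {} }
      }
    }
  }
}

# Depth-first walk: a node gets its left bound on entry and its right
# bound after all of its children have been numbered.
def number(tree, counter = [0], rows = {})
  tree.each do |name, children|
    counter[0] += 1
    left = counter[0]
    number(children, counter, rows)
    counter[0] += 1
    rows[name] = [left, counter[0]]
  end
  rows
end

# Descendants are the rows whose bounds fall strictly inside the parent's.
def descendants(rows, name)
  left, right = rows.fetch(name)
  rows.select { |_, bounds| bounds[0] > left && bounds[1] < right }.keys
end

rows = number(tree)
p rows["DC"]
p descendants(rows, "Green Lantern").sort
```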

Page 54: "Comics" Is Hard: Alternative Databases

41

Page 55: "Comics" Is Hard: Alternative Databases

42

Page 56: "Comics" Is Hard: Alternative Databases

publisher

imprint

title

issue

nested set?

volume

trade variant

name

storyline ?!?!

43

Page 57: "Comics" Is Hard: Alternative Databases

horror

superhero

Martial Arts

noir

Pirate

science fiction

independent

historical

genres?

44

Page 58: "Comics" Is Hard: Alternative Databases

publisher imprint

title

issue

nested set?

volume

trade variant

name

storyline

genre @#&*!

45

Page 59: "Comics" Is Hard: Alternative Databases

The Challenge: Complete insanity

46

Page 60: "Comics" Is Hard: Alternative Databases

databases unite!

Alternatives

flickr: ikhnaton2

47

Page 62: "Comics" Is Hard: Alternative Databases

Key-Value

49

Page 63: "Comics" Is Hard: Alternative Databases

Cassandra*

Tokyo Cabinet

Redis

Project Voldemort

50

Page 64: "Comics" Is Hard: Alternative Databases

require "rubygems"
require "tokyocabinet"

include TokyoCabinet

bdb = BDB::new # B-Tree database; keys may have multiple values
bdb.open("casket.bdb", BDB::OWRITER | BDB::OCREAT)

# store records in the database, allowing duplicates
bdb.putdup("key1", "value1")
bdb.putdup("key1", "value2")
bdb.put("key2", "value3")
bdb.put("key3", "value4")

# retrieve all values
p bdb.getlist("key1")
# => ["value1", "value2"]

# range query, find all matching keys
p bdb.range("key1", true, "key3", true)
# => ["key1", "key2", "key3"]

http://www.igvita.com/2009/02/13/tokyo-cabinet-beyond-key-value-store/

51

Page 65: "Comics" Is Hard: Alternative Databases

Biology x

Comics x

52
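One plausible reading of the two x marks: a key-value store only answers "what is stored under this key", so any hierarchy has to be walked by the application, one lookup per level. The sketch below uses a plain Ruby Hash as a stand-in for the store; the key scheme `taxon:<name>:parent` and the `ancestors` helper are assumptions made for illustration, not something from the talk.

```ruby
# A plain Hash stands in for a key-value store (an assumption for illustration).
store = {
  "taxon:tigris:parent"      => "Panthera",
  "taxon:Panthera:parent"    => "Pantherinae",
  "taxon:Pantherinae:parent" => "Felidae"
}

# Each step up the hierarchy is a separate get; the store itself knows
# nothing about how the values relate to one another.
def ancestors(store, name)
  chain = []
  while (parent = store["taxon:#{name}:parent"])
    chain << parent
    name = parent
  end
  chain
end

p ancestors(store, "tigris")
```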

Page 66: "Comics" Is Hard: Alternative Databases

Configuration ✓

Caching ✓

Translations ✓

53

Page 67: "Comics" Is Hard: Alternative Databases

54

Page 68: "Comics" Is Hard: Alternative Databases

Document

55

Page 69: "Comics" Is Hard: Alternative Databases

56

Page 70: "Comics" Is Hard: Alternative Databases

{ 'name':'Ben Scofield', 'adjective':'awesomesauce'}

{ 'name':'Magic Pony', 'description':'It is a *lie*!'}

57

Page 71: "Comics" Is Hard: Alternative Databases

Biology ✓

Comics x

58

Page 72: "Comics" Is Hard: Alternative Databases

{ 'kingdom':'Animalia', 'phylum':'Chordata', 'subphylum':'Vertebrata', 'class':'Mammalia', 'subclass':'Eutheria', 'order':'Carnivora', 'family':'Felidae', 'subfamily':'Pantherinae', 'genus':'Panthera', 'species':'tigris', 'name':'Wanda'}

59
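The document above carries whichever ranks apply to that organism, and another document can carry a different set. A rough plain-Ruby sketch of that idea follows, with an Array of Hashes standing in for a document collection. The `where` helper is hypothetical, and the second document reuses the rhinovirus ranks shown later in this deck.

```ruby
# Hypothetical in-memory "collection": documents are hashes with no shared schema.
organisms = [
  { "kingdom" => "Animalia", "phylum" => "Chordata", "subphylum" => "Vertebrata",
    "class" => "Mammalia", "subclass" => "Eutheria", "order" => "Carnivora",
    "family" => "Felidae", "subfamily" => "Pantherinae", "genus" => "Panthera",
    "species" => "tigris", "name" => "Wanda" },
  { "group" => "Group IV", "order" => "Picornavirales", "family" => "Picornaviridae",
    "genus" => "Rhinovirus", "species" => "human rhinovirus A" }
]

# Return the documents whose fields match every key/value pair in criteria.
# A document that lacks a key simply does not match.
def where(docs, criteria)
  docs.select { |doc| criteria.all? { |key, value| doc[key] == value } }
end

p where(organisms, { "genus" => "Panthera" }).map { |doc| doc["name"] }
p where(organisms, { "subphylum" => "Vertebrata" }).size
```

Each document lists only the levels it has, which is what the relational models earlier in the deck struggled with. The links between levels are still implicit in the field names.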

Page 73: "Comics" Is Hard: Alternative Databases

Graph

60

Page 74: "Comics" Is Hard: Alternative Databases

Java

AllegroGraph Java / Lisp

61

Page 75: "Comics" Is Hard: Alternative Databases

http://neotechnology.com/why-neo

62

Page 77: "Comics" Is Hard: Alternative Databases

flickr: 9948354@N08

64

Page 78: "Comics" Is Hard: Alternative Databases

Biology ✓

Comics ✓

65

Page 79: "Comics" Is Hard: Alternative Databases

Graph diagram. Nodes: 001 (name: Wanda, age: 3, weight: 300), tigris, Panthera, Pantherinae, Felidae, ..., Animalia. Edge labels: member, speciesof, genusof, subfamilyof.

66

Page 80: "Comics" Is Hard: Alternative Databases

Graph diagram. Nodes: 001 (name: Wanda, age: 3, weight: 300), tigris, Panthera, Pantherinae, Felidae, ..., Animalia. Edge labels: member, speciesof, genusof, subfamilyof. Detail: node 010 (name: tigris, type: species).

66

Page 81: "Comics" Is Hard: Alternative Databases

Graph diagram. Nodes: 002, human rhinovirus A, Rhinovirus, Picornaviridae, Picornavirales, Group IV. Edge labels: member, speciesof, genusof, familyof, orderof.

67

Page 82: "Comics" Is Hard: Alternative Databases

Graph diagram. Nodes: 002, human rhinovirus A, Rhinovirus, Picornaviridae, Picornavirales, Group IV; 001, tigris, Panthera, Pantherinae, Felidae, Carnivora, Mammalia. Edge labels: member, speciesof, genusof, subfamilyof, familyof, orderof. Remaining labels: class, group.

68
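The two lineages in the diagram have different depths and different levels, yet both can be walked the same way. Below is a small plain-Ruby sketch of that traversal. The edge endpoints are my reading of the flattened diagram and should be treated as an assumption; `path_from` is a hypothetical helper, and a real graph database would provide its own traversal API.

```ruby
# Edges are [from, label, to]. Endpoints are inferred from the diagram (an assumption).
edges = [
  ["001", "member", "tigris"],
  ["tigris", "speciesof", "Panthera"],
  ["Panthera", "genusof", "Pantherinae"],
  ["Pantherinae", "subfamilyof", "Felidae"],
  ["Felidae", "familyof", "Carnivora"],
  ["Carnivora", "orderof", "Mammalia"],
  ["002", "member", "human rhinovirus A"],
  ["human rhinovirus A", "speciesof", "Rhinovirus"],
  ["Rhinovirus", "genusof", "Picornaviridae"],
  ["Picornaviridae", "familyof", "Picornavirales"],
  ["Picornavirales", "orderof", "Group IV"]
]

# Follow the outgoing edge from each node until there is none left,
# collecting [label, target] pairs along the way.
def path_from(edges, start)
  path = []
  node = start
  while (edge = edges.find { |from, _label, _to| from == node })
    path << [edge[1], edge[2]]
    node = edge[2]
  end
  path
end

p path_from(edges, "001").map(&:last)
p path_from(edges, "002").map(&:first)
```

The tiger passes through a subfamily and the virus does not; the traversal code does not need to know which levels exist.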

Page 83: "Comics" Is Hard: Alternative Databases

Graph diagram. Nodes: DC, DC Universe, Green Lantern, Volume 1, Sci Fi, Superhero, 001, 002, 003, Issue #2. Edge labels: titleof, imprintof, volumeof, nameof, genreof, issueof, precedes, coverof. Property: number: #1.

69

Page 84: "Comics" Is Hard: Alternative Databases

require 'neo4j'

Neo4j::Transaction.run do
  dc = Neo4j::Node.new
  dc[:name] = 'DC'

  dc_universe = Neo4j::Node.new
  dc_universe[:name] = 'DC Universe'

  dc.relationships.outgoing(:imprints) << dc_universe
  rel = dc.relationships.outgoing(:imprints).first
  rel[:started] = 1980

  vol1 = Neo4j::Node.new
  vol1[:started] = 1941
  vol1[:ended] = 1949
  vol1[:name] = 'Green Lantern'

  dc.relationships.outgoing(:titles) << vol1
  dc_universe.relationships.outgoing(:titles) << vol1

  # ...
end

70

Page 85: "Comics" Is Hard: Alternative Databases

flickr: joriel

Hybrid Solutions

71

Page 86: "Comics" Is Hard: Alternative Databases

72

Page 87: "Comics" Is Hard: Alternative Databases

72

Page 88: "Comics" Is Hard: Alternative Databases

tag post

document id

{ 'title':'Post Title', 'content':'Hello!', 'comments':["First!"]}

73
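The hybrid arrangement above, relational rows that carry a document id, might look roughly like this in plain Ruby. Arrays of hashes stand in for the relational tables and a Hash stands in for the document store; the column names, the tag name "databases", and the `load_post` helper are all hypothetical.

```ruby
# Stand-ins: arrays of rows for the relational side, a Hash for the document store.
posts     = [{ id: 1, document_id: "doc-1" }]
tags      = [{ id: 1, post_id: 1, name: "databases" }]
documents = {
  "doc-1" => { "title" => "Post Title", "content" => "Hello!", "comments" => ["First!"] }
}

# Relational lookups find the row and its tags; the free-form body is
# then fetched from the document store by the id stored on the row.
def load_post(posts, tags, documents, id)
  row = posts.find { |post| post[:id] == id }
  return nil unless row

  {
    tags: tags.select { |tag| tag[:post_id] == id }.map { |tag| tag[:name] },
    document: documents.fetch(row[:document_id])
  }
end

p load_post(posts, tags, documents, 1)
```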

Page 89: "Comics" Is Hard: Alternative Databases

74

Page 90: "Comics" Is Hard: Alternative Databases

user accounts and whatnot

{ 'name':'...', 'attr1':'...', 'attr7':'...'}

search

75

Page 91: "Comics" Is Hard: Alternative Databases

Thank You

ben scofield - @bscofield

http://benscofield.com

http://www.viget.com/extend

http://www.speakerrate.com/bscofield

76