8/4/2019 Cam List Ore Talk 2011-02-01
Camlistore
http://camlistore.org/
Brad Fitzpatrick, [email protected]
What is Camlistore?
20% project, personal itch to scratch
  o sick of building CMS-like systems
  o livejournal, photos, brackup, scanning cabinet, my websites, ...
  o hacking since jun 2010, idle planning for ~year before that.
storage system?
  o yeah, but not the whole story. "a way to store, sync, share, model and back up content" ...
Religion (or lack thereof)
"cloud" or local? both.
  o for storage
  o for content creation (sync adapters)
programming language? any, all.
  o interfaces are what matters.
identity / verifiableness
you should own your content, have backup
dorks are great, but this must be usable.
  o mental model: "Your stuff. Can't screw it up."
oh, right, it's an acronym...
Content-Addressable, Multi-Layer, Indexed, Storage
Content-Addressable
at the bottom-most layer, everything is addressed by its digest of bytes
Terminology:
"Blob" -- 0 or more bytes. No extra metadata.
"Blobref" -- handle to a blob, in the form digalg-hexdigest, e.g. sha1-8a30407962eeb19b309b78ddf587aea18ab55232
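A minimal sketch of how a blobref could be derived from a blob's bytes, assuming the handle is simply the digest algorithm name, a hyphen, and the lowercase hex digest, as the sha1 example suggests. The function name `blobref` is made up for this illustration and is not taken from the Camlistore code.

```python
import hashlib


def blobref(data: bytes, alg: str = "sha1") -> str:
    """Return a handle of the form '<alg>-<hex digest>' for the given bytes.

    Assumption: the digest covers exactly the blob's bytes and nothing else,
    since a blob carries no metadata.
    """
    digest = hashlib.new(alg, data).hexdigest()
    return f"{alg}-{digest}"


if __name__ == "__main__":
    # The same bytes always produce the same handle.
    print(blobref(b"Hello, world!"))
    print(blobref(b""))
```

Because the name is a pure function of the content, two parties holding the same bytes arrive at the same blobref without coordinating.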
Content-Addressability properties
trivial caching / syncing: you have it or you don't.
  o no "which version do you have?"
content-deduplication
  o multiple users having same content,
  o filesystem backup snapshots
incrementals are cheap etc...
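The properties above can be sketched with two plain dictionaries standing in for blob stores. This is an illustration only: `store_put` and `sync` are hypothetical helper names, and sha1 is assumed as the digest. Deduplication falls out of keying by digest, and syncing reduces to a set difference of blobrefs.

```python
import hashlib


def store_put(store: dict, data: bytes) -> str:
    """Store bytes under their sha1 blobref; identical content is stored once."""
    ref = "sha1-" + hashlib.sha1(data).hexdigest()
    store[ref] = data
    return ref


def sync(source: dict, dest: dict) -> int:
    """Copy blobs that dest lacks; return how many were copied.

    There is no version comparison: a blobref is either present or absent.
    """
    missing = set(source) - set(dest)
    for ref in missing:
        dest[ref] = source[ref]
    return len(missing)


if __name__ == "__main__":
    laptop, server = {}, {}
    store_put(laptop, b"photo bytes")
    store_put(laptop, b"photo bytes")  # duplicate content, no new entry
    store_put(laptop, b"notes")
    print(len(laptop))           # number of distinct blobs
    print(sync(laptop, server))  # first sync copies everything
    print(sync(laptop, server))  # second sync has nothing to do
```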
Multi-Layer
Unix school of thought:
  o small, well-defined composable tools
Camlistore has multiple layers / parts:
  o blob server: super dumb
  o schema: how one might represent data
  o search/indexer: make sense of dumbness, data
  o frontend: interact with world, sharing.
Architecture
Blob Server: how dumb it is...
Private operations, to owner of data only:
  o get(blobref) -> blob
  o put(blobref, blob)
  o enumerate(..) -> [(blobref, size), ...]
Public/non-owner operations:
  o none!
GET /camli/sha1-xxxxxxxxx HTTP/1.0
....
Hello, world!
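The three private operations can be sketched as an in-memory class. This is not the Camlistore server: the class name `BlobServer`, the digest check on `put`, the sha1-only restriction, and the sorted order of `enumerate` are all assumptions made for the illustration.

```python
import hashlib


class BlobServer:
    """In-memory stand-in for a blob server: get, put, enumerate, nothing else."""

    def __init__(self):
        self._blobs = {}

    def put(self, blobref: str, blob: bytes) -> None:
        # Assumption: the server rejects blobs whose bytes do not match the name.
        alg, _, hexdigest = blobref.partition("-")
        if alg != "sha1":
            raise ValueError("this sketch only handles sha1 blobrefs")
        if hashlib.sha1(blob).hexdigest() != hexdigest:
            raise ValueError("blobref does not match blob contents")
        self._blobs[blobref] = blob

    def get(self, blobref: str) -> bytes:
        return self._blobs[blobref]

    def enumerate(self):
        # Returns (blobref, size) pairs; sorting is a choice made here.
        return [(ref, len(blob)) for ref, blob in sorted(self._blobs.items())]


if __name__ == "__main__":
    server = BlobServer()
    data = b"Hello, world!"
    ref = "sha1-" + hashlib.sha1(data).hexdigest()
    server.put(ref, data)
    print(server.get(ref))
    print(server.enumerate())
```

Note that nothing in the interface accepts or returns a filename, MIME type, or timestamp.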
Blob Server: dumbness continued
so, just blobs.
remember: no metadata
no "filenames"
no "mime types"
no "{create,mod,access} time"
nothing
seriously, no metadata!
  o we will fight you. (and have)
Uh, what can you do with that?
Not a terrible lot.
But let's start with an easy example at this layer...
Filesystem backups
previous project: brackup
  o slice/dice/encrypt S3 backup, content-addressed, but only files C-A, not dirs.
fossil/venti, git: recursive directories content-addressed
  o git: "tree objects"
camlistore: "schema blobs"
Schema Blobs
so if all blobs are just dumb blobs of bytes with no metadata, how do you store metadata?
as blobs themselves!
how to recognize it? same way you sniff a JPEG. magic.
start with a '{'? parse as JSON?
in-memory schema -> JSON object serialization with "camliVersion" key == "schema blob"
Schema Blob
Minimal "schema blob" is:
{
  "camliVersion": 1,
  "camliType": "whatever"
}
Whitespace doesn't matter. Just must be valid JSON in its entirety. Use whatever JSON libraries you've got.
That one is named sha1-19e851fe3eb3d1f3d9d1cefe9f92c6f3c7d754f6
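The sniffing rule described here (leading '{', valid JSON, a "camliVersion" key) could be sketched as below. The function name `parse_schema_blob` is hypothetical, and checking the very first byte without skipping leading whitespace is an assumption of this sketch.

```python
import json


def parse_schema_blob(blob: bytes):
    """Return the parsed JSON object if the blob looks like a schema blob, else None."""
    # Cheap magic-byte check first, the way one would sniff a JPEG header.
    if not blob.startswith(b"{"):
        return None
    try:
        obj = json.loads(blob.decode("utf-8"))
    except ValueError:
        # Covers both invalid UTF-8 and invalid JSON.
        return None
    if isinstance(obj, dict) and "camliVersion" in obj:
        return obj
    return None


if __name__ == "__main__":
    print(parse_schema_blob(b'{"camliVersion": 1, "camliType": "whatever"}'))
    print(parse_schema_blob(b"\xff\xd8\xff\xe0 not a schema blob"))
```

Any blob that fails the check is simply treated as opaque bytes.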
Schema Blob; type "file"
{
  "camliVersion": 1,
  "camliType": "file",
  "fileName": "foo.dat",
  "unixPermission": "0644",
  ...,
  "size": 6000133,
  "contentParts": [
    {"blobRef": "sha1-...dead", "size": 111},
    {"blobRef": "sha1-...beef", "size": 5000000, "offset": 492 },
    {"size": 1000000},
    {"blobRef": "digalg-blobref", "size": 22},
  ]
}
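One way to read the "contentParts" list is as a recipe for reassembling the file. The sketch below assumes, without confirmation from the slides, that "offset" is a byte offset into the referenced blob and that a part with no "blobRef" is a run of zero bytes. `read_file` and `get_blob` are hypothetical names.

```python
import hashlib


def read_file(schema: dict, get_blob) -> bytes:
    """Concatenate the parts of a "file" schema blob into the file's bytes.

    get_blob is any callable mapping a blobref to its bytes.
    """
    out = bytearray()
    for part in schema["contentParts"]:
        size = part["size"]
        ref = part.get("blobRef")
        if ref is None:
            out += bytes(size)  # assumed meaning: a hole filled with zeros
            continue
        offset = part.get("offset", 0)  # assumed meaning: offset into the blob
        chunk = get_blob(ref)[offset:offset + size]
        if len(chunk) != size:
            raise ValueError(f"blob {ref} is too short for this part")
        out += chunk
    if len(out) != schema["size"]:
        raise ValueError("part sizes do not add up to the declared size")
    return bytes(out)


if __name__ == "__main__":
    blobs = {}
    for data in (b"hello", b"xxWORLDyy"):
        blobs["sha1-" + hashlib.sha1(data).hexdigest()] = data
    first, second = list(blobs)
    schema = {
        "camliVersion": 1,
        "camliType": "file",
        "size": 13,
        "contentParts": [
            {"blobRef": first, "size": 5},
            {"blobRef": second, "size": 5, "offset": 2},
            {"size": 3},
        ],
    }
    print(read_file(schema, blobs.__getitem__))
```

In the slide's example the part sizes (111 + 5000000 + 1000000 + 22) add up to the declared "size" of 6000133, which is the invariant the final check enforces.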
Schema Blob; type "directory"
{
  "camliVersion": 1,
  "camliType": "directory",
  "fileName": "foodir",
  "unixPermission": "0755",
  ...,
  "entries": "sha1-c3764bc2138338d5e2936def18ff8cc9cda38455",
}
Schema Blob; type "static-set"
{
  "camliVersion": 1,
  "camliType": "static-set",
  "members": [
    "sha1-xxxxxxxxxxxx",
    "sha1-xxxxxxxxxxxx",
    "sha1-xxxxxxxxxxxx",
    "sha1-xxxxxxxxxxxx",
    "sha1-xxxxxxxxxxxx",
    "sha1-xxxxxxxxxxxx",
  ],
}
Backup your filesystem...
$ camput --file $HOME
sha1-8659a52f726588dc44d38dfb22d84a4da2902fed
(like git/hg/fossil, that identifier represents everything down.)
Iterative backups are cheap, easy identifier to share, etc.
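Why one identifier can represent everything below it, and why repeat backups are cheap, can be sketched by building "file", "static-set" and "directory" schema blobs recursively over an in-memory tree. This is not how camput is implemented: the helper names, the single-part files, and the sorted-key JSON serialization are assumptions made to keep the example small and deterministic.

```python
import hashlib
import json


def put(store: dict, data: bytes) -> str:
    """Store raw bytes under their sha1 blobref."""
    ref = "sha1-" + hashlib.sha1(data).hexdigest()
    store[ref] = data
    return ref


def put_schema(store: dict, obj: dict) -> str:
    """Serialize a schema object deterministically and store it as a blob."""
    return put(store, json.dumps(obj, sort_keys=True).encode("utf-8"))


def put_tree(store: dict, name: str, node) -> str:
    """Store a file (bytes) or directory (dict of name -> node); return its blobref."""
    if isinstance(node, bytes):
        content = put(store, node)
        return put_schema(store, {
            "camliVersion": 1, "camliType": "file", "fileName": name,
            "size": len(node),
            "contentParts": [{"blobRef": content, "size": len(node)}],
        })
    members = [put_tree(store, child, node[child]) for child in sorted(node)]
    entries = put_schema(store, {
        "camliVersion": 1, "camliType": "static-set", "members": members,
    })
    return put_schema(store, {
        "camliVersion": 1, "camliType": "directory", "fileName": name,
        "entries": entries,
    })


if __name__ == "__main__":
    store = {}
    home = {"notes.txt": b"hi", "pics": {"a.jpg": b"\xff\xd8", "b.jpg": b"\xff\xd9"}}
    print(put_tree(store, "home", home), len(store))
    home["notes.txt"] = b"hi again"
    print(put_tree(store, "home", home), len(store))
```

Changing one file produces a new root identifier, but the untouched "pics" subtree hashes to the same blobs as before, so only the changed path has to be stored again.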
But what about mutable data?
immutable data is easy to represent & reference
how to represent mutable data in an immutable, content-addressed world?
how to share a reference to a mutable object when changing an object mutates its name?
Objects & Permanodes
Terminology...
"object": a set of blobs representing a mutable object. you modify an object by adding a new mutation claim blob to the set.
"signed schema blob" or "claim": a schema blob that you JSON-sign. (OpenPGP) aside: bootstrapping tools for this.
"permanode": a blob that's just a signed schema blob of a random number that serves as the anchor and reference point for the blob. like a "permalink" on the web.
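The idea of a stable anchor plus an append-only set of claims can be sketched as follows. Signing is left out entirely, and the claim fields used here ("permaNode", "claimDate", "attribute", "value") are hypothetical names chosen for the illustration, not a statement of the actual claim schema.

```python
import hashlib
import json
import secrets


def put_schema(store: dict, obj: dict) -> str:
    data = json.dumps(obj, sort_keys=True).encode("utf-8")
    ref = "sha1-" + hashlib.sha1(data).hexdigest()
    store[ref] = data
    return ref


def new_permanode(store: dict) -> str:
    """An (unsigned, in this sketch) blob of random data; its blobref never changes."""
    return put_schema(store, {
        "camliVersion": 1, "camliType": "permanode",
        "random": secrets.token_hex(16),
    })


def add_claim(store: dict, permanode: str, date: str, attribute: str, value: str) -> str:
    """Mutate the object by adding a new immutable claim blob that points at it."""
    return put_schema(store, {
        "camliVersion": 1, "camliType": "claim", "permaNode": permanode,
        "claimDate": date, "attribute": attribute, "value": value,
    })


def current_state(store: dict, permanode: str) -> dict:
    """Replay all claims about a permanode in date order; later claims win."""
    claims = []
    for data in store.values():
        try:
            obj = json.loads(data)
        except ValueError:
            continue  # not JSON, so not a schema blob
        if (isinstance(obj, dict) and obj.get("camliType") == "claim"
                and obj.get("permaNode") == permanode):
            claims.append(obj)
    state = {}
    for claim in sorted(claims, key=lambda c: c["claimDate"]):
        state[claim["attribute"]] = claim["value"]
    return state


if __name__ == "__main__":
    store = {}
    node = new_permanode(store)
    add_claim(store, node, "2011-02-01T10:00:00Z", "title", "Draft")
    add_claim(store, node, "2011-02-01T11:00:00Z", "title", "Final")
    print(node, current_state(store, node))
```

The reference that gets shared is the permanode's blobref, which stays fixed while the set of claims about it grows.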
Permanode
$ camput --permanode
sha1-ea799271abfbf85d8e22e4577f15f704c8349026
$ camget sha1-ea799271abfbf85d8e22e4577f15f704c8349026
{"camliVersion": 1,
  "camliSigner": "sha1-c4da9d771661563a27704b91b67989e7ea1e50b8",
  "camliType": "permanode",
  "random": "oj)r}$Wa/[J|XQThNdhE"
,"camliSig":"iQEcBAABAgAGBQJNRxceAAoJEGjzeDN/6vt8ihIH/Aov7FRIq4dODAPWGDwqL1X9Ko2ZtSSO1lwHxCQVdCMquDtAdI3387fDlEG/ALoT/LhmtXQgYTt8QqDxVduEK1or6/jqo3RMQ8tTgZ+rW2cj9f3Q/dg7el0Ngoq03hyYXdo3whxCH2x0jajSt4RCcgdXN6XmLlOgD/LVQEJ303Du1OhCvKX1A40BIdwe1zxBc5zkLmoa8rClAlHdqwogxYFY4cwFm+jJM5YhSPemNrDe8W7KT6r0oA7SVfOan1NbIQUel65xwIZBD0ah
CXBx6WXvfId6AdiahnbZiBup1fWSzxeeW7Y2/RQwv5IZ8UgfBqRHvnxcbNmScrzlp3V3ZoY==BfKn"}
Search / Indexer...
Search / Indexer
subscribes to blobs in real-time
  o or enumerates / mapreduces world on init
builds index of:
  o directed blob graph,
  o resolved attributes,
  o set memberships,
  o dates, tags, ...
  o whatever's needed
eventually consistent
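The "enumerate the world on init" path and the directed blob graph can be sketched as a single pass over all blobs. This is an illustration under assumptions: `build_index` is a hypothetical name, only 40-hex-digit sha1 blobrefs are recognized, and any string of that shape inside a schema blob is treated as an edge.

```python
import hashlib
import json
import re
from collections import defaultdict

# Assumption: only sha1 blobrefs with a full 40-hex-digit digest count as references.
BLOBREF = re.compile(r"^sha1-[0-9a-f]{40}$")


def refs_in(value):
    """Yield every blobref-shaped string found anywhere inside a JSON value."""
    if isinstance(value, str):
        if BLOBREF.match(value):
            yield value
    elif isinstance(value, list):
        for item in value:
            yield from refs_in(item)
    elif isinstance(value, dict):
        for item in value.values():
            yield from refs_in(item)


def build_index(blobs):
    """Build forward and reverse edges from an iterable of (blobref, bytes) pairs."""
    outgoing = defaultdict(set)  # schema blob -> blobs it refers to
    incoming = defaultdict(set)  # blob -> schema blobs referring to it
    for ref, data in blobs:
        if not data.startswith(b"{"):
            continue  # plain data blob, nothing to index
        try:
            obj = json.loads(data)
        except ValueError:
            continue
        if not (isinstance(obj, dict) and "camliVersion" in obj):
            continue
        for target in refs_in(obj):
            outgoing[ref].add(target)
            incoming[target].add(ref)
    return outgoing, incoming


if __name__ == "__main__":
    store = {}

    def put(data: bytes) -> str:
        ref = "sha1-" + hashlib.sha1(data).hexdigest()
        store[ref] = data
        return ref

    photo = put(b"\xff\xd8 jpeg bytes")
    album = put(json.dumps({"camliVersion": 1, "camliType": "static-set",
                            "members": [photo]}).encode("utf-8"))
    outgoing, incoming = build_index(store.items())
    print(outgoing[album], incoming[photo])
```

The reverse map answers set-membership questions such as "which sets contain this blob?". Because the index is derived entirely from blobs, it can be discarded and rebuilt, which fits the "eventually consistent" description.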
Privacy Model & Sharing
all your blobs & searches are private
  o nothing public by default
to share something (a blob, object, or search query e.g. "recent public photos of mine") you create a "share" claim
  o claim = "signed schema blob"
demo: http://camlistore.org/docs/sharing
Questions? http://camlistore.org/