MARC, XML, JSON And Other Stuff Like That So We Can All Speak A Common Language And Build The Future Together MJ Suhonos Code4Lib North 2013

Code4Lib North 2013: Metadata

Page 1: Code4Lib North 2013: Metadata

MARC, XML, JSON

And Other Stuff Like That So We Can All Speak A Common Language And Build The Future Together

MJ Suhonos

Code4Lib North 2013

Page 2: Code4Lib North 2013: Metadata

WARNING

MAY CONTAIN:

Code-Like Examples

Gross Simplifications

No Unicorns

Page 3: Code4Lib North 2013: Metadata

ISO-2709

01041cam 2200265 a 4500001002000000003000400020005001700024008004100041010002400082020002500106020004400131040001800175050002400193082001800217100003200235245008700267246003600354250001200390260003700402300002900439500004200468520022000510650003300730650001200763^###89048230#/AC/r91^DLC^19911106082810.9^891101s1990####maua###j######000#0#eng##^##$a###89048230#/AC/r91^##$a0316107514 :$c$12.95^##$a0316107506 (pbk.) :$c$5.95 ($6.95 Can.)^##$aDLC$cDLC$dDLC^00$aGV943.25$b.B74 1990^00$a796.334/2$220^10$aBrenner, Richard J.,$d1941-^10$aMake the team.$pSoccer :$ba heads up guide to super soccer! /$cRichard J. Brenner.^30$aHeads up guide to super soccer.^##$a1st ed.^##$aBoston :$bLittle, Brown,$cc1990.^##$a127 p. :$bill. ;$c19 cm.^##$a"A Sports illustrated for kids book."^##$aInstructions for improving soccer skills. Discusses dribbling, heading, playmaking, defense, conditioning, mental attitude, how to handle problems with coaches, parents, and other players, and the history of soccer.^#0$aSoccer$vJuvenile literature.^#1$aSoccer.^\
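The long run of digits after the leader is the record's directory. In MARC 21, each directory entry is 12 characters: a 3-character tag, a 4-digit field length, and a 5-digit starting position. As a rough sketch (the names `DirectoryEntry` and `parse_directory` are made up for this illustration, and only the first three entries of the record are used), the directory could be split like this:

```python
from typing import NamedTuple


class DirectoryEntry(NamedTuple):
    tag: str     # 3-character field tag, e.g. "245"
    length: int  # field length in bytes, including the field terminator
    start: int   # offset of the field relative to the base address of data


def parse_directory(directory: str) -> list[DirectoryEntry]:
    """Split a MARC 21 directory string into its 12-character entries."""
    if len(directory) % 12 != 0:
        raise ValueError("directory length must be a multiple of 12")
    entries = []
    for i in range(0, len(directory), 12):
        chunk = directory[i:i + 12]
        entries.append(DirectoryEntry(chunk[:3], int(chunk[3:7]), int(chunk[7:12])))
    return entries


# First three directory entries, copied from the record on this slide.
sample = "001002000000" "003000400020" "005001700024"
entries = parse_directory(sample)
for entry in entries:
    print(entry)
```

Each entry's start plus its length gives the next entry's start (0 + 20 = 20, 20 + 4 = 24), which is how a reader finds a field without scanning the whole record.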

Page 4: Code4Lib North 2013: Metadata

“MARC”

01041cam 2200265 a 4500
001 ###89048230
003 DLC
005 19911106082810.9
008 891101s1990 maua j 001 0 eng
010 ## $a ###89048230
020 ## $a 0316107514 : $c $12.95
020 ## $a 0316107506 (pbk.) : $c $5.95 ($6.95 Can.)
040 ## $a DLC $c DLC $d DLC
050 00 $a GV943.25 $b .B74 1990
082 00 $a 796.334/2 $2 20
100 1# $a Brenner, Richard J., $d 1941-
245 10 $a Make the team. $p Soccer : $b a heads up guide to super soccer! / $c Richard J. Brenner.
246 30 $a Heads up guide to super soccer
250 ## $a 1st ed.
260 ## $a Boston : $b Little, Brown, $c c1990.
300 ## $a 127 p. : $b ill. ; $c 19 cm.
500 ## $a "A Sports illustrated for kids book."
520 ## $a Instructions for improving soccer skills. Discusses dribbling, heading, playmaking, defense, conditioning, mental attitude, how to handle problems with coaches, parents, and other players, and the history of soccer.
650 #0 $a Soccer $v Juvenile literature.
650 #1 $a Soccer.

Page 5: Code4Lib North 2013: Metadata

MARCXML

<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>01048cam a2200277 a 4500</leader>
    <controlfield tag="001"> 89048230 </controlfield>
    <controlfield tag="003">DLC</controlfield>
    <controlfield tag="005">19990716000000.0</controlfield>
    <controlfield tag="008">891101s1990 maua j 000 0 eng </controlfield>
    <datafield tag="010" ind1=" " ind2=" ">
      <subfield code="a"> 89048230 </subfield>
    </datafield>
    <datafield tag="020" ind1=" " ind2=" ">
      <subfield code="a">0316107514 :</subfield>
      <subfield code="c">$12.95</subfield>
    </datafield>
    <datafield tag="020" ind1=" " ind2=" ">
      <subfield code="a">0316107506 (pbk.) :</subfield>
      <subfield code="c">$5.95 ($6.95 Can.)</subfield>
    </datafield>
    <datafield tag="050" ind1="0" ind2="0">
      <subfield code="a">GV943.25</subfield>
      <subfield code="b">.B74 1990</subfield>
    </datafield>
    <datafield tag="082" ind1="0" ind2="0">
      <subfield code="a">796.334/2</subfield>
      <subfield code="2">20</subfield>
    </datafield>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="a">Brenner, Richard J.,</subfield>
      <subfield code="d">1941-</subfield>
    </datafield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Make the team.</subfield>
      <subfield code="p">Soccer :</subfield>
      <subfield code="b">a heads up guide to super soccer! /</subfield>
      <subfield code="c">Richard J. Brenner.</subfield>
    </datafield>
    <!-- ..................... -->
  </record>
</collection>
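A minimal sketch of reading this serialization with Python's standard library. The helper name `subfields` is invented for this example, and the input is trimmed to the 245 field from the slide; the only thing taken from MARCXML itself is the namespace URI and the element and attribute names.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <collection> element in the slide.
NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Trimmed copy of the slide's record: only the 245 (title) field is kept.
MARCXML = """<collection xmlns="http://www.loc.gov/MARC21/slim"><record>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Make the team.</subfield>
    <subfield code="p">Soccer :</subfield>
    <subfield code="b">a heads up guide to super soccer! /</subfield>
    <subfield code="c">Richard J. Brenner.</subfield>
  </datafield>
</record></collection>"""


def subfields(xml_text: str, tag: str) -> list[tuple[str, str]]:
    """Return (code, value) pairs for every datafield with the given tag."""
    root = ET.fromstring(xml_text)
    pairs = []
    for field in root.findall(f".//marc:datafield[@tag='{tag}']", NS):
        for sf in field.findall("marc:subfield", NS):
            pairs.append((sf.get("code"), sf.text))
    return pairs


title_parts = subfields(MARCXML, "245")
print(title_parts)
```

Note that the values still carry the trailing " :" and " /" from the original record: changing the serialization did not change the schema.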

Page 6: Code4Lib North 2013: Metadata

MARC-HASH

{
  "fields": [
    ["001", " 89048230 "],
    ["003", "DLC"],
    ["005", "19990716000000.0"],
    ["008", "891101s1990 maua j 000 0 eng "],
    ["010", " ", " ", [["a", " 89048230 "]]],
    ["020", " ", " ", [["a", "0316107514 :"], ["c", "$12.95"]]],
    ["020", " ", " ", [["a", "0316107506 (pbk.) :"], ["c", "$5.95 ($6.95 Can.)"]]],
    ["040", " ", " ", [["a", "DLC"], ["c", "DLC"], ["d", "DLC"]]],
    ["042", " ", " ", [["a", "lcac"]]],
    ["050", "0", "0", [["a", "GV943.25"], ["b", ".B74 1990"]]],
    ["082", "0", "0", [["a", "796.334/2"], ["2", "20"]]],
    ["100", "1", " ", [["a", "Brenner, Richard J.,"], ["d", "1941-"]]],
    ["245", "1", "0", [["a", "Make the team."], ["p", "Soccer :"], ["b", "a heads up guide to super soccer! /"], ["c", "Richard J. Brenner."]]],
    ["250", " ", " ", [["a", "1st ed."]]],
    ["260", " ", " ", [["a", "Boston :"], ["b", "Little, Brown,"], ["c", "c1990."]]],
    ["300", " ", " ", [["a", "127 p. :"], ["b", "ill. ;"], ["c", "19 cm."]]],
    ["500", " ", " ", [["a", "\"A Sports illustrated for kids book.\""]]],
    ["520", " ", " ", [["a", "Instructions for improving soccer skills. Discusses dribbling, heading, playmaking, defense, conditioning, mental attitude, how to handle problems with coaches, parents, and other players, and the history of soccer."]]],
    ["650", " ", "0", [["a", "Soccer"], ["x", "Juvenile literature."]]],
    ["650", " ", "1", [["a", "Soccer."]]],
    ["740", "0", " ", [["a", "Heads up guide to super soccer."]]]
  ],
  "leader": "01048cam a2200277 a 4500",
  "type": "marc-hash",
  "version": [1, 0]
}
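Because MARC-HASH is plain JSON, any JSON parser can read it. The sketch below assumes the layout shown on the slide (control fields are `[tag, value]`, data fields are `[tag, ind1, ind2, subfields]`); the helper `field_values` is a hypothetical name, and the record is cut down to two fields.

```python
import json

# Two fields copied from the slide's record; the rest is omitted for brevity.
MARC_HASH = """{
  "type": "marc-hash",
  "version": [1, 0],
  "leader": "01048cam a2200277 a 4500",
  "fields": [
    ["001", " 89048230 "],
    ["245", "1", "0", [["a", "Make the team."], ["p", "Soccer :"],
                        ["b", "a heads up guide to super soccer! /"],
                        ["c", "Richard J. Brenner."]]]
  ]
}"""

record = json.loads(MARC_HASH)


def field_values(record: dict, tag: str) -> list[str]:
    """Return one display string per field carrying the given tag."""
    values = []
    for field in record["fields"]:
        if field[0] != tag:
            continue
        if len(field) == 2:
            # Control field: [tag, value]
            values.append(field[1])
        else:
            # Data field: [tag, ind1, ind2, [[code, value], ...]]
            values.append(" ".join(value for _code, value in field[3]))
    return values


title = field_values(record, "245")
print(title)
```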

Page 7: Code4Lib North 2013: Metadata

• MARC

• MARCXML

• MARC-HASH

What’s the difference?

SERIALIZATION

Page 8: Code4Lib North 2013: Metadata

Serialization: the process of translating data structures into a format that can be stored or transmitted.
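To make the definition concrete, here is a small sketch that writes one in-memory data structure out as both JSON and XML and reads it back. The field names (`title`, `creator`, `date`) and the `<record>` wrapper element are arbitrary choices for the illustration, not part of any standard mentioned in the talk.

```python
import json
import xml.etree.ElementTree as ET

# One data structure in memory...
record = {"title": "Make the team", "creator": "Brenner, Richard J.", "date": "1990"}

# ...serialized as JSON...
as_json = json.dumps(record, sort_keys=True)

# ...and serialized as XML.
root = ET.Element("record")
for key, value in record.items():
    ET.SubElement(root, key).text = value
as_xml = ET.tostring(root, encoding="unicode")

# Deserializing either form recovers the same structure.
from_json = json.loads(as_json)
from_xml = {child.tag: child.text for child in ET.fromstring(as_xml)}

print(as_json)
print(as_xml)
```

The two strings look nothing alike, but both round-trip to the same dictionary, which is the sense in which MARC, MARCXML and MARC-HASH differ only in serialization.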

Page 9: Code4Lib North 2013: Metadata
Page 10: Code4Lib North 2013: Metadata

Unicode Replacement Character

Page 11: Code4Lib North 2013: Metadata

• MARC-8

• ISO-8859-1

• UTF-8

What’s the difference?

ENCODING

Page 12: Code4Lib North 2013: Metadata

Encoding: the process by which characters are converted into byte sequences that can be stored or transmitted.
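A short sketch of what this means in practice, using a sample string chosen for the illustration ("Brontë" is not from the slides). Python's standard library has no MARC-8 codec, so only UTF-8 and ISO-8859-1 are shown.

```python
text = "Brontë"  # sample string; the "ë" is the interesting character

# The same characters become different byte sequences under each encoding.
utf8_bytes = text.encode("utf-8")         # b'Bront\xc3\xab'
latin1_bytes = text.encode("iso-8859-1")  # b'Bront\xeb'

# Reading ISO-8859-1 bytes as UTF-8 fails on the lone 0xEB byte; with
# errors="replace" the decoder substitutes U+FFFD, the Unicode
# Replacement Character.
wrong_as_utf8 = latin1_bytes.decode("utf-8", errors="replace")

# Reading UTF-8 bytes as ISO-8859-1 never fails, it just yields mojibake.
wrong_as_latin1 = utf8_bytes.decode("iso-8859-1")

print(utf8_bytes, latin1_bytes)
print(wrong_as_utf8, wrong_as_latin1)
```

Both failure modes come from guessing the encoding instead of knowing it, which is what "who-knows-what" costs in practice.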

Page 13: Code4Lib North 2013: Metadata
Page 14: Code4Lib North 2013: Metadata

• AACR2

• Dublin Core

• RDA

What’s the difference?

SCHEMA

Page 15: Code4Lib North 2013: Metadata

Schema: metadata standards intended to establish a common understanding of the meaning of data structures and values.

Page 16: Code4Lib North 2013: Metadata

SCHEMA + ENCODING + SERIALIZATION

“Metadata Format”

Page 17: Code4Lib North 2013: Metadata

ISO-2709

SERIALIZATION Binary (multiple flavours) ✗

ENCODING MARC-8, and/or UTF-8, and/or who-knows-what ✗

SCHEMA AACR2 (and maybe RDA) ✓*

* assuming AACR2 is a reasonable data-centric schema, which it is not.

Page 18: Code4Lib North 2013: Metadata

MARCXML

SERIALIZATION XML ✓

ENCODING UTF-8, or ISO-8859-1, or who-knows-what ✗

SCHEMA AACR2 (and maybe RDA) ✓*

* assuming AACR2 is a reasonable data-centric schema, which it is not.

Page 19: Code4Lib North 2013: Metadata

MARC-HASH

SERIALIZATION JSON ✓

ENCODING UTF-8 ✓

SCHEMA AACR2 (and maybe RDA) ✓*

* assuming AACR2 is a reasonable data-centric schema, which it is not.

Page 20: Code4Lib North 2013: Metadata

Ideal Metadata Format

Page 21: Code4Lib North 2013: Metadata

Serialization

• Independent from schema and encoding

• Choose from multiple serializations

• XML-based formats are BAD for this

• PREMIS, MODS, EAD ...

Page 22: Code4Lib North 2013: Metadata

Encoding

• UTF-8 or GTFO

• Otherwise, part of serialization spec

Page 23: Code4Lib North 2013: Metadata

Schemas

Schemae? Schemata?

• Data-centric (like Dublin Core)

• Not markup-centric (like AACR2, HTML)

• Test: if order or punctuation matters, it’s NOT data-centric
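One rough way to apply this test in code, using the 245 field from the earlier MARC slides. The punctuation list `ISBD_TRAILING`, the helper `carries_markup`, and the flat dictionary shape of the "data-centric" record are all assumptions made for this sketch, not an established rule set.

```python
# Trailing punctuation that ISBD-style cataloguing embeds in values.
# This short tuple is illustrative, not a complete list.
ISBD_TRAILING = (" :", " /", " ;", ",")

# Subfields of the 245 field, as they appear in the record.
title_245 = [
    ("a", "Make the team."),
    ("p", "Soccer :"),
    ("b", "a heads up guide to super soccer! /"),
    ("c", "Richard J. Brenner."),
]


def carries_markup(subfields):
    """Return the codes of subfields whose value ends in display punctuation."""
    return [code for code, value in subfields if value.endswith(ISBD_TRAILING)]


# Punctuation matters: these values only make sense when followed by the next one.
punctuated = carries_markup(title_245)

# Order matters: reversing the subfields produces a different record.
order_matters = list(reversed(title_245)) != title_245

# A data-centric record: key order carries no meaning, values carry no markup.
a = {"title": "Make the team", "creator": "Brenner, Richard J."}
b = {"creator": "Brenner, Richard J.", "title": "Make the team"}
order_free = a == b

print(punctuated, order_matters, order_free)
```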

Page 24: Code4Lib North 2013: Metadata

Resource Description Framework

Page 25: Code4Lib North 2013: Metadata

RDF

• A framework, NOT a “metadata format”

• For combining schemas, encodings & serializations

• Multiple schemas in a single resource (“record”)
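A sketch of the last bullet, without any RDF library: statements about one resource are written as (subject, predicate, object) triples, with predicates drawn from two different vocabularies (DCTERMS and FOAF). The `example.org` identifiers and the helper `to_ntriples` are hypothetical, and the hand-rolled N-Triples output skips the escaping a real serializer would need.

```python
# Namespace URIs of the two vocabularies being mixed.
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical identifiers, invented for this example.
book = "http://example.org/book/89048230"
author = "http://example.org/person/brenner"

# Objects are pre-formatted: quoted literals or <IRI> references.
triples = [
    (book, DCTERMS + "title", '"Make the team"'),
    (book, DCTERMS + "creator", f"<{author}>"),
    (author, FOAF + "name", '"Richard J. Brenner"'),
]


def to_ntriples(triples):
    """Write each triple as one N-Triples-style line (no literal escaping)."""
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in triples)


# Which vocabularies does this one description draw on?
vocabularies = {p.rsplit("/", 1)[0] for _s, p, _o in triples}

print(to_ntriples(triples))
print(vocabularies)
```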

Page 26: Code4Lib North 2013: Metadata

RDA Example

SERIALIZATION RDF-XML ✓

ENCODING UTF-8 ✓

SCHEMA RDA ✓

Page 27: Code4Lib North 2013: Metadata

FOAF Example

SERIALIZATION N3 ✓

ENCODING UTF-8 ✓

SCHEMA FOAF ✓

Page 28: Code4Lib North 2013: Metadata

DC+MODS Example

SERIALIZATION JSON-LD ✓

ENCODING UTF-8 ✓

SCHEMA DCTERMS, MODS RDF ✓
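A hedged sketch of what such a record might look like. The `@id`, the MODS RDF namespace URI, and the `mods:edition` property are assumptions made for illustration and should be checked against the MODS RDF ontology; the `expand` helper only resolves prefixes and is not a real JSON-LD processor.

```python
import json

doc = {
    "@context": {
        "dcterms": "http://purl.org/dc/terms/",
        # Assumed namespace for the MODS RDF ontology.
        "mods": "http://www.loc.gov/mods/rdf/v1#",
    },
    "@id": "http://example.org/book/89048230",  # hypothetical identifier
    "dcterms:title": "Make the team",
    "dcterms:creator": "Brenner, Richard J.",
    "mods:edition": "1st ed.",  # assumed property name
}


def expand(term: str, context: dict) -> str:
    """Expand a prefix:local term to a full IRI using the context."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


# Schema: two vocabularies side by side in one resource.
expanded = {
    expand(key, doc["@context"]): value
    for key, value in doc.items()
    if not key.startswith("@")
}

# Serialization (JSON-LD text) and encoding (UTF-8 bytes) are separate steps.
serialized = json.dumps(doc, ensure_ascii=False)
encoded = serialized.encode("utf-8")

print(expanded)
```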

Page 29: Code4Lib North 2013: Metadata

SCHEMA + ENCODING + SERIALIZATION
