WW2 underground newspapers on Wikipedia using DBpedia, 12-02-2016, The Hague

Dutch WW2 underground newspapers on Wikipedia

6th International DBpedia Community Meeting, 12-02-2016, The Hague

Olaf Janssen, Koninklijke Bibliotheek

olaf.janssen@kb.nl - @ookgezellig - slideshare.net/OlafJanssenNL CC-BY-SA

http://www.4en5meiamsterdam.nl/attachment/47454

During WW2, approximately 1,300 Dutch underground newspapers were issued

In every shape & form…

http://www.4en5meiamsterdam.nl/attachment/47454

http://resolver.kb.nl/resolve?urn=ddd:010436323

http://resolver.kb.nl/resolve?urn=ddd:010442948

http://resolver.kb.nl/resolve?urn=ddd:010447825

http://resolver.kb.nl/resolve?urn=ddd:010450508

From well-known big titles

(among others Parool, Vrij Nederland, Trouw, de Waarheid)

After the war, many titles were

1) (physically) preserved at the NIOD …

https://commons.wikimedia.org/wiki/File:Verzetskrant_in_archiefdozen_bij_het_NIOD.jpg – CC-BY-SA - OlafJanssen

The national Institute for War, Holocaust and Genocide Studies in Amsterdam

http://opac-gonext.oclc.org:8180/DB=8/XMLPRS=Y/PPN?PPN=107123223

.. were 2) described in formal library catalogues

Bibliographic metadata

.. were 3) digitized in Delpher …

The Dutch national aggregator for historic full-text newspapers, books and magazines

http://resolver.kb.nl/resolve?urn=ddd:010424553:mpeg21:p001

• Scans • Full-text OCR

.. and were 4) contextualized & interlinked

1 by 1 in a book

Context

Relations (semantics, linked data):

• Newspaper - Placename

• Newspaper - Persons

• Newspaper - Other newspapers

This book has been OCRed into PDF (CC-BY-SA)

http://www.niod.nl/nl/de-ondergrondse-pers-1940-1945 (PDF)

Available online (PDF, flat file)

Open license (CC-BY-SA)

Converted into structured, linked data

Linked to KB-catalogue (metadata) and Delpher (full-text)

Linked to other sources (DBpedia, VIAF, Gemeentegeschiedenis.nl, Nationaal Archief)

This book has been OCRed into PDF (CC-BY-SA)

http://www.niod.nl/nl/de-ondergrondse-pers-1940-1945 (PDF)

Available online (PDF, flat file)

Open license (CC-BY-SA)

Convert PDF into structured, linked data

Link titles to KB-catalogue (metadata) and Delpher (full-text)

Link titles, people and places to external sources (VIAF, Gemeentegeschiedenis.nl, Nationaal Archief, Biografisch Portaal)
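
As an illustration of the "convert PDF into structured data" step, here is a minimal Python sketch that turns flat text lines into records. The line layout (`number. title | place | start-end`), the `Entry` fields and the sample values are all hypothetical and chosen only for the example; the actual layout of the OCRed book is not described in these slides and would need its own parsing rules.

```python
import re
from dataclasses import dataclass

# Hypothetical flat-text layout, NOT the real layout of the book:
#   "<number>. <title> | <place> | <start year>-<end year>"
ENTRY_PATTERN = re.compile(
    r"^(?P<number>\d+)\.\s+(?P<title>[^|]+?)\s*\|\s*"
    r"(?P<place>[^|]+?)\s*\|\s*(?P<start>\d{4})-(?P<end>\d{4})\s*$"
)


@dataclass
class Entry:
    """One newspaper title as a structured record (illustrative fields)."""
    number: int
    title: str
    place: str
    start_year: int
    end_year: int


def parse_entries(text):
    """Return an Entry for every line that matches the assumed layout."""
    entries = []
    for line in text.splitlines():
        m = ENTRY_PATTERN.match(line.strip())
        if m is None:
            continue  # running text or OCR noise: skipped in this sketch
        entries.append(
            Entry(int(m["number"]), m["title"], m["place"],
                  int(m["start"]), int(m["end"]))
        )
    return entries


# Invented sample input, for demonstration only.
SAMPLE = """\
1. Example Title A | Exampletown | 1941-1944
some running text that belongs to the body of an entry
2. Example Title B | Sampleville | 1943-1945
"""

entries = parse_entries(SAMPLE)
```

Records like these could then be given identifiers and linked to catalogue entries and external sources.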

So:

a lot of information is available about these WW2 underground newspapers

(and the related persons & places) …

... but the chunks of data are (largely)

unconnected!

http://2.bp.blogspot.com/_BWzuYwiS6-I/TMgeRsFd3mI/AAAAAAAAElw/3cvgbZSPWcs/s1600/doctor+macro+judy+scared.jpg

... making discovery, understanding & research

for many people harder than necessary.

https://nl.wikipedia.org/wiki/Categorie:Illegale_pers_in_de_Tweede_Wereldoorlog

Today, only 14 of these 1,300 newspapers are described on the Dutch Wikipedia (WP:NL)

The Wikiproject Verzetskranten will change this!

Systematically and uniformly describe & interlink all 1,300 Dutch underground newspapers from WW2 on Wikipedia

tinyurl.com/verzetskranten

Automatically makes data available for open reuse projects

Wikidata -- DBpedia -- Dataviz

From 14 to 1,300 titles
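
Once articles exist on Wikipedia, their data can in principle be queried through DBpedia. The Python sketch below only builds a SPARQL query and a request URL for the articles in the Dutch Wikipedia category "Illegale pers in de Tweede Wereldoorlog"; it does not send the request. The endpoint address, the resource URI pattern and the use of `dct:subject` for category membership are assumptions about the Dutch DBpedia chapter, not details taken from this presentation.

```python
from urllib.parse import urlencode

# Assumptions: endpoint address and resource URI prefix of the Dutch DBpedia.
ENDPOINT = "http://nl.dbpedia.org/sparql"
RESOURCE_PREFIX = "http://nl.dbpedia.org/resource/"

# Category name as it appears in the Dutch Wikipedia category URL.
CATEGORY = "Categorie:Illegale_pers_in_de_Tweede_Wereldoorlog"


def build_category_query(category, limit=2000):
    """Build a SPARQL query listing resources filed under a category."""
    return (
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "SELECT ?article WHERE {\n"
        f"  ?article dct:subject <{RESOURCE_PREFIX}{category}> .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(endpoint, query):
    """Encode the query as HTTP GET parameters for a SPARQL endpoint."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)


query = build_category_query(CATEGORY)
url = build_request_url(ENDPOINT, query)
```

Fetching `url` with any HTTP client would be the next step; how quickly new articles show up in the results depends on how often DBpedia is refreshed from Wikipedia.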

Global approach

1. Make central LOD-database

2. Build article template

3. Generate WP-article stubs -- using 1. and 2.

4. Involve WP-community to expand stubs into full WP-articles

5. Make dataset available for open reuse Wikidata -- DBpedia -- Dataviz -- et al.

First time data about underground newspapers is systematically collected and linked online!

LOD-database for underground newspapers

Convert PDF into structured, linked data

• RDF-triplestore (Virtuoso, SPARQL, Bibframe)

Link titles to KB-catalogue (metadata) & Delpher (full-text)

• Using PPNs (unique IDs for publications in NL)

Link titles, people and places to external sources

• DBpedia • Wikipedia • VIAF • Gemeentegeschiedenis.nl • Nationaal Archief • Biografisch Portaal
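
To make the idea of a linked-data store concrete, here is a small Python sketch that keeps subject-predicate-object triples in a list and follows links from a title to external sources. It is an in-memory stand-in for illustration only, not the Virtuoso triplestore named above. Every URI, predicate name (`ex:title`, `ex:ppn`, `ex:sameAs`, ...) and value, including the PPN, is a hypothetical placeholder.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace and data, invented for this example.
BASE = "http://example.org/verzetskranten/"

triples = [
    Triple(BASE + "title/1", "ex:title", "Example Title A"),
    Triple(BASE + "title/1", "ex:ppn", "000000000"),  # placeholder PPN
    Triple(BASE + "title/1", "ex:placeOfPublication", BASE + "place/exampletown"),
    Triple(BASE + "title/1", "ex:editor", BASE + "person/1"),
    Triple(BASE + "place/exampletown", "ex:sameAs", "http://example.org/external/place/123"),
    Triple(BASE + "person/1", "ex:sameAs", "http://example.org/external/person/456"),
]


def match(store: List[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return triples matching the given pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def external_links_for(store: List[Triple], title_uri: str) -> List[str]:
    """Follow title -> related place or person -> ex:sameAs link."""
    links = []
    for t in match(store, subject=title_uri):
        for link in match(store, subject=t.obj, predicate="ex:sameAs"):
            links.append(link.obj)
    return links


links = external_links_for(triples, BASE + "title/1")
```

In a real triplestore the same two-step lookup would be written as a SPARQL graph pattern rather than nested loops.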

http://www.4en5meiamsterdam.nl/attachment/47454

So we have a LOD-database with data about 1,300 underground newspapers

Using an article template we can generate 1,300 uniform and interlinked WP-stubs

https://c1.staticflickr.com/9/8281/7699231918_11a7356c38_b.jpg

LOD-db + article template = article stub

https://nl.wikipedia.org/wiki/De_Geus_onder_studenten_(verzetsblad)

Grey = • From database • Predefined fixed strings

All that WP-writers need to add manually to create a full article
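
The "LOD-db + article template = article stub" idea can be sketched in a few lines of Python with the standard-library `string.Template`. The template wording, the field names and the record values below are hypothetical; the project's actual template is not reproduced in these slides, and real stubs for WP:NL would be written in Dutch. Only the category name is taken from the Dutch Wikipedia category mentioned earlier.

```python
from string import Template

# Hypothetical stub template: fixed strings plus $placeholders
# that are filled in from a database record.
STUB_TEMPLATE = Template(
    "'''$title''' was an underground newspaper published in $place "
    "during the Second World War, from $start_year to $end_year.\n\n"
    "[[Categorie:Illegale pers in de Tweede Wereldoorlog]]\n"
)


def render_stub(record):
    """Fill the template from one record.

    Template.substitute raises KeyError when a field is missing,
    so incomplete records are noticed instead of silently published.
    """
    return STUB_TEMPLATE.substitute(record)


# Invented record, standing in for one row from the LOD-database.
record = {
    "title": "Example Title A",
    "place": "Exampletown",
    "start_year": 1941,
    "end_year": 1944,
}

stub = render_stub(record)
```

Looping `render_stub` over all records would give one uniform stub per title, leaving only the free-text parts for Wikipedia writers to add.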

Current status

Global approach

1. Make central LOD-database

2. Build article template

3. Generate WP-article stubs

4. Involve WP-community to expand stubs into full WP-articles

This month

March onwards

http://upload.wikimedia.org/wikipedia/commons/1/12/Planning_tank_operations,_Siege_of_Tobruk_cph.3b18203.jpg

Questions?

olaf.janssen@kb.nl - @ookgezellig

tinyurl.com/verzetskranten