Bringing Digital Humanities to the wider public: Libraries as incubators for DH Research Results

dr. Martijn Kleppe, Head of Research Department

[email protected] | @martijnkleppe | www.kb.nl/martijnkleppe



What is the National Library of the Netherlands?


7 million items

115 kilometers of materials


Full text (OCR) access to:

467,000 books (1486 – 2013)

15 million newspaper pages (1618 – 1995)

4.4 million magazine pages (1840 – 1940)

1.5 million ANP radio bulletins (1937 – 1984)

https://www.delpher.nl/


www.delpher.nl www.kb.nl/dataservices


http://lab.kb.nl/


https://www.onlinebibliotheek.nl/


https://www.onlinebibliotheek.nl/e-books.html


What does the Research Department do?


We’re curious

We learn

We experiment

We collaborate


1. INFORMATION SOCIETY

2. PUBLICATIONS

3. ACCESS & SHARING

4. CUSTOMERS

5. IMPACT

www.kb.nl/researchagenda


www.polimedia.nl


“Putting TDM in the Mainstream”, i.e. search portals for a bigger audience

http://dh.library.yale.edu/projects/vogue/

https://www.youtube.com/watch?v=yHi4TD4YfGQ

https://twitter.com/sclaeyssens/status/748047246722228228


https://www.jstor.org/analyze/analyzer

https://www.slideshare.net/AlexHumphreys1/the-case-for-applied-digital-humanities-in-scholarly-communications

https://www.jstor.org/analyze/about

“But in a sense, what we do is: Applied Digital Humanities”


https://www.bbc.co.uk/rd/blog/2018-09-artificial-intelligence-archive-made-machine

https://www.bbc.co.uk/rd/projects/ai-production


http://mediasuite.clariah.nl


We’re curious

We learn

We experiment

We collaborate


Collaboration with libraries

https://libereurope.eu/strategy/digital-skills-services/digitalhumanities/


Collaboration with heritage institutes

https://pro.europeana.eu/network-association/special-interest-groups/europeanatech

https://www.netwerkdigitaalerfgoed.nl/en/


Collaboration with Research infrastructures

https://www.clariah.nl/

http://www.odissei-data.nl/en

https://www.clarin.eu/

https://www.dariah.eu/

https://timemachine.eu/


Collaboration with researchers, who are actually our customers

https://www.kb.nl/en/organisation/research-expertise/projects

https://www.kb.nl/en/organisation/research-expertise/researcher-in-residence


Example #1


http://kbkranten.politicalmashup.nl/

http://lab.kb.nl/tool/newspaper-ngram-viewer


Example #2


https://blog.prototypr.io/behind-the-magic-how-we-built-the-arkit-sudoku-solver-e586e5b685b0


1918

http://lab.kb.nl/tool/chronreader


“De aankomst van het Koninklijk Paar voor het paleis in Amsterdam”

“Arrival of the Royal Couple at the palace in Amsterdam”


https://www.youtube.com/watch?v=PldvKPTPlz4&feature=youtu.be

https://zenodo.org/record/843504

Juliette Lonij

Willem Jan Faber

Theo van Veen


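Components of the enrichment infrastructure

[Diagram; the labels that survive extraction are listed below.]

• Scripts: Index_newspapers.py, Dac.py, Index_please.py
• Indexes and stores: Solr (enriched newspapers), Solr (candidates), newspaper index, MongoDB, Virtuoso
• Sources and interfaces: OAI (articles), SRU, DBpedia/Wikidata
• Techniques: named entity recognition, Word2vec, topics
• Initial population of the Solr candidates index
• Training of the model: training set (features + labels), TensorFlow model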

https://zenodo.org/record/843504


Continuous improvement of enrichment algorithm

[Chart: quality / confidence (%), axis ticks 70, 80, 90, plotted against article number / time, from 1 to 108 million]

• All DBpedia titles searched in news articles
• Named Entities searched in DBpedia
• Speedup by using HPC cloud SURFsara
• Using context and machine learning

At the end, cycle back to the first article and overwrite earlier enrichments with the newest algorithm.
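The closing note on this slide, cycling back to the first article and overwriting earlier enrichments with the newest algorithm, can be sketched roughly as follows. This is a toy illustration, not the KB implementation: the `Enrichment` record, the `make_title_linker` and `enrichment_pass` helpers, the version numbers and the two sample articles are all invented here, and the "algorithm" is a simple title match standing in for the real linkers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical linker type: takes article text, returns linked entity titles.
Linker = Callable[[str], List[str]]


@dataclass
class Enrichment:
    algorithm_version: int  # which algorithm generation produced the links
    links: List[str]


def make_title_linker(titles: List[str]) -> Linker:
    """Toy linker: case-insensitive substring match against known titles."""
    def link(text: str) -> List[str]:
        lowered = text.lower()
        return [title for title in titles if title.lower() in lowered]
    return link


def enrichment_pass(articles: Dict[int, str],
                    store: Dict[int, Enrichment],
                    version: int,
                    linker: Linker) -> None:
    """Walk all articles in order and overwrite any earlier enrichment."""
    for article_id in sorted(articles):
        store[article_id] = Enrichment(version, linker(articles[article_id]))


# Invented sample data, not taken from the KB collection.
articles = {
    1: "The Royal Couple arrived at the palace in Amsterdam.",
    2: "The harbour of Rotterdam reopened.",
}
store: Dict[int, Enrichment] = {}

# First cycle with an early algorithm, second cycle with an improved one.
enrichment_pass(articles, store, 1, make_title_linker(["Amsterdam"]))
enrichment_pass(articles, store, 2, make_title_linker(["Amsterdam", "Rotterdam"]))

for article_id, enrichment in sorted(store.items()):
    print(article_id, enrichment.algorithm_version, enrichment.links)
```

Storing the algorithm version next to each enrichment is one possible way to see which articles still carry results from an older generation.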


From conventional entity linking to deep learning and beyond

Algorithm                                  Accuracy  Link recall  Link precision  Link F-measure
Rule based                                 .76       .76          .65             .70
Machine learning (SVM)                     .84       .76          .83             .79
Neural network                             .84       .73          .87             .79
Extra features, e.g. word embedding        .85       .81          .82             .82
Extra Wikidata data, more training data    .87       .81          .86             .84
Entity embedding                           .88       .86          .85             .85
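The link F-measure figures can be cross-checked against the link recall and link precision figures. A minimal sketch, assuming the slide reports the balanced F1 score (the harmonic mean of precision and recall), which the slide itself does not state; the `f1_score` helper and the `rows` dictionary are names chosen for this illustration:

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (link recall, link precision, reported link F-measure) from the slide
rows = {
    "Rule based": (0.76, 0.65, 0.70),
    "Machine learning (SVM)": (0.76, 0.83, 0.79),
    "Neural network": (0.73, 0.87, 0.79),
    "Extra features, e.g. word embedding": (0.81, 0.82, 0.82),
    "Extra Wikidata data, more training data": (0.81, 0.86, 0.84),
    "Entity embedding": (0.86, 0.85, 0.85),
}

for name, (recall, precision, reported) in rows.items():
    computed = f1_score(precision, recall)
    print(f"{name:40s} computed={computed:.3f} reported={reported:.2f}")
```

Recomputed values can differ from the reported ones in the last digit, because the recall and precision shown on the slide are themselves rounded.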


“Putting TDM in the Mainstream”, i.e. search portals for a bigger audience

“But in a sense, what we do is: Applied Digital Humanities”

“Yes! But... We’re not there yet…”


We’re curious

We learn

We experiment

We collaborate


https://www.nwo.nl/en/news-and-events/news/2018/09/nwo-seeks-talented-researchers-for-challenging-ict-case-studies.html


http://lab.kb.nl/about-us/team

http://lab.kb.nl/about-us/affiliated-researchers

Lotte Wilms, Juliette Lonij, Willem Jan Faber

Steven Claeyssens, Theo van Veen, Thomas Smits


Questions?

Bringing Digital Humanities to the wider public:

Libraries as incubators for DH Research Results

dr. Martijn Kleppe – Head of Research Department

[email protected] | @martijnkleppe | www.kb.nl/researchagenda