Transcript
Page 1: Europeana Newspapers German infoday - Digitale Zeitungsarchive als Quellen (digitaler) Geschichtsforschung

Digital Newspaper Archives as Sources for (Digital) Historical Research

Dr. Pim Huijnen

Utrecht University

[email protected]

Berlin, 28.02.2014

Page 2:

www.translantis.nl

Page 3:

Translantis

Digital Humanities Approaches to Reference Cultures; The Emergence of the United States in Public Discourse in the Netherlands, 1890-1990

“…uses digital technologies to analyze the role of reference cultures in debates about social issues and collective identities, looking specifically at the emergence of the United States in public discourse in the Netherlands from the end of the nineteenth century to the end of the Cold War.”

Page 4:

The United States as a reference culture

Business

Society

Consumption

Media

Crime

Health

Page 5:

Americanization

Page 6:

Business/economy: Americanization

1870-1914 - 1918-1940 - 1945-1989

Fordism Taylorism

Professionalization Managerism

Productivity

Rationalisation Efficiency

Standardization Mass production

Mass market Consumer society

Credit

Consultancy Accountancy

Page 7:

Rejection, appropriation, entanglement

Page 8:

Leeuwarder Courant, 27 October 1950

The USA as a reference culture

Page 9:

[Chart: United States in Dutch news media. Vertical axis: 0 to 8,000; horizontal axis: 1850 to 1943; series: ‘Vereenigde Staten’ and ‘Amerikaansch’]

Page 10:

Text mining for historical research

National Library of the Netherlands (The Hague): ~9,000,000 digitized pages from Dutch news media, 1618-1995

Opportunities for comparative and transnational historical research

(esp. history of mentalities / history of ideas)

Development of a digital text mining tool


Page 11:

Digital research on public debates

Page 12:

Servers needed for storage (500 GB of data)

Computers needed for computational processing (memory)

Sustainability needed in storage and file formats (at least 5 years, but preferably indefinitely)

Management needed (manpower)

Programming skills needed

Big Data?

Page 13:

The change of scale has led to a change of state. The quantitative change has led to a qualitative one. […]

[B]ig data refers to things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value

Viktor Mayer-Schönberger and Kenneth Cukier, Big Data: A Revolution That Will Transform How We Live, Work, and Think (Boston 2013) 13.

Big Data!

Page 14:

“Letting the data speak”

Page 15:

Top-down vs. bottom-up

Bob Nicholson, ‘The Digital Turn’, Media History 19 (2013) 59-73.

Page 16:

Query: ‘Standard oil’ <1900 (1030 hits)

Page 17:

Word cloud

Word cloud ‘manager’ 1910-1920

(3437 hits)

Word cloud ‘manager’ 1945-1950

(1173 hits)
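
A word cloud of this kind rests on counting which words appear alongside the query term in the hits. The following is a minimal sketch of that counting step, assuming the hits are available as plain-text snippets; the function name and the sample snippets are invented for illustration and do not come from the newspaper corpus or from the tools used in the talk.

```python
import re
from collections import Counter

def word_frequencies(snippets, query, top_n=5):
    """Count the words that co-occur with a query term across text snippets."""
    counts = Counter()
    for text in snippets:
        tokens = re.findall(r"\w+", text.lower())
        if query in tokens:
            # Count every other word in a snippet that contains the query.
            counts.update(t for t in tokens if t != query)
    return counts.most_common(top_n)

# Invented snippets, for illustration only.
snippets = [
    "The manager of the factory praised the new factory line",
    "A theatre manager announced the new season",
    "Weather report for the coast",
]
print(word_frequencies(snippets, "manager", top_n=3))
```

The most frequent word in such a count is usually a function word such as "the", which is why word clouds normally apply a stopword list before drawing.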

Page 18:

Voyant word cloud ‘efficiency’ 1945-1960 (46040 hits)

Page 19:

Voyant word cloud ‘efficiëntie’ 1945-1990 (2861 hits)

Page 20:
Page 21:

Histogram

Query: ‘consultancy’ (2167 hits)

Page 22:

Histogram (SPSS)

Query: ‘manager’ (191,710 hits)
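
A histogram like this one boils down to counting hits per publication year. As a rough sketch, assuming each hit carries a year (the talk used SPSS; the function names and the sample years below are hypothetical), the counting and a plain-text rendering could look like this:

```python
from collections import Counter

def hits_per_year(hit_years):
    """Return {year: number of hits}, sorted by year."""
    return dict(sorted(Counter(hit_years).items()))

def text_histogram(counts):
    """Render one line per year with one '#' per hit."""
    return "\n".join(f"{year} {'#' * n}" for year, n in counts.items())

# Invented publication years of hits, for illustration only.
hit_years = [1950, 1950, 1951, 1953, 1953, 1953]
counts = hits_per_year(hit_years)
print(text_histogram(counts))
```

Years without any hit (1952 in the sample) are simply absent from the output, so a real plot would need to fill them in explicitly.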

Page 23:

BILAND

Query: ‘Heredity’ (1876) (22/1465 hits)

Page 24:

BILAND

Query: ‘Heredity’ (1935) (1465 hits)

Page 25:

BILAND

Query: ‘Hygiene’ (87/41 hits)

Page 26:

‘Typically American’

Page 27:

Topic modeling

Page 28:

SPSS

Page 29:

Translantis

Query: ‘manager’ (191,710 hits)

Page 30:

Translantis

Query: ‘manager’ in advertisements (82,695 hits)

Page 31:
Page 32:

Empirical cycle

query

quantitative analysis

qualitative analysis

insight

Page 33:

Digital research on public debates

No limitation of source material

No selection issues

No representativeness issues

Enabling research on hidden debates, mentalities, implicit notions

Reproducibility of research, from various perspectives

Page 34:

Source criticism: data

representativeness

internal coherence

(OCR) quality
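
One way to make the representativeness concern concrete is to relate hit counts to the amount of digitized material per year, so that a rise in hits is not mistaken for a rise in attention when the corpus itself simply grew. The sketch below is an illustration under stated assumptions, not a method from the talk: the function name, the per-1,000-pages rate, and all figures are invented.

```python
def hits_per_thousand_pages(hits_by_year, pages_by_year):
    """Relate raw hit counts to the number of digitized pages per year."""
    rates = {}
    for year, hits in hits_by_year.items():
        pages = pages_by_year.get(year, 0)
        # None marks a gap in the digitized record, not an absence of debate.
        rates[year] = 1000 * hits / pages if pages else None
    return rates

# Invented figures, for illustration only.
hits_by_year = {1950: 20, 1951: 20, 1952: 0}
pages_by_year = {1950: 4000, 1951: 1000}  # nothing digitized for 1952
print(hits_per_thousand_pages(hits_by_year, pages_by_year))
```

In the sample, 1950 and 1951 have the same raw count but very different rates, and 1952 is reported as unknown rather than as zero.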

Page 35:

[Chart: United States in Dutch news media. Vertical axis: 0 to 8,000; horizontal axis: 1850 to 1943; series: ‘Vereenigde Staten’ and ‘Amerikaansch’]

Page 36:

Representative?

Page 37:

Representative?

Libraries, archives, museums and other collection institutions have now been digitising corpora of material for many years, but with a very few exceptions, it is still quite rare for an entire run of primary sources to be digitised and made available online. This means that there are gaps within the digital record. Yet it is unusual for online resources to actively demonstrate these gaps; resources may be advertised as a growing corpus, but when searching through or downloading a digital resource there is rarely any indication of what has not been digitised. This skews the sense of the nature of the collection the scholar is working with and erodes trust.

Abstract submitted to DH2014 by Alastair Dunning (The European Library) and Clemens Neudecker (KB National Library of the Netherlands).

See: http://availableonline.wordpress.com/

Page 38:

Source criticism: comparison

Page 39:

Source criticism: Press history

[O]ne of the biggest challenges facing press historians will be to ensure that the historical agency and complex materiality of newspapers are not forgotten in a rush to mine their contents.

Bob Nicholson, ‘The Digital Turn’, Media History 19 (2013) 59-73, on p. 67

Page 40:

Source criticism (interpretation)

Newspapers = public debate?

What newspapers write = what public thinks?

How to interpret results?

What are stopwords? (“staat”)

Mining for meaning?
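
The stopword question can be illustrated with the Dutch word “staat”, which is both a common verb form (“stands”) and the noun “state”. The following is a small sketch, assuming a hand-made stopword list; the list, the function name, and the two sentences are invented for illustration. It shows that filtering “staat” as a stopword also discards the noun.

```python
import re

# Hypothetical stopword list; real lists are much longer.
STOPWORDS = {"de", "het", "een", "in", "van", "staat"}

def content_words(text, stopwords=STOPWORDS):
    """Lowercase, tokenize, and drop every token found in the stopword list."""
    return [t for t in re.findall(r"\w+", text.lower()) if t not in stopwords]

verb_use = "Het staat in de krant"  # "staat" as a verb: "It is in the newspaper"
noun_use = "De staat New York"      # "staat" as a noun: "The state of New York"

print(content_words(verb_use))
print(content_words(noun_use))
```

A plain word list cannot tell the two uses apart, so the decision whether “staat” counts as a stopword depends on the research question, for example whether the state as a political entity is part of what is being studied.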

