Europeana Newspapers Project: "Distant Reading: Historic Newspapers in the Digital Age". National Library, Warsaw, Poland, January 16, 2014. Ulrike Kölsch, Project Coordinator - Berlin State Library

Europeana Newspapers Polish Information Day

Page 1: Europeana Newspapers Polish Information Day

Europeana Newspapers Project

"Distant Reading: Historic Newspapers in the Digital Age"

National Library, Warsaw, Poland, January 16, 2014

Ulrike Kölsch, Project Coordinator - Berlin State Library

Page 2: Europeana Newspapers Polish Information Day

Europeana Newspapers, 16 January 2014 – Warsaw – Morning Edition

Page 3: Europeana Newspapers Polish Information Day

This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the Competitiveness and Innovation Framework Programme by the European Community http://ec.europa.eu/ict_psp

Europeana Newspapers Project

On 15 April 1912, the passenger ship Titanic, carrying over 2,000 passengers and crew, sank after striking an iceberg on its maiden voyage from Southampton to New York.

Page 4: Europeana Newspapers Polish Information Day

Responses to the Titanic Disaster

Page 5: Europeana Newspapers Polish Information Day

Responses to the Titanic Disaster

Page 6: Europeana Newspapers Polish Information Day

Responses to the Titanic Disaster

Page 7: Europeana Newspapers Polish Information Day

Responses to the Titanic Disaster

Page 8: Europeana Newspapers Polish Information Day

Responses to the Titanic Disaster

Page 9: Europeana Newspapers Polish Information Day

Responses to the Titanic Disaster

Page 10: Europeana Newspapers Polish Information Day

News travels at different speeds, with importance that diminishes at different rates. This is as true now as it was in 1912 (though the web changes things …).
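One way to make this observation measurable is to count, per country and day, how many digitised pages mention a term. The following is only a minimal sketch under stated assumptions: the records, field names (`date`, `country`, `text`) and function names are invented for illustration and are not the project's data model or tooling.

```python
from collections import Counter
from datetime import date

# Invented sample records for illustration only; the field names are
# assumptions, not the project's actual schema.
PAGES = [
    {"date": date(1912, 4, 16), "country": "AT", "text": "Loss of the Titanic feared"},
    {"date": date(1912, 4, 17), "country": "AT", "text": "TITANIC: further reports"},
    {"date": date(1912, 4, 17), "country": "PL", "text": "The Titanic disaster"},
    {"date": date(1912, 4, 17), "country": "PL", "text": "Local market prices"},
    {"date": date(1912, 4, 18), "country": "PL", "text": "Titanic inquiry announced"},
]


def mentions_per_day(pages, term):
    """Count pages whose text contains `term` (case-insensitive),
    keyed by (country, date)."""
    needle = term.casefold()
    counts = Counter()
    for page in pages:
        if needle in page["text"].casefold():
            counts[(page["country"], page["date"])] += 1
    return counts


def first_mention(counts):
    """Return the earliest date with at least one mention, per country."""
    first = {}
    for country, day in counts:
        if country not in first or day < first[country]:
            first[country] = day
    return first


if __name__ == "__main__":
    counts = mentions_per_day(PAGES, "titanic")
    for country, day in sorted(first_mention(counts).items()):
        print(country, day.isoformat())
```

Plain substring matching is used here for brevity; real OCR text would likely need tolerance for recognition errors and for spelling variants across languages.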

Page 11: Europeana Newspapers Polish Information Day

The Europeana Newspapers Project is making this kind of investigation easier, in several ways

Page 12: Europeana Newspapers Polish Information Day

1. By creating full text for 8m pages

2. By undertaking article segmentation for 2m pages

3. By undertaking named entity extraction for 2m pages

4. By developing a cross-searchable newspapers browser at The European Library (with metadata forwarded to Europeana)

Page 13: Europeana Newspapers Polish Information Day

Europeana Newspapers Project

A Best Practice Network that aims to aggregate 18 million digitised historic newspaper pages from 12 European libraries, drastically improving search and retrieval.

Volume

Cross European cultures

Sharing best practices

Improving availability

Improving accessibility

Page 14: Europeana Newspapers Polish Information Day

The challenges …

Newspapers were not meant to be preserved …

- frail and crumbly paper

- missing editions

- incomplete supplements

- poorly bound

- fading ink

- different fonts

- legal uncertainties with contemporary material

Page 15: Europeana Newspapers Polish Information Day

Who

12 content providers

2 networking partners

Blue – Providing Content

Yellow – Providing Technical Services

Green – Associate Partners

Page 16: Europeana Newspapers Polish Information Day

Who

12 content providers

2 networking partners

4 technology providers

1 aggregator

Page 17: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser I

Creating a newspapers interface that ...

Provides unique value to users

Reflects relationship to original physical newspaper collections

Is sustainable

Offers contributors added value

Defines relationship to Europeana

Respects library wishes

Page 18: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser II

What content will be included?

Full Images, Full Text, Metadata

Latvia, Belgrade, Germany (Hamburg, Berlin), Estonia, Finland, Netherlands, Austria

Snippets of Images, Full Text, Metadata

Italy, France, Poland

Page 19: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser III

First Iteration

- Basic text search

- Filtering of results by date, country, newspaper, language, library

- OCR shown

- Zoomable version of full image

- Clickable links between full text and image (sometimes)

- Link to newspaper source library
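The first-iteration search behaviour (a text query, then filtering by date, country, newspaper, language and library) can be sketched roughly as follows. This is an illustration only, not the browser's implementation: the `Page` record, the `search` function, its parameters and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Page:
    """Hypothetical record for one OCR'd newspaper page."""
    newspaper: str
    library: str
    country: str
    language: str
    published: date
    ocr_text: str


def search(
    pages: Iterable[Page],
    query: str,
    *,
    country: Optional[str] = None,
    language: Optional[str] = None,
    newspaper: Optional[str] = None,
    library: Optional[str] = None,
    date_from: Optional[date] = None,
    date_to: Optional[date] = None,
) -> List[Page]:
    """Basic case-insensitive text search; a filter set to None is ignored."""
    needle = query.casefold()
    results = []
    for page in pages:
        if needle not in page.ocr_text.casefold():
            continue
        if country is not None and page.country != country:
            continue
        if language is not None and page.language != language:
            continue
        if newspaper is not None and page.newspaper != newspaper:
            continue
        if library is not None and page.library != library:
            continue
        if date_from is not None and page.published < date_from:
            continue
        if date_to is not None and page.published > date_to:
            continue
        results.append(page)
    return results


# Invented sample pages for illustration only.
SAMPLE_PAGES = [
    Page("Example Gazette", "Library A", "Austria", "de", date(1912, 4, 17), "Titanic gesunken"),
    Page("Example Courier", "Library B", "Poland", "pl", date(1912, 4, 18), "Titanic: katastrofa"),
    Page("Example Gazette", "Library A", "Austria", "de", date(1913, 1, 2), "Neujahr in Wien"),
]

if __name__ == "__main__":
    for hit in search(SAMPLE_PAGES, "titanic", country="Poland"):
        print(hit.newspaper, hit.published.isoformat())
```

A linear scan keeps the sketch short; a collection of millions of pages would normally sit behind a full-text index, with these filters applied as facets.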

Page 20: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser IV

Page 21: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser V

Complete Newspaper image can be shown

Eesti Postimees ehk Näddalaleht, 2 November 1866

(National Library of Estonia)

Page 22: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser VI

Fragment of Newspaper image can be shown

Dziennik Śląski, 10 June 1915 (National Library of Poland)

Page 23: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser VII

Just title-level metadata can be shown:

“Kleine Blatt, 15 November 1932” (National Library of Austria)

Page 24: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser VIII

Zooming in

Page 25: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser IX

Second Iteration

- Fragments

- See information on a particular title

- See what was published on a particular day

- Search over titles (not just text)

- Other browsable visualisations of publication and library source

- Search / browse via entities

Page 26: Europeana Newspapers Polish Information Day

Challenges and Solutions in Creating a European Historic Newspapers Browser X

Who are the users?

- Historians

- Researchers

- Students

- Genealogists

- Teachers and school pupils

- Interested public

- Citizen researchers

Page 27: Europeana Newspapers Polish Information Day

Challenges for Users

“Texts are designed to ‘speak’ to us, and so, they always end up telling us something; but archives are not messages that were meant to address us, and so they say absolutely nothing until one asks the right question.”

(Franco Moretti, "Distant Reading", 2013)

Page 28: Europeana Newspapers Polish Information Day

Share best practices

Image: Australian National Maritime Museum

… via workshops and national information days

Page 29: Europeana Newspapers Polish Information Day

Network Partner Project

Europeana Collections 1914-1918 – Remembering the First World War

"Unlocking Sources – The First World War online & Europeana", 30/31 January 2014

2014 will mark the centenary of the outbreak of the First World War, which will be commemorated worldwide. In recent years a wide range of European cultural institutions, including the Staatsbibliothek zu Berlin, have digitized manuscript and print materials as well as film holdings. Books, photos, films, posters, manuscripts, and song lyrics have recently been made available online.

On 30 and 31 January 2014 the Staatsbibliothek zu Berlin will host the event “Unlocking Sources – The First World War online & Europeana” to mark the commemoration.

More information: www.unlocking-sources.eu

Page 30: Europeana Newspapers Polish Information Day

Thank you for your interest! More information on our website:

www.europeana-newspapers.eu