Europeana Newspapers Polish Information Day


Europeana Newspapers Project

“Distant Reading: Historic Newspapers in the Digital Age”

National Library, Warsaw, Poland, January 16, 2014

Ulrike Kölsch, Project Coordinator - Berlin State Library

Europeana Newspapers, 16 January 2014 – Warsaw – Morning Edition

This project is partially funded under the ICT Policy Support Programme (ICT PSP) as part of the Competitiveness and Innovation Framework Programme by the European Community http://ec.europa.eu/ict_psp


On 15th April 1912, the passenger ship Titanic, carrying over 2000 passengers and crew, crashed into an iceberg on its maiden voyage from Southampton to New York.


Responses to the Titanic Disaster


News travels at different speeds, with importance that diminishes at different rates. This is true now as it was in 1912 (though the web changes things…).


The Europeana Newspapers Project is making this kind of investigation easier in several ways:


1. By creating full text for 8 million pages
2. By undertaking article segmentation for 2 million pages
3. By undertaking named entity extraction for 2 million pages
4. By developing a cross-searchable newspapers browser at The European Library (with metadata forwarded to Europeana)


A Best Practice Network that aims to aggregate 18 million digitised historic newspaper pages from 12 European libraries, drastically improving search and retrieval possibilities.

Volume

Cross European cultures

Sharing best practices

Improving availability

Improving accessibility


The challenges…

Newspapers were not meant to be preserved…

- frail and crumbly paper
- missing editions
- incomplete supplements
- poorly bound
- fading ink
- different fonts
- legal uncertainties with contemporary material


Who

12 content providers

2 networking partners

Blue – Providing Content

Yellow – Providing Technical Services

Green – Associate Partners


4 technology providers

1 aggregator


Challenges and Solutions in Creating a European Historic Newspapers Browser I

Creating a newspapers interface that ...

Provides unique value to users

Reflects relationship to original physical newspaper collections

Is sustainable

Offers contributors added value

Defines relationship to Europeana

Respects library wishes


Challenges and Solutions in Creating a European Historic Newspapers Browser II

What content will be included?

Full Images, Full Text, Metadata

Latvia, Belgrade, Germany (Hamburg, Berlin), Estonia, Finland, Netherlands, Austria

Snippets of Images, Full Text, Metadata

Italy, France, Poland


Challenges and Solutions in Creating a European Historic Newspapers Browser III

First Iteration

- Basic text search
- Filtering of results by date, country, newspaper, language, library
- OCR shown
- Zoomable version of full image
- Clickable links between full text and image (sometimes)
- Link to newspaper source library


Challenges and Solutions in Creating a European Historic Newspapers Browser IV


Challenges and Solutions in Creating a European Historic Newspapers Browser V

Complete Newspaper image can be shown

Eesti Postimees ehk Näddalaleht, 2 November 1866

(National Library of Estonia)


Challenges and Solutions in Creating a European Historic Newspapers Browser VI

Fragment of Newspaper image can be shown

Dziennik Śląski, 10 June 1915 (National Library of Poland)


Challenges and Solutions in Creating a European Historic Newspapers Browser VII

Just title-level metadata can be shown:

“Kleine Blatt, 15 November 1932” (National Library of Austria)


Challenges and Solutions in Creating a European Historic Newspapers Browser VIII

Zooming in


Challenges and Solutions in Creating a European Historic Newspapers Browser IX

Second Iteration

- Fragments
- See information on a particular title
- See what was published on a particular day
- Search over titles (not just text)
- Other browsable visualisations of publication and library source
- Search / browse via entities


Challenges and Solutions in Creating a European Historic Newspapers Browser X

Who are the users?

- Historians

- Researchers

- Students

- Genealogists

- Teachers and school pupils

- Interested public, citizen researchers


Challenges for Users

“Texts are designed to ‘speak’ to us, and so, they always end up telling us something; but archives are not messages that were meant to address us, and so they say absolutely nothing until one asks the right question.”

(Franco Moretti, “Distant Reading”, 2013)


Share best practices

Image: Australian National Maritime Museum

… via workshops and national information days


Network Partner Project

Europeana Collections 1914-1918 – Remembering the First World War

“Unlocking Sources – The First World War online & Europeana”, 30–31 January 2014

2014 will mark the centenary of the outbreak of the First World War, which will be commemorated worldwide. In recent years a wide range of European cultural institutions, including the Staatsbibliothek zu Berlin, have digitized manuscript and print materials as well as film holdings. Books, photos, films, posters, manuscripts, and song lyrics have recently been made available online.

On 30 and 31 January 2014 the Staatsbibliothek zu Berlin will host the event “Unlocking Sources – The First World War online & Europeana” to mark the commemoration.

More information: www.unlocking-sources.eu

Thank you for your interest! More information on our website:

www.europeana-newspapers.eu
