eBooks on Demand and FEP
Andreas Parschalk, University of Innsbruck (UIBK), Library [email protected]

IMPACT Final Event 26-06-2012 - The Functional Extension Parser (FEP) and Ebooks On Demand (EOD) by Andreas Parschalk (University of Innsbruck)


Overview

EOD – the service

Overview

Libraries' workflow

End-users' view

EOD and the Functional Extension Parser

The Functional Extension Parser (FEP)

Integration into the workflow

Current status

EOD – the service

What is EOD?

Network of libraries

Digitisation on demand for copyright-free books

Started in 2006, co-funded by the EC in the eTEN programme

Delivering digitised books since 2007

EOD – the service

EOD button: digitising this book on request

Incorporation into Digital Library & Europeana

Library: scans & transfers images

Who is currently offering the service?

> 30 libraries, 12 countries

EOD libraries

Austria: University Libraries of Innsbruck, Graz and Vienna (2x), Vienna City Library

Germany: Bavarian State Library (Munich), University Libraries of Regensburg, Greifswald, Berlin (Humboldt University), Saxon State Library (Dresden), STABI Berlin

Denmark: Royal Library

Estonia: National Library, University Library of Tartu

France: Academic health library (Paris)

Hungary: National Széchényi Library of Hungary, Library of the Hungarian Academy of Science

Portugal: National Library

Slovakia: University Library of Bratislava, Slovak Academy of Sciences

Slovenia: National and University Library

Sweden: University Library of Umeå, National Library of Sweden

Switzerland: National Library of Switzerland, Library at Guisanplatz

Czech Republic: Moravian Library (Brno), Research Library in Olomouc, National Technical Library, Library of the Czech Academy

EOD – the service

What is being digitised

Only public domain books, according to the laws and regulations of the library's country

Aim: „Full informational capture“

Whole books, cover to cover

Virtually counted blank pages

Supplements (maps, tables, …) that form an integral part of the document

EOD: The Libraries' point of view

Central services used by libraries

Web application for the administration of orders and generation of eBooks

Automation of communication (automated e-mails to end-users, tracking page with status update)

OCR (optical character recognition) services: antiqua and gothic font

NEW: Structural Analysis (FEP)

Delivery of CD-ROMs (optional)

Preprint preparation for reprint orders (optional)

Reprint creation and delivery

Central management of credit card payments

Carried out locally at library sites

Scanning and uploading of material

Handling orders in Order Data Manager

Uploading to local digital repositories

Long term storage

EOD: The Libraries' point of view

Workflow for the libraries

Order arrives

Order the book in the library

Check the order details (can it be digitised, correct automatically fetched metadata)

Scan book cover to cover

Upload the images

Start eBook generation

Check results and finish the order

EOD: The Libraries' point of view

eBook generation

Configuring settings

Resolution and JPEG quality

With or without OCR

OCR settings (language, font type)

Deskew, despeckle

Start eBook generation

Create EOD cover pages

Alternatively generate eBook locally
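
The settings listed above could be gathered in a small configuration record before a generation job is started. The following is only an illustrative sketch: the class name `EbookJobSettings`, its field names, the default values and the accepted ranges are assumptions and do not describe the actual EOD web application.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass(frozen=True)
class EbookJobSettings:
    """Hypothetical container for the eBook generation settings named above."""
    resolution_dpi: int = 300        # assumed default, not an EOD value
    jpeg_quality: int = 75           # assumed 1-100 scale
    run_ocr: bool = True             # "with or without OCR"
    ocr_language: str = "German"     # OCR setting: language
    ocr_font_type: str = "antiqua"   # OCR setting: "antiqua" or "gothic"
    deskew: bool = True
    despeckle: bool = True
    create_cover_pages: bool = True  # "Create EOD cover pages"

    def validate(self) -> List[str]:
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        if self.resolution_dpi <= 0:
            problems.append("resolution_dpi must be positive")
        if not 1 <= self.jpeg_quality <= 100:
            problems.append("jpeg_quality must be between 1 and 100")
        # The font type only matters when OCR is actually requested.
        if self.run_ocr and self.ocr_font_type not in ("antiqua", "gothic"):
            problems.append("ocr_font_type must be 'antiqua' or 'gothic'")
        return problems


settings = EbookJobSettings(ocr_font_type="gothic", jpeg_quality=80)
print(settings.validate())
print(asdict(settings)["ocr_font_type"])
```

Keeping the record immutable (`frozen=True`) means the settings that were checked are the settings that get submitted.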

EOD: The Libraries' point of view

The library can download the OCR output as zipped single-page XML files and as RTF

Use in local repository (e.g. full text search)

Digitisation for the visually impaired

Possible full text correction

Conversion to other formats (e.g. ePub)

No structural information so far, although libraries have requested METS/ALTO output
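
As a rough illustration of the full text search use case, a local repository could read the zipped single-page XML files and build a simple page-level text lookup. The sketch below makes no assumption about the OCR schema: it just collects all text nodes of each page. The function names, file names and sample content are invented for the example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Dict, List


def page_texts_from_zip(zip_bytes: bytes) -> Dict[str, str]:
    """Map each XML member name in the archive to its plain text.

    The text is gathered from all text nodes, so no particular OCR
    schema is assumed.
    """
    texts = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".xml"):
                continue
            root = ET.fromstring(archive.read(name))
            # Join all text nodes and normalise whitespace.
            words = " ".join(root.itertext()).split()
            texts[name] = " ".join(words)
    return texts


def pages_containing(texts: Dict[str, str], term: str) -> List[str]:
    """Return the names of pages whose text contains `term` (case-insensitive)."""
    needle = term.lower()
    return [name for name, text in texts.items() if needle in text.lower()]


# Build a tiny in-memory archive with two made-up pages for demonstration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("0001.xml", "<page><line>Erstes Kapitel</line></page>")
    demo.writestr("0002.xml", "<page><line>Zweites</line><line>Kapitel</line></page>")

texts = page_texts_from_zip(buffer.getvalue())
print(pages_containing(texts, "zweites"))
```

A real repository would more likely hand the extracted text to its own search index; the dictionary here only stands in for that step.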

The end-users' point of view

Find the record of the book in the catalogue

Click the EOD button

Fill out the order form

1-2 weeks delivery time depending on the library

Pay online

Download and use

The end-users' point of view

The catalogue situation: diverse and dispersed

OPACs

Digitised card catalogues

Union catalogues

The EOD search engine

In addition to the EOD button in the libraries' catalogues

http://search.books2ebooks.eu

3 million records of digitisable and digitised items

19 EOD libraries have already integrated their records

eBooks on Demand and the Functional Extension Parser

EOD and the FEP

Motivation

Improve output for libraries

Structural information

METS/ALTO

Improve output for end-users

Enhance PDF with clickable TOC

EOD and the FEP

Prerequisites

XML output of OCR of the complete document

Images of the scanned document

Coordinates in the OCR XML must correspond to the coordinates in the images (deskew the images beforehand)

Quality of the scans and OCR as good as possible
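
The coordinate prerequisite can be partly sanity-checked before a book is handed to structural analysis. The sketch below flags OCR word boxes that fall outside the image they are supposed to describe. It is a necessary check only: boxes that pass may still be misaligned, for instance when the OCR ran on a version of the image that was deskewed differently. The `Box` type, the function name and the sample numbers are assumptions for illustration, not part of FEP.

```python
from typing import Iterable, List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned word box in pixel coordinates (hypothetical layout)."""
    left: int
    top: int
    right: int
    bottom: int


def boxes_outside_image(boxes: Iterable[Box], width: int, height: int) -> List[Box]:
    """Return every box that is empty or does not lie fully inside the image."""
    bad = []
    for box in boxes:
        inside = (0 <= box.left < box.right <= width
                  and 0 <= box.top < box.bottom <= height)
        if not inside:
            bad.append(box)
    return bad


# Made-up example: a 2000 x 3000 pixel scan and three OCR word boxes.
ocr_boxes = [
    Box(100, 120, 400, 170),
    Box(1900, 50, 2100, 90),     # extends past the right image edge
    Box(300, 2950, 500, 2990),
]
print(boxes_outside_image(ocr_boxes, width=2000, height=3000))
```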

FEP works with the XML output of EOD eBook generation

Automatically extracts structural information about the document

Page numbers

Table of Contents

Offers a web interface to manually correct and enhance the result

EOD and the FEP

Integration of FEP into the EOD workflow

Regular EOD eBook generation

Operators decide if FEP is possible/useful

Scan quality

OCR quality

Structure of the book

Start automatic recognition

Check/correct/modify results in the FEP web interface

EOD and the FEP

The operator finds books with automatically recognized structure in the FEP web interface and can then enhance or correct the recognized print space, pagination and TOC (optionally also the logical structure)

EOD and the FEP

After all correction steps are done

METS/ALTO files

Enhanced PDF

If results are OK

Replace regular PDF with enhanced PDF by uploading to ODM via FTP

End-users download the enhanced PDF as usual through their EOD tracking page
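
The replacement step could be scripted with Python's standard `ftplib`. This is a minimal sketch under stated assumptions: the function names are invented, and the host, credentials, remote file name and directory layout expected by ODM are not given in the slides, so they appear only as parameters to be filled in by the operator.

```python
from ftplib import FTP
from typing import BinaryIO


def upload_enhanced_pdf(ftp: FTP, pdf_stream: BinaryIO, remote_name: str) -> str:
    """Store `pdf_stream` under `remote_name` on an already connected session.

    Returns the server's response line for the transfer.
    """
    if not remote_name.lower().endswith(".pdf"):
        raise ValueError("remote_name should end with .pdf")
    # STOR overwrites an existing file of the same name on most servers,
    # which is what replacing the regular PDF relies on here (assumption).
    return ftp.storbinary(f"STOR {remote_name}", pdf_stream)


def upload_from_disk(host: str, user: str, password: str,
                     local_path: str, remote_name: str) -> str:
    """Connect, log in, upload one file, and close the session."""
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        with open(local_path, "rb") as handle:
            return upload_enhanced_pdf(ftp, handle, remote_name)
```

Splitting the transfer from the connection handling keeps the upload logic testable without a server. An operator would call `upload_from_disk` with the real connection details.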

EOD and the FEP

Current status

Interface Order Data Manager – FEP core implemented and workflow adapted

Internal testing phase finished

Online and offline workshops to familiarize EOD operators with the FEP correction web interface were held

Ready for production environment

Beta testing and feedback period with 10 selected EOD network libraries until end of July

Thank you for your attention!

[email protected]