Working Group 2: Retro-digitised dictionaries
IS1305 “European Network of e-Lexicography (ENeL)”


Objectives of WG 2

(according to the application)

set up guidelines and standards for turning paper dictionaries into a digital format

development of common standards in the field of e-lexicography for retro-digitised paper dictionaries already online or planning to go online (objective 3 of the action)

European Network of e-Lexicography

Working Group 2, Vienna, 14-4-2014

Tasks of WG 2 — Task 1

1. establish an overview of existing retro-digitised dictionaries and an overview of dictionaries which should be retro-digitised (necessity to be digitised → ranking? → no, not necessary!)

→ necessary to give this overview: “scheme of categories” describing the dictionaries (to develop in close exchange with WG 1, WG 2, WG 3)

result: database to browse (→ to coordinate with WG 1)

→ question: different categories as search parameters?

time frame: year 1

Tasks of WG 2 — Task 2

2. develop a standard workflow for digitisation of dictionaries planning to go online including parameters necessary for estimating costs

– digitisation (full text, images, OCR)
– encoding of retro-digitised dictionaries
– development of GUI standards of presentation and design
– long-term preservation
– …

result: guidelines (have to be written in such a way that policy makers understand them)

time frame: year 1–4


Tasks of WG 2 — Task 3

3. define standards for the encoding of information and the description of relevant information categories for paper dictionaries

→ main objective: guarantee interoperability, platform independence

→ task: collect standards used within the action (TEI, LMF, ISO → give this question to MC)

→ questions: what markup languages to use? do we need a “minimal set” of standards for both retro-digitised and new, born digital dictionaries?


Tasks of WG 2 — Task 3

3.1 part of task 3: establish an overview of software for the conversion of physical lay-out information to logical information

→ question: how to mark-up the dictionaries (i.e. automatically, semi-automatically; are there “mark-up tools” to be re-used)?

result: best practices for the encoding of information, linked with dictionary database

time frame: year 1 and 2
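
As an illustration of what "conversion of physical lay-out information to logical information" can mean in practice, the following is a minimal sketch, not a tool used or recommended by the WG. It assumes a hypothetical OCR output in which the headword is marked bold (`<b>`) and the part of speech italic (`<i>`), and it maps these to TEI-style element names (`entry`, `form`, `orth`, `gramGrp`, `pos`, `sense`, `def`). The sample entry and the function name `layout_to_logical` are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Assumed layout convention (hypothetical): bold = headword, italic = part of speech,
# everything after that = definition text.
LAYOUT_ENTRY = re.compile(
    r"<b>(?P<headword>.+?)</b>\s*<i>(?P<pos>.+?)</i>\s*(?P<definition>.+)"
)


def layout_to_logical(line: str) -> ET.Element:
    """Turn one layout-tagged dictionary line into a TEI-style <entry> element."""
    match = LAYOUT_ENTRY.fullmatch(line.strip())
    if match is None:
        # Lines that do not follow the assumed layout need manual (semi-automatic) handling.
        raise ValueError(f"line does not follow the assumed layout: {line!r}")

    entry = ET.Element("entry")
    form = ET.SubElement(entry, "form", type="lemma")
    ET.SubElement(form, "orth").text = match["headword"]
    gram_grp = ET.SubElement(entry, "gramGrp")
    ET.SubElement(gram_grp, "pos").text = match["pos"]
    sense = ET.SubElement(entry, "sense")
    ET.SubElement(sense, "def").text = match["definition"]
    return entry


if __name__ == "__main__":
    # Invented sample line, for illustration only.
    sample = "<b>Haus</b> <i>n.</i> building for living in"
    print(ET.tostring(layout_to_logical(sample), encoding="unicode"))
```

Real dictionaries rarely follow one layout rule throughout, which is why the slide asks about semi-automatic mark-up; in this sketch, lines that do not match are rejected so that they can be handled by hand.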


Tasks of WG 2 — Task 4

4. a) investigate relevant information categories to be added to the dictionary in order to make the dictionary content more readily accessible and interoperable

b) develop concepts for linking retro-digitised dictionaries

→ questions: which information do we need to interlink dictionaries (extra-information?)? → describe the strategies


Tasks of WG 2 — Task 4

→ questions: integration of additional information to create new information (e.g. WordNet, wiki dictionary, FrameNet)?
→ question to address to the WGs: do you put additional information in your dictionary?

result: best practices

time frame: year 3


Tasks of WG 2 — Task 5

5. investigate the possible use of dictionary content for computational linguistic applications

→ task is already done, no further need => clear task list!

time frame: year 4


Tasks of WG 2 — Task 6

6. identify future funding sources and develop collaborative funding applications considering the dictionary-candidates to retro-digitise and the working plan for digitisation

→ information to have on a European level
→ develop awareness in governments of Europe!

→ questions:
– national and international funds to go for financial support?
– develop guidelines / best practices for writing funding applications?
→ responsibility of steering group!

time frame: year 1–4


Tasks of WG 2

in Leiden we tried to divide the tasks, to find people responsible for them and to form subgroups → not yet finished, especially for tasks 4 and 5 (task 5 is already done, so nobody needs to be made responsible for it)


Participants

27 participants from 14 countries: Austria (1), Denmark (2), Finland (3), France (1), Germany (5), Hungary (1), Netherlands (2), Poland (2), Portugal (2), Romania (1), Serbia (2), Slovakia (1), Switzerland (3), United Kingdom (1)

see file “WG 2 Leiden 16-01-2014 minutes Annex1 participants.pdf”


Dictionaries in WG 2

see list in file “WG 2 Leiden 16-01-2014 minutes Annex2 dictionaries.pdf”

not yet complete
for now: 25 dictionaries of different types
– most of them monolingual
– 10 (?) languages
– most of them diachronic / historical dictionaries, standard language dictionaries, some dialect dictionaries


Plans / ideas / work in progress

bibliography of retro-digitised dictionaries available online (a student is using Citavi to organise the bibliography)
→ structure of the bibliography: language dictionaries, specific dictionaries (e. g. “A dictionary of food and nutrition”)
→ structure of entries: author, year of publication, title, place of publication, publisher, URL
(Adelung, Johann Christoph (1808): Grammatisch-kritisches Wörterbuch der hochdeutschen Mundart. Mit beständiger Vergleichung der übrigen Mundarten, besonders aber der oberdeutschen. Wien: Richter. Online: http://ds.ub.uni-bielefeld.de/viewer/image/1323497/1/LOG_0003/.)
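
The entry structure listed above (author, year of publication, title, place of publication, publisher, URL) could be kept as a small record type. The following is a minimal sketch under that assumption; the class name `BibEntry` and its `format` method are hypothetical and are not part of Citavi or of any ENeL tool. The only data used is the Adelung example from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BibEntry:
    """One bibliography entry with the six fields named on the slide."""
    author: str
    year: int
    title: str
    place: str
    publisher: str
    url: str

    def format(self) -> str:
        # Citation pattern taken from the slide's example:
        # Author (Year): Title. Place: Publisher. Online: URL.
        return (
            f"{self.author} ({self.year}): {self.title}. "
            f"{self.place}: {self.publisher}. Online: {self.url}."
        )


adelung = BibEntry(
    author="Adelung, Johann Christoph",
    year=1808,
    title=(
        "Grammatisch-kritisches Wörterbuch der hochdeutschen Mundart. "
        "Mit beständiger Vergleichung der übrigen Mundarten, "
        "besonders aber der oberdeutschen"
    ),
    place="Wien",
    publisher="Richter",
    url="http://ds.ub.uni-bielefeld.de/viewer/image/1323497/1/LOG_0003/",
)

if __name__ == "__main__":
    print(adelung.format())
```

Keeping the fields separate rather than storing one formatted string would make it easier to reuse the bibliography later as input for a database of retro-digitised dictionaries.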


Plans / ideas / work in progress

→ work in progress (for now: 22 pages in Word file)
→ questions:
– re-use in the Action?
– which information should be given in this bibliography of retro-digitised dictionaries (close connection to the “scheme of categories” describing the dictionaries?)
– bibliography as basis for the database of retro-digitised dictionaries and part of the dictionary portal?


Plans / ideas / work in progress

“collection” of “dictionary typologies” trying to find a “scheme of categories” describing the dictionaries in the Action

problem: so far only consideration of German “typologies”

– Storrer: classification of internet dictionaries
• retro-digitised dictionaries
• digital born dictionaries
• dictionaries with user participation
• user generated dictionaries
• finished dictionaries
• dictionaries “under construction”


Plans / ideas / work in progress

– Schlaefer:
• language(s) covered: monolingual, multilingual
• vocabulary/lexicon described
• user group addressed
• methodological basis
• lexicographical basis

– Hausmann:
• synchronic vs diachronic dictionary
• historical vs contemporary dictionary
• standard language vs dialect dictionary
• …


Cooperation with other WGs

cooperation with WG 1 concerning
– the encoding of dictionaries
– the linking of information between dictionaries
– user interfaces
– the overview of dictionaries

cooperation with WG 3 in finding common approaches to linking contents of retro-digitised and innovative dictionaries

cooperation with WG 1, WG 3, WG 4
– in identifying funding sources and developing funding applications

Decisions which have to be made / questions

“Scientific” aim:
– develop a “scheme of categories” describing the dictionaries (short standardized “profile”) → cooperation with WG 1, WG 3 and WG 4

→ question: which information should be given about the dictionaries?
1. information about the dictionary itself (short and clear description!)
– dictionary type
– language covered (source language, description language, target language)


Decisions which have to be made / questions

(→ 1. information about the dictionary itself)
– year of publication (print and online)
– number of entries
– references, literature concerning the dictionary
– …
2. information about the technical process
– encoding
– XML schema and documentation
– year of publication
– …


Decisions which have to be made / questions

→ questions:
– which kinds of dictionaries to include / exclude?
– propose parameters / properties for all dictionaries which can function as search parameters in the dictionary portal (“search for dictionaries”)?
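
To make the idea of a short standardized “profile” whose properties double as search parameters more concrete, here is a minimal sketch. The field names follow the information categories listed on the preceding slides; the class `DictionaryProfile`, the function `search_profiles` and the two sample profiles are hypothetical and only illustrate one possible shape, not a decision of the WG.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DictionaryProfile:
    # 1. information about the dictionary itself
    title: str
    dictionary_type: str                 # e.g. "historical", "dialect"
    source_language: str
    description_language: str
    target_language: Optional[str] = None
    year_print: Optional[int] = None
    year_online: Optional[int] = None
    number_of_entries: Optional[int] = None
    references: List[str] = field(default_factory=list)
    # 2. information about the technical process
    encoding: Optional[str] = None       # e.g. "TEI"
    xml_schema: Optional[str] = None


def search_profiles(profiles: List[DictionaryProfile], **criteria) -> List[DictionaryProfile]:
    """Return the profiles whose fields equal all given search parameters."""
    return [
        profile
        for profile in profiles
        if all(getattr(profile, name) == value for name, value in criteria.items())
    ]


# Invented sample data, for illustration only.
PROFILES = [
    DictionaryProfile(
        title="Example historical dictionary",
        dictionary_type="historical",
        source_language="German",
        description_language="German",
        encoding="TEI",
    ),
    DictionaryProfile(
        title="Example dialect dictionary",
        dictionary_type="dialect",
        source_language="German",
        description_language="German",
    ),
]

if __name__ == "__main__":
    for hit in search_profiles(PROFILES, dictionary_type="dialect"):
        print(hit.title)
```

Exact-match filtering is the simplest possible choice here; a portal would probably also need ranges (for example for years of publication) and controlled vocabularies for dictionary types, which depend on the typology still to be agreed.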


Decisions which have to be made / questions

Organisation:
– mailing list for each WG? → establish at INL? (Google Groups for each WG, all of them including members of steering group)
– how to exchange information / results of WGs within WGs and amongst all participants → can we use the intranet as envisaged in the proposal? or Google Groups and Google Docs? (“suitable instruments”?)
– do we need slots for inter-WG meetings at all WG meetings?
– specialist workshops preceding the WG meeting?


Decisions which have to be made / questions

Organisation:

STSMs:
– “central” and open call? or call focused on certain topics fostering certain tasks in the action?
– information concerning reimbursement to participants?

Training Schools:
– how to organize? where? when? how long?
– number of participants? experts?
– budget?


To ask from participants of WG 2

short biographies concerning their background (like Anne did in WG 1, see minutes)?
– collect them for ENeL website? secured or open part of website?

continue to divide tasks / build subgroups (especially for task 4)

invite them to think about topics relevant for any concern of WG 2 not yet fixed in working plan


To ask from participants of WG 2

invite them to think about experts to be involved in the discussions of WG 2 (specialist workshops)

invite them to think about topic(s) to deal with at Bolzano → fixed in Leiden: presentation of first results of task 3 (development of standards for the encoding of information and the description of relevant information categories for print dictionaries) at meeting in Bolzano

invite them once again to think about a 5-day meeting in the Lorentz Center in Leiden in 2016


To ask from participants of WG 2

invite them to think about the Training School in 2015: “Standard tools and methods for retro-digitising dictionaries”
→ date: year 2, semester 2
→ Rute will check location with Vlado and Vera

give a description of “their” dictionary/ies according to our “dictionary scheme” (“deadline” depending on the decision about what this “dictionary profile” looks like)


Tasks for Bolzano

define a list of dictionaries to begin with (e.g. bilingual synonym dictionaries)

define a list of dictionaries to be retro-digitized

define a list of metadata (ask all WGs for a list of dictionaries and a list of mark up)

proposal with dictionary typology including definitions of technical terms used (end of June); define
