Towards a multimedia encyclopaedic lexicon for the Marquesan and Tuamotuan languages Gaby Cablitz Christian-Albrechts-Universität zu Kiel

Overview of this talk

Motivation for our project

Why multimedia dictionaries?

Project objectives and basic design

Some major developments for our project

Examples of linking multimedia extensions with lexicographic data

Web-based collaboration with speech communities

Motivation for the project

How can a language documentation be made more accessible and usable to the speech and research community?

Two problems:

1. Limited ways of structuring archive

2. Primary data do not reveal much about language structure and relatedness between words of a language

Annotation of multimedia documents shows meaning of word in specific contexts, not network of associations between words nor full range of meanings -> need for structural data to understand primary data

Role of lexicography backgrounded in DoBeS programme

Dictionaries are necessary elements in language documentation projects

Multimedia dictionaries: beyond traditional lexicography and language archiving

New ways of meaning presentation: Linking of linguistic information with media files (video clips, photos, drawings, sound files)

Multimedia extensions provide:

-> information on pragmatics of lexical units (use in context)

-> information on cultural knowledge related to meaning and use of lexical units (LU)

-> non-verbal aspects of cultural activities relevant for understanding concepts encoded by LU

New form of archiving: dense network of lexical entries with all kinds of media and archive files

Moving from a conventional dictionary towards an encyclopaedia

Major project objectives

1. Create multimedia encyclopaedic lexicon for Marquesan and Tuamotuan languages

2. Advance development of LEXUS

3. Involve speech community actively in lexicon creation via web-based collaboration

Upload as much non-archived multimedia data as possible (photos, drawings, photo galleries, etc.) together with lexical database in LEXUS

Create links between lexical, multimedia and archive data in a thematically organised way

Represent data by reflecting indigenous categorisation and understanding of relatedness between elements

Create a database which is useful for language maintenance and language revival

Design focus: creating thematically organised spaces

Creation from an ethnobotanical perspective

Plants important in traditional material culture, natural way of teaching traditional knowledge

Linking of data shall be visualised in one space which allows continuous navigation through the database

Some major software developments for our project purposes

Improvement of UI issues, functionalities etc.

Development of the ViCoS tool: key feature for creating thematically organised spaces via relational links

Unlike the Kirrkirr software, ViCoS can also integrate multimedia data, offers good navigation and visualisation solutions, allows parts of a photo or drawing to be selected, and provides a user-friendly way of creating relational links (drag and drop option, etc.), making it accessible for speech communities
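The idea of a thematically organised space built from relational links can be illustrated with a minimal sketch. This is not the LEXUS or ViCoS data model: the class names, the relation label `has_part`, the file name and the placeholder lexemes are all hypothetical and only show how lexical entries, media files and typed links between entries might sit together in one navigable structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class MediaFile:
    path: str  # hypothetical file name or archive handle
    kind: str  # e.g. "photo", "drawing", "video", "sound"


@dataclass
class LexicalEntry:
    lexeme: str
    definition: str
    media: List[MediaFile] = field(default_factory=list)


@dataclass
class ThematicSpace:
    """A named space holding entries and typed links between them."""
    name: str
    entries: Dict[str, LexicalEntry] = field(default_factory=dict)
    # Each link is (source lexeme, relation label, target lexeme).
    links: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_entry(self, entry: LexicalEntry) -> None:
        self.entries[entry.lexeme] = entry

    def link(self, source: str, relation: str, target: str) -> None:
        # Only entries already in the space can be linked.
        for lexeme in (source, target):
            if lexeme not in self.entries:
                raise KeyError(f"unknown entry: {lexeme}")
        self.links.append((source, relation, target))

    def neighbours(self, lexeme: str) -> List[Tuple[str, str]]:
        # Outgoing links only: (relation, target) pairs for navigation.
        return [(rel, tgt) for src, rel, tgt in self.links if src == lexeme]


# Placeholder data, not real Marquesan or Tuamotuan lexemes.
space = ThematicSpace(name="plants")
space.add_entry(LexicalEntry("PLANT_A", "placeholder definition",
                             [MediaFile("plant_a.jpg", "photo")]))
space.add_entry(LexicalEntry("LEAF_OF_PLANT_A", "placeholder definition"))
space.link("PLANT_A", "has_part", "LEAF_OF_PLANT_A")
print(space.neighbours("PLANT_A"))
```

Storing links as typed triples keeps the relation label open, so a space could reflect indigenous categorisation instead of a fixed set of lexical relations.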

Realisation in ViCoS

Realisation and navigation in ViCoS

Jump to photo gallery

Linking media with lexicographic data: corpus-based examples

Edition of corpus-based example sentences -> creating a resource for comparing spoken vs. written language

Link to archive

Linking media with lexicographic data: made-up example sentences

Link to archive with interlinearisation

Video clips: acting out meaning of motion verbals

Documenting word meaning

Letting consultants design and act out word meaning without verbal interaction

Supportive element of word meaning, also useful for language revival

Creation of semantic word fields (e.g. CUT or BREAK verbals) in ViCoS

Web-based collaboration with speech community

Problematic aspects of web-based collaboration with speech community (SC)

Requirements for web-based collaboration with speech community (e.g. capacity building)

Problems of using a wiki-like lexicon tool

Proposal for speech community-based participation in the process of lexicon creation

Basic challenges for an online cooperation with speech community

Current state of LEXUS and proposed collaborative workspaces (WSs) have wiki-like set-up based on consensus

Who is a suitable administrator/primary editor?

Is it really sufficient to make a web-based tool available and assume that an encyclopaedic lexicon will simply be created in a wiki-like manner by the speech community?

Design of collaborative WS by speech community

Panel of moderators interacting with administrator and SC

Complicated system of collaborative WS, not realistic

Development and implementation is time-consuming

Organising, editing and revising large amounts of new data with multiple entry writers and multiple drafts can get out of control

Community-internal obstacles I: linguistic situation

In context of endangered speech communities -> wiki-like set-up of collaborative WS is very problematic

Documentation of lexical and cultural knowledge not an easy task -> consultants do not share same metalinguistic and cultural/ encyclopaedic knowledge about words (Haviland 2006)

Indigenous Polynesian languages -> undergoing rapid linguistic change

Depending on age and upbringing -> metalinguistic knowledge very heterogeneous

Community-internal obstacles II: culture-specific reasons

Problem rooted in their traditional society: very secretive about their culture, transmission of cultural knowledge not public affair -> often only one selected person within a family

Unlike western cultures, cultural knowledge has no open verifiable and codified standards

Continuous loss of linguistic and cultural heritage feeds into many insecurities of speakers -> ground for conflicts about what is authentic knowledge and what is not -> results in "editing wars"?

Within speech community: accusations of re-inventing and transforming the language and culture, knowledgeable speakers often stigmatised as "liars" -> withdrawal from documenting their endangered linguistic and cultural heritage

Community-internal obstacles III: cultures with oral traditions

No writing tradition, difficult to motivate literate speech community members to express knowledge in writing

Most knowledgeable community members often cannot read or write, total lack of IT skills

Recording is better way of fixing knowledge

Transmission of traditional knowledge still "observing and learning by doing"

Capacity building in the speech community

Prerequisite: substantial training in basics of lexicography and usage of linguistic software

Understanding lexicon structures (e.g. Toolbox) requires training and continuous familiarisation as well as constant repetition of usage over protracted period of time

Writing definitions, encyclopaedic articles and example sentences needs to be learned despite a simplified user interface

New participants of speech community have to be trained subsequently -> who does the training?

Psychological barriers

Native speakers feel lost when having to edit lexical entries on their own

Psychological block against writing lexical entries -> formal aspect of lexical entry structure puts pressure on contributors to do a good job

Older community members have to learn to cooperate with younger community members who have good IT skills but lack knowledge about language and culture

Enrichment of lexicon with linguistic and encyclopaedic knowledge

Sensitising speakers to the difference between describing the meaning of a word/lexical unit (= definition) and writing an encyclopaedic article -> encyclopaedic knowledge can be part of word meaning, lexical units can denote complex phenomena and procedures or culture-specific activities

Enrichment of lexicon with linguistic and cultural knowledge still best achieved during fieldwork periods based on mutual dialogue between researcher and consultants -> detailed investigations about language and culture, picking up on interesting comments, questions about grammar, semantic relations between lexemes, etc. best obtained in face-to-face communication -> miscommunications and misunderstandings can be instantly clarified

New proposal for online participation by SC

Both communities would like to have a limited "panel of moderators" interacting with linguist

Only reduced editing possibilities for speech community

Lexicon should be open to community with reading rights only

"Whiteboarding tool" should be available coupled with the LEXUS tool

-> informal editing possible

Editing lexicon with whiteboarding tool Twiddla

Web-based tool, access to websites, easy editing possibilities, edited page can be saved and sent as attachment

Web-based whiteboarding tool ReviewBasics

Comment, annotate, markup images, documents and videos, upload other media files, etc.

User-friendly UI for editing documents and handling web-based collaboration

Disadvantage: cannot access protected websites

Advantages of whiteboard editing

Informal way of editing lexicon and participating in its creation -> motivating effects on speech community

Pressure of producing good definitions, encyclopaedic articles, etc. is taken away, no need to deliver complete definitions, etc.

Playful aspect motivates younger speech community members to participate, consequently learn about their language and culture

No interference with lexical database as such, only in accordance with moderators

Workload reduced for the panel of moderators (accept or reject changes)
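The proposed division of labour (informal suggestions from the community, accept-or-reject decisions by a panel of moderators, no direct interference with the lexical database) can be sketched as a small review queue. This is only an illustration under stated assumptions: `ModeratedLexicon`, `Suggestion`, the field names and the placeholder data are hypothetical and do not describe how LEXUS, Twiddla or ReviewBasics work.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass
class Suggestion:
    author: str
    lexeme: str
    field_name: str      # e.g. "definition" or "example"
    proposed_text: str
    status: Status = Status.PENDING


class ModeratedLexicon:
    """Community members suggest; only moderators change the lexicon."""

    def __init__(self, moderators: Iterable[str]) -> None:
        self.moderators = set(moderators)
        self.entries: Dict[str, Dict[str, str]] = {}  # lexeme -> fields
        self.queue: List[Suggestion] = []

    def suggest(self, suggestion: Suggestion) -> None:
        # Suggestions are queued and never touch the entries directly.
        self.queue.append(suggestion)

    def review(self, moderator: str, suggestion: Suggestion, accept: bool) -> None:
        if moderator not in self.moderators:
            raise PermissionError(f"{moderator} is not a moderator")
        if suggestion.status is not Status.PENDING:
            raise ValueError("suggestion has already been reviewed")
        if accept:
            fields = self.entries.setdefault(suggestion.lexeme, {})
            fields[suggestion.field_name] = suggestion.proposed_text
            suggestion.status = Status.ACCEPTED
        else:
            suggestion.status = Status.REJECTED


# Placeholder names and text, purely for illustration.
lexicon = ModeratedLexicon(moderators=["moderator_1"])
note = Suggestion("speaker_1", "WORD_A", "definition", "placeholder text")
lexicon.suggest(note)
lexicon.review("moderator_1", note, accept=True)
print(lexicon.entries)
```

In this sketch a contributor never needs write access to the lexicon itself, which matches the proposal of reading rights for the community and reduced editing possibilities.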

Conclusions

Web-based tool like LEXUS can be a powerful tool for

1. Linguistic and cultural revival

2. Visualising primary and structural data together (e.g. lexicon) -> new form of archiving making linguistic and cultural networks more visible in KS of ViCoS

Online participation of (Marquesan and Tuamotuan) speech community is problematic if LEXUS is set up in wiki-like manner

LEXUS needs to be adjusted to culture-specific circumstances of the speech communities

Simplified user interface for SC will not solve the problem of online participation, contributors still need to learn basics of lexicography

Enriching a lexicon with detailed linguistic and encyclopaedic information by online participation of SC is doubtful and will not replace extensive fieldwork