Recommendations for Encoding Etymological Information Using TEI XML Laurent Romary INRIA France Jack T. Bowers [email protected] EMeL WG2 Meeting Vienna 13/02/2015 revision 06/04/2015


General Overview of Project

We are creating a set of structural recommendations for TEI lexical dictionaries, including information relevant to:

• phonetic and orthographic forms;
• grammatical information;
• semantic and meta-linguistic information;
• variation (on all levels);
• etymology;
• mono-, bi-, and multilingual dictionaries, as well as dictionaries that include encyclopedic information and examples;

The models involve proposing changes to the TEI P5 Guidelines themselves and defining our constraints on the TEI in an ODD;
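
An ODD is itself a TEI document: it declares which modules, elements, and attribute values a project allows, and a schema can be generated from it. As a minimal sketch only, and not the authors' actual customization, the fragment below assumes a hypothetical schema name (tei_etym_sketch) and restricts the values of @type on <etym> to those that appear in the examples in these slides:

```xml
<!-- Sketch under stated assumptions: the ident "tei_etym_sketch" is hypothetical,
     and the value list only mirrors the @type values used in these slides. -->
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0" ident="tei_etym_sketch" start="TEI">
  <!-- Modules every TEI schema needs, plus the dictionaries module -->
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <moduleRef key="dictionaries"/>

  <!-- Constrain etym/@type to a closed list of etymological processes -->
  <elementSpec ident="etym" module="dictionaries" mode="change">
    <attList>
      <attDef ident="type" mode="change">
        <valList type="closed" mode="replace">
          <valItem ident="inheritance"/>
          <valItem ident="borrowing"/>
          <valItem ident="metaphor"/>
          <valItem ident="metonymy"/>
          <valItem ident="compounding"/>
        </valList>
      </attDef>
    </attList>
  </elementSpec>
</schemaSpec>
```

A closed value list is one possible design choice; a project that expects further process types (for example blending or grammaticalization, both named in these slides) would extend the list.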


(III) linking and pointing mechanisms;

Goals for TEI Etymological Markup Recommendations

(i) address the lack of sufficient digital markup models and standards for representing etymological information;

(ii) treat the same linguistic information coherently across synchronic and diachronic data structures;

(iii) TEI structures that are LMF compatible;

(iv) make better use of linking mechanisms in TEI for:
• connecting cited forms in etymology and their project-internal sources (where possible);
• making use of existing external resources for lexical and conceptual information not internal to the project, e.g. open source lexical and ontological knowledge and linked data resources;

(v) increase diversity in the types of etymological information that can be treated and make more use of concepts from linguistics;

Working TEI Dictionary Metamodel (elements)

[Diagram: tree of the working TEI dictionary metamodel with cardinalities (0…1, 0…n, 1…n). Elements shown: <entry>, <form>, <orth>, <pron>, <seg>, <c>, <gramGrp> (<pos>, <gen>, <number>, <case>, <per>, <tns>, <mood>, <colloc>, <gram>), <sense>, <def>, <usg>, <cit>, <quote>, <bibl>, <etym> (in <entry> and in <sense>), <oRef>, <pRef>, <lang>, <lbl>, <gloss>, <date>, <note>, <num>, <ref>, <ptr>, <listChange>, <change>, <spanGrp>, <span>, <annotationGrp>, <annotations>]

Two Potential Etymology Structures in TEI

<etym> in <sense>:
• used if there are semantic implications for the etymological change;

<etym> in <entry>:
• used if the etymological change has no semantic implications for existing lexical items in the language;

• both may occur in the same entry to account for unrelated changes that occurred at different stages;

[Diagram: element trees for <etym> as a child of <sense> and <etym> as a child of <entry>, with the cardinalities of their child elements]

Note: any change in morphosyntactic role affects the semantics;
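
The two placements of <etym> can be sketched in a single skeleton entry. This is an illustration only: the xml:id values and the elided content are hypothetical placeholders, and the structure follows the model proposed in these slides rather than a confirmed schema.

```xml
<!-- Hypothetical skeleton: ids and elided content are placeholders, not project data -->
<entry xml:id="example-entry">
  <form type="lemma">
    <orth>…</orth>
  </form>
  <!-- etym as a child of entry: a change without semantic implications -->
  <etym type="inheritance">
    <cit type="etymon">
      <oRef>…</oRef>
    </cit>
  </etym>
  <sense>
    <!-- etym as a child of sense: a change with semantic implications -->
    <etym type="metaphor">
      <cit type="etymon">
        <oRef corresp="#example-source">…</oRef>
      </cit>
    </etym>
  </sense>
</entry>
```

Keeping the two <etym> elements at different levels lets unrelated changes from different stages sit side by side in one entry without implying that one caused the other.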

<etym> in <entry>

• Used if there are no semantic implications for the etymological change, and/or the semantic change occurred in another language or proto-language stage;

Processes encoded here:
• Inheritance;
• Phonetic and phonological processes (non-exhaustive):
    • assimilation (place, manner);
    • epenthesis;
    • metathesis;
    • erosion/deletion (apocope, …);
    • coalescence;
    • tone changes (has own internal categories);
    • neutralization;
• Borrowing*: lexical item imported from another language;

[Diagram: element tree for <etym> as a child of <entry>, with the cardinalities of its child elements]

* Note: the treatment of borrowing is up to the encoder's ontological preference and linguistic interpretation;

<etym> in <sense>

• Used when there are semantic implications for the etymological change;

Processes encoded here:
• Metaphor;
• Metonymy;
• Blending*;
• Compounding;
• Grammaticalization;

• several of these processes can co-occur;

• *where multiple etymological processes occur, some semantic in nature and others phonetic, they may all be included in the <etym> in <sense> if the former permitted the latter.

[Diagram: element tree for <etym> as a child of <sense>, with the cardinalities of its child elements]

Etymological Processes: Inheritance

<entry xml:lang="it" xml:id="buono">
  <form type="lemma">
    <orth>buono</orth>
    <pron notation="ipa">'bwo.no</pron>
    <gramGrp>
      <pos>adj.</pos>
      <gen>masc.</gen>
    </gramGrp>
  </form>
  <sense> .... </sense>
  <etym type="inheritance">
    <cit type="etymon">
      <oRef xml:lang="la">bónŭ</oRef>
      <gramGrp>
        <pos>adj.</pos>
        <gen>masc.</gen>
        <case>nom.</case>
      </gramGrp>
    </cit>
  </etym>
</entry>

Italian < Vulgar Latin: buono < bŏnu

• synchronic entry: <form>, <sense>;
• diachronic (etymological) entry: <etym>;

Note: processes and changes are approximate and meant for demonstrating markup rather than asserting precise etymological diachrony of individual items;

Etymological Processes: Inheritance & Phonological Changes

French < Vulgar Latin: bon < bónŭ

<entry xml:lang="fr" xml:id="bon">
  <form type="lemma">
    <orth>bon</orth>
    <pron notation="ipa">'bɔ̃</pron>
    <gramGrp>
      <pos>adj</pos>
      <gen>masc.</gen>
    </gramGrp>
  </form>
  <sense> .... </sense>
  <etym type="inheritance">
    <cit type="etymon" xml:id="bónŭ" next="#ˈbon">
      <oRef xml:lang="la">bónŭ</oRef>
      <gramGrp>
        <case>nom.</case>
      </gramGrp>
    </cit>
    <cit type="etymon" xml:id="ˈbon" prev="#bónŭ" next="#ˈbɔ̃">
      <pRef xml:lang="fro">ˈbon</pRef>
    </cit>
    <cit type="etymon" xml:id="ˈbɔ̃" prev="#ˈbon">
      <pRef xml:lang="fro">bɔ̃</pRef>
    </cit>
  </etym>
</entry>

(1) Root-level etymological process: first <cit>;
(2) Intermediate phonological change: ˈbonu > ˈbon (second <cit>);
(3) Final phonological change: ˈbon > ˈbɔ̃ (third <cit>);

Note: processes and changes are approximate and meant for demonstrating markup rather than asserting precise etymological diachrony of individual items;

Etymological Processes: Borrowing*

Description of lexical process:
• a language takes a lexical item from a different language;
• also known as loaning or importing;
• often has a historical and practical explanation for the need;

Key linguistic concepts:
• source language;
• source form(s): phonetic, orthographic;
• importing language;
• imported or borrowed form;
• semantic/meta-linguistic concept;

[Diagram: Source Language with Source Form(s) (orth(i..n), pron(i..n)); Importing Language with Borrowed Form(s) (orth(i..n), pron(i..n)); Meta-linguistic Concept]

Note: the source language corresponds to the contents of <lang> and to the value of xml:lang in <pRef> and <oRef> (ISO 639-3 code);

Etymological Processes: Borrowing*

<entry xml:id="taxi" xml:lang="jpn">
  <form type="lemma">
    <orth type="transliterated" notation="romanji">takushī</orth>
    <orth notation="katakana">タクシー</orth>
    <pron notation="ipa">taku'shi:</pron>
    <gramGrp>
      <pos>noun</pos>
    </gramGrp>
  </form>
  <sense corresp="http://dbpedia.org/page/Taxicab">
    <usg type="dom">transportation</usg>
  </sense>
  <etym type="borrowing">
    <lbl>source</lbl>
    <lang>English</lang>
    <cit type="etymon">
      <oRef corresp="http://en.wiktionary.org/wiki/taxi" xml:lang="en">taxi</oRef>
      <pRef notation="ipa" corresp="http://en.wiktionary.org/wiki/taxi#Pronunciation" xml:lang="en-US">'tæksi</pRef>
    </cit>
  </etym>
</entry>

Japanese < English: taxi(cab)

TEI Model for Japanese ‘takushī’
Etymological Process: Borrowing

<entry @xml:id>                              (lexical entry)
    <form type="lemma">
        <orth @type @notation>
        <pron @notation>
        <gramGrp>
            <pos>
    <sense @corresp>                         (ontological resource for entry)
        <usg type="dom">
    <etym type="borrowing">
        <lbl>
        <lang>
        <cit type="etymon">
            <oRef @corresp @xml:lang>            (external lexical entry resource for source term)
            <pRef @notation @corresp @xml:lang>  (external pronunciation resource for source term)

Etymological Processes: Metaphor

Description of process:
• Lexical innovation based in human cognition;
• Describe/understand one concept (x) in terms of another concept (y);
• Requires a change in semantic domains;
• Mapping between concepts is limited to certain salient attributes;
• Results in lexical polysemy;

Key components:
• Domain of concept (y): Source Domain;
• Domain of concept (x): Target Domain;

[Diagram: Source Concept and Target Concept linked by Salient Attributes, each with its domain profile; Lexical Source Form(s) and Polysemous Lexical Form(s) (phonetic, orthographic)]

Note: polysemy means that there will be homophonous and homographic items in the language (at least for a while): <orth> and <oRef>; <pron> and <pRef>;

Etymological Processes: Metaphor

Mixtepec-Mixtec ‘ntuchi’ (bean > kidney)

• Source concept: bean;
• Target concept: kidney;
• Salient attributes: color, shape;
• Source domain profile: Legumes, Food;
• Target domain profile: Body, Internal Organs;
• Lexical source form(s) and polysemous lexical form(s): ntuchi [ndù.ʧí];

[Image: ontological profile for the target concept]

<entry xml:id="kidney">
  <form type="lemma">
    <orth>ntuchi</orth>
    <pron notation="ipa">ndù.ʧí</pron>
    <!-- gramGrp cluster -->
  </form>
  <sense corresp="http://dbpedia.org/resource/Kidney">
    …..
    <usg type="dom">Body</usg>
    <usg type="dom">InternalOrgans</usg>
    <etym type="metaphor">
      <cit type="etymon">
        <oRef corresp="#bean">ntuchi</oRef>
        <pRef corresp="#bean">ndù.ʧí</pRef>
        <gloss>bean</gloss>
      </cit>
    </etym>
  </sense>
</entry>

<entry xml:id="bean">
  <form type="lemma">
    <orth>ntuchi</orth>
    <pron notation="ipa">ndù.ʧí</pron>
    <!-- gramGrp cluster -->
  </form>
  …..
  <sense corresp="http://dbpedia.org/resource/Pinto_bean">
    <usg type="dom">Legume</usg>
    <usg type="dom">Food</usg>
    …….
    <!-- translation info here -->
  </sense>
</entry>

TEI Model for Mixtepec-Mixtec ‘ntuchi’
Etymological process: Metaphor

Lexical entry:

<entry @xml:id>
    <form type="lemma">
        <orth>
        <pron @notation>
        <gramGrp>
            <pos>
    <sense @corresp>                    (ontological resource for entry: dbpedia ontology entry for ‘kidney’)
        <usg type="dom">
        <etym type="metaphor">
            <lbl>
            <cit type="etymon">
                <oRef @corresp>             (pointer to entry for ‘bean’)
                <pRef @corresp @notation>
                <gloss>
        <cit type="translation" @xml:lang>

Source entry:

<entry @xml:id>
    <form type="lemma">
        <orth>
        <pron @notation>
        <gramGrp>
            <pos>
    <sense @corresp>                    (ontological resource for source entry: dbpedia ontology entry for ‘pinto bean’)
        <usg type="dom">
        <cit @type @xml:lang>

Etymological Processes: Metonymy

Description of lexical process:
• concept (y) stands for concept (x);
• no change in semantic domains;
• one “vehicle” entity provides mental access to another (i.e. a target) within the same domain;

Key linguistic concepts:
• source concept (cognitive);
• target concept (cognitive);
• source form (lexical);
• target form (lexical);
• results in (synchronic) polysemy;

[Diagram: Vehicle Concept and Target Concept within the same Domain (X)]

Note: polysemy means that there will be homophonous and homographic items in the language (at least for a while): <orth> and <oRef>; <pron> and <pRef>;

Etymological Processes: Metonymy
Mixtepec-Mixtec: ‘kiti’ (horse)

Vehicle concept entry:

<entry xml:id="animal">
  <form type="lemma">
    <orth>kiti</orth>
    <pron notation="ipa">kì.tí</pron>
    <!-- gramGrp here -->
  </form>
  <sense corresp="http://dbpedia.org/resource/Animal">
    <usg type="dom">Living Beings</usg>
    <usg type="dom">Animal</usg>
    <cit type="translation" xml:lang="eng">
      <oRef>animal</oRef>
    </cit>
    <!-- other translations here -->
  </sense>
</entry>

Target concept entry:

<entry xml:id="animal-horse">
  <form type="lemma">
    <orth>kiti</orth>
    <pron notation="ipa">kì.tí</pron>
    <!-- gramGrp here -->
  </form>
  <sense corresp="http://dbpedia.org/resource/Horse">
    <usg type="dom">Animal</usg>
    <etym type="metonymy">
      <date notBefore="1517"/>
      <cit type="etymon">
        <oRef corresp="#animal">kiti</oRef>
        <pRef notation="ipa" corresp="#animal">kì.tí</pRef>
        <gloss>animal</gloss>
      </cit>
      <note>In this lexical item, the language reflects the history: since there were no horses in Mexico until the arrival of the Spanish, there was no Mixtecan word for 'horse', so the categorical noun for 'animal' was used to describe the unnamed animal.</note>
    </etym>
    <cit type="translation" xml:lang="eng">
      <oRef>horse</oRef>
    </cit>
    <!-- other translations here -->
  </sense>
</entry>

TEI Model for Mixtepec-Mixtec ‘kiti’ (horse)
Etymological process: Metonymy

Lexical entry:

<entry @xml:id>
    <form type="lemma">
        <orth>
        <pron @notation>
        <gramGrp>
            <pos>
    <sense @corresp>                    (ontological resource for entry)
        <usg type="dom">
        <etym type="metonymy">
            <date @notBefore>
            <cit type="etymon">
                <oRef @corresp>
                <pRef @corresp @notation>
                <gloss>
            <note>
        <cit type="translation" @xml:lang>
            <oRef>

Source entry:

<entry @xml:id>
    <form type="lemma">
        <orth>
        <pron @notation>
        <gramGrp>
            <pos>
    <sense @corresp>                    (ontological resource for source entry)
        <usg type="dom">
        <cit type="translation" @xml:lang>
            <oRef>

Notes:
• @corresp on <oRef> and <pRef> points to the entry (id) for the source term; the <oRef> and <pRef> elements thus point (indirectly) to their respective <orth> and <pron> forms;
• @corresp on <sense> is the link to an external ontological entry for the target concept; if the conceptual/lexical source of the metaphor also has an entry (and each is thorough), these two data points could be used in various types of automatic processing of metaphor;
• <usg type="dom"> gives the domain of both the source and the target term;

Etymological Processes: Compounding

Description of lexical process:

• Combines the surface forms of two lexical items to form a new one;

• The result becomes the sum of its lexical and semantic parts;

• Can involve metaphor, metonymy, and/or grammaticalization; when it does, each such process gets its own embedded <etym> element;

[Diagram: Etymon(i)* and Etymon(ii)*, each with grammatical info, semantic/meta-linguistic info, and etym. process (0..n)]

Etymological Processes: Compounding (with Metonymy)

Mixtepec-Mixtec: Yucha Nchu’u ‘Puebla State’

Salient attribute of location = “the presence of hummingbirds”

<entry xml:id="Puebla-state" xml:lang="mix" type="compound">
  <form type="lemma">
    <orth><seg corresp="#lake">Yucha</seg> <seg corresp="#hummingbird">Nchu’u</seg></orth>
    <!-- <gramGrp> here -->
    …..
  </form>
  <sense corresp="http://dbpedia.org/resource/Puebla_State">
    <!-- (Primary) etymological process: compounding -->
    <etym type="compounding">
      <!-- Etymon(1) -->
      <cit type="etymon">
        <oRef corresp="#lake">Yucha</oRef>
        <gramGrp>
          <pos>concreteNoun</pos>
        </gramGrp>
        <gloss>lake</gloss>
      </cit>
      <!-- Etymon(2); etymological process (ii): metonymy -->
      <etym type="metonymy">
        <cit type="etymon">
          <oRef corresp="#hummingbird">Nchu’u</oRef>
          <gramGrp>
            <pos>concrete noun</pos>
          </gramGrp>
          <gloss>hummingbird</gloss>
        </cit>
      </etym>
    </etym>
    ….
  </sense>
</entry>

Etymological Processes: Compounding (with Metonymy)

TEI model for Mixtepec-Mixtec “Yucha Nchu’u”

<entry @xml:id type="compound">         (lexical entry)
    <form type="lemma">
        <orth>
            <seg @corresp>
            <seg @corresp>
        <gramGrp>
            <pos>
    <sense @corresp>                    (ontological resource for entry)
        <etym type="compounding">
            <cit type="etymon">
                <oRef @corresp>
                <gramGrp>
                    <pos>
                <gloss>
            <etym type="metonymy">
                <cit type="etymon">
                    <oRef @corresp>
                    <gramGrp>
                        <pos>
                    <gloss>

Alt (2006) LMF etymology extension proposal; merged with the LMF Core package

[Diagram: class diagram with cardinalities (0…1, 0…n, 1…1, 1…n). LMF core classes shown: Lexical Resource, Global Information, Lexical DB, Lexical Entry, Form, Form Representation, Representation, Text Representation, Sense, Definition, Statement. Etymology extension classes shown: Etymology, Etymon, Etymological Link]

Alt (2006) LMF Etymology Extension: Borrowing Stage

Etymology of French ‘pamplemousse’, from the Trésor de la Langue Française (TLF)

pompel + limoes > pompelmoes (Dutch, diachronic) > pamplemousse (Modern French, synchronic)

Etymological stage: Composition (e.g., Compounding)

/etymon/
    /orth/ = "pompel"
    /language/ = "nl"
    /pos/ = "adjective"
    /gloss/ = "gros, enflé"

/etymon/
    /orth/ = "limoes"
    /language/ = "nl"
    /pos/ = "commonNoun"
    /gloss/ = "citron"

/etymologicalLink/
    /source/ = ".."
    /target/ = "…"
    /etymologicalClass/ = /composition/
    /biblSource/ = "Boulan, König…"
    /confidenceScore/ = "probable"

Etymological stage: Loan Word (e.g., Borrowing)

/etymon/
    /orth/ = "pompelmoes"
    /language/ = "nl"
    /pos/ = "commonNoun"
    /gender/ = "feminine"
    /gloss/ = "Citrus Maxima"

/etymologicalLink/
    /source/ = ".."
    /target/ = "…"
    /etymologicalClass/ = /loan word/
    /biblSource/ = "TLF"

Alt (2006) LMF Etymology Extension: Borrowing Stage
Converted TEI Markup

pompelmoes (Dutch) > pamplemousse (Modern French)

<entry xml:id="LE1" xml:lang="fr">
  <form type="lemma">
    <orth>pamplemousse</orth>
    ....
  </form>
  <sense> .... </sense>
  <etym type="borrowing">
    …..
    <cit type="etymon" xml:id="L2">
      <oRef xml:lang="nl">pompelmoes</oRef>
      <gloss xml:lang="lat">Citrus maxima</gloss>
      <gramGrp>
        <pos>commonNoun</pos>
        <gen>feminine</gen>
      </gramGrp>
      <note>probablement de l’origine tamoule, De Vries, Nederl</note>
    </cit>
    <ref target="#TLF">TLF</ref>
    <!-- 'compounding' section goes here -->
    …..
  </etym>
</entry>

Note: our TEI structures do not explicitly use an equivalent of /etymologicalLink/ (or of /source/ = ".." and /target/ = "…"), as this link is implicitly present in the XML data structure;

Alt (2006) LMF Etymology Extension: Compounding Stage

Etymology of French ‘pamplemousse’, from the Trésor de la Langue Française (TLF)

pompel + limoes > pompelmoes (Dutch, diachronic) > pamplemousse (Modern French, synchronic)

Etymological stage: Composition (e.g., Compounding)

/etymon/
    /orth/ = "pompel"
    /language/ = "nl"
    /pos/ = "adjective"
    /gloss/ = "gros, enflé"

/etymon/
    /orth/ = "limoes"
    /language/ = "nl"
    /pos/ = "commonNoun"
    /gloss/ = "citron"

/etymologicalLink/
    /source/ = ".."
    /target/ = "…"
    /etymologicalClass/ = /composition/
    /biblSource/ = "Boulan, König…"
    /confidenceScore/ = "probable"

TEI Implementation of Alt (2006) LMF Etymology Extension: Compounding Stage

pompel + limoes (Historical Dutch) > pamplemousse (Modern French)

<entry xml:id="LE1" xml:lang="fr">
  <form type="lemma">
    <orth>pamplemousse</orth>
    ....
  </form>
  <sense> .... </sense>
  <etym type="borrowing">
    ……
    <!-- 'borrowing' section goes here -->
    <etym type="compounding">
      <cit type="etymon">
        <oRef xml:lang="nl">pompel</oRef>
        <gramGrp>
          <pos>adjective</pos>
        </gramGrp>
        <gloss>gros, enflé</gloss>
      </cit>
      <cit type="etymon">
        <oRef xml:lang="nl">limoes</oRef>
        <gramGrp>
          <pos>commonNoun</pos>
        </gramGrp>
        <gloss>citron</gloss>
      </cit>
      <ref target="#Boulan-König">Boulan, König...</ref>
    </etym>
    …..
  </etym>
</entry>

Etymological Processes: Borrowing & Compounding

TEI model for ‘pamplemousse’ as converted from LMF (Alt 2006)

<entry @xml:id type="compound">         (lexical entry)
    <form type="lemma">
        <orth>
            <seg @corresp>
            <c>
            <seg @corresp>
        <gramGrp>
            <pos>
    <sense> 0…n
    <etym type="borrowing">
        <lbl>
        <lang>
        <cit type="etymon">
            <oRef @xml:lang>
            <gloss @xml:lang>
            <gramGrp>
                <pos>
                <gen>
            <note>
        <ref @target>
        <etym type="compounding">
            <cit type="etymon">
                <oRef @xml:lang>
                <gramGrp>
                    <pos>
                <gloss @xml:lang>
            <ref @target>

Encoding from existing sources: synchronic portion of entry

Source: Trésor de la Langue Française

PEUT-ÊTRE, adv.

Étymol. et Hist. 1. 1re moitié du xiie s. put cel estre (Psautier Oxford, 54, 13 ds T.-L.); ca 1160 puet estre (Eneas, 9003, ibid.); début xve s. peut-estre (Quinze joies mariage, éd. J. Rychner, XII, 12); 1824 peut-être bien (Joubert, loc. cit.); 2. 1636 employé elliptiquement pour répondre évasivement à une question (Corneille, Le Cid, I, 2); 3. 1775 détaché en fin de phrase, exprimant le défi, l'ironie (Beaumarchais, Barbier de Séville, II, 2); 4. fin xiie s. puet estre que (Flore et Blancheflor, éd. J.-L. Leclanche, 407); 1641 peut-estre que (Corneille, Cinna, III, 1); 5. 1637 subst. un peut-estre (Id., La Place royale, IV, 6). Comp. de peut, 3e pers. du sing. de l'ind. prés. de pouvoir* et de être*.

<entry xml:id="peut-être" xml:lang="fr" type="compound">
  <form type="lemma">
    <orth><seg corresp="#pouvoir-3s-pres-ind">peut</seg><c>-</c><seg corresp="#être">être</seg></orth>
    <gramGrp>
      <pos>adv.</pos>
    </gramGrp>
  </form>
  …
</entry>

For “compound” entry types, @corresp can (optionally) be used on the <seg> element to point to the individual subcomponents of the item, either within the project or externally;
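
As an illustration of what the @corresp pointers #pouvoir-3s-pres-ind and #être could resolve to within a project, here is a sketch of two target entries. Only the xml:id values come from the pointers in the entry for ‘peut-être’; the rest (the inflected <form>, the grammatical labels) is assumed for illustration and is not taken from the source dictionary.

```xml
<!-- Hypothetical target entries; only the xml:id values come from the @corresp pointers -->
<entry xml:id="être" xml:lang="fr">
  <form type="lemma">
    <orth>être</orth>
    <gramGrp>
      <pos>verb</pos>
    </gramGrp>
  </form>
</entry>

<entry xml:id="pouvoir" xml:lang="fr">
  <form type="lemma">
    <orth>pouvoir</orth>
  </form>
  <!-- The inflected form carries the id that the first <seg> points to -->
  <form type="inflected" xml:id="pouvoir-3s-pres-ind">
    <orth>peut</orth>
    <gramGrp>
      <per>3</per>
      <number>sg.</number>
      <tns>pres.</tns>
      <mood>ind.</mood>
    </gramGrp>
  </form>
</entry>
```

Pointing at the inflected <form> rather than at the whole entry keeps the grammatical analysis of ‘peut’ in one place; a project could equally point to an external resource.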

Encoding from existing sources: non-linguistic content portion of diachronic entry

Source: Trésor de la Langue Française, PEUT-ÊTRE, adv.

….
<etym xml:id="PEUT-ÊTRE-adv-Étym-et-Hist">
  <lbl>Étymol. et Hist.</lbl>
  <num>1.</num> ……
  <num>2.</num> …..
  <num>3.</num> ……
  <num>4.</num> …..
  <num>5.</num> ……
  <note>Comp. de peut, 3e pers. du sing. de l'ind. prés. de pouvoir* et de être*.</note>
</etym>
…

Encoding from existing sources: diachronic portion of entry

Source: Trésor de la Langue Française, PEUT-ÊTRE, adv.

Étymol. et Hist.

1. 1re moitié du xiie s. put cel estre (Psautier Oxford, 54, 13 ds T.-L.); ca 1160 puet estre (Eneas, 9003, ibid.); début xve s. peut-estre (Quinze joies mariage, éd. J. Rychner, XII, 12); 1824 peut-être bien (Joubert, loc. cit.);

2. 1636 employé elliptiquement pour répondre évasivement à une question (Corneille, Le Cid, I, 2);

3. 1775 détaché en fin de phrase, exprimant le défi, l'ironie (Beaumarchais, Barbier de Séville, II, 2);

4. fin xiie s. puet estre que (Flore et Blancheflor, éd. J.-L. Leclanche, 407); 1641 peut-estre que (Corneille, Cinna, III, 1);

5. 1637 subst. un peut-estre (Id., La Place royale, IV, 6).

Comp. de peut, 3e pers. du sing. de l'ind. prés. de pouvoir* et de être*.

….
<sense>
  <etym xml:id="PEUT-ÊTRE-adv-Étym-et-Hist" type="inheritance">
    <lbl>Étymol. et Hist.</lbl>
    <num>1.</num> ……
    <num>2.</num> …..
    <num>3.</num> ……
    <num>4.</num> …..
    <num>5.</num> ……
    <note>Comp. de peut, 3e pers. du sing. de l'ind. prés. de pouvoir* et de être*.</note>
  </etym>
</sense>
…

Template:

<cit type="attestation">
  <date> </date>
  <oRef> </oRef>
  <gramGrp>
    <!-- appropriate element here -->
  </gramGrp>
  <bibl> </bibl>
  <note> </note>
</cit>
….

Encoding from existing sources: diachronic portion of entry

Source: Trésor de la Langue Française

Attestations from item 1. of the source:

• 1re moitié du xiie s. put cel estre (Psautier Oxford, 54, 13 ds T.-L.);
• début xve s. peut-estre (Quinze joies mariage, éd. J. Rychner, XII, 12);
• 1824 peut-être bien (Joubert, loc. cit.);

ISO 639-3 codes:
• Old French (842 to ca. 1400): fro
• Middle French (ca. 1400 to 1600): frm

<cit type="attestation">
  <date notBefore="1100" notAfter="1150">1re moitié du xiie s.</date>
  <oRef xml:lang="fro">put cel estre</oRef>
  <bibl>(Psautier Oxford, 54, 13 ds T.-L.)</bibl>
</cit>

<cit type="attestation">
  <date notBefore="1400" notAfter="1450">début xve s.</date>
  <oRef xml:lang="frm">peut-estre</oRef>
  <bibl>(Quinze joies mariage, éd. J. Rychner, XII, 12)</bibl>
</cit>

<cit type="attestation">
  <date when="1824">1824</date>
  <oRef>peut-être bien</oRef>
  <bibl>(Joubert, loc. cit.)</bibl>
</cit>
….
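
Item 1. of the source also contains an attestation that is not encoded in the examples here: "ca 1160 puet estre (Eneas, 9003, ibid.)". Following the same attestation template, it might be encoded as sketched below. Two choices are assumptions rather than part of the original slides: the language code fro (on the basis that 1160 falls within the Old French period) and the use of @when for a date the source marks as approximate.

```xml
<!-- Sketch: the remaining attestation from item 1., encoded with the same template -->
<cit type="attestation">
  <!-- "ca" is kept in the element text; @when records the nominal year only -->
  <date when="1160">ca 1160</date>
  <oRef xml:lang="fro">puet estre</oRef>
  <bibl>(Eneas, 9003, ibid.)</bibl>
</cit>
```

A project that wants the approximation to be machine-readable could instead use a @notBefore/@notAfter range, at the cost of choosing the bounds itself.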

Conclusions and Summary

Our TEI recommendations can facilitate:

• linking and integrating corresponding data structures between the synchronic and diachronic levels;

• the use of open source lexical resources and ontological information;

• a more principled and consistent set of TEI guidelines for digitally encoding etymological information;

• better compatibility between information traditionally kept and formatted separately in etymological dictionaries, lexical dictionaries, and linguistic analyses;

• models for encoding ubiquitous processes of linguistic change for multiple levels of language;

• theoretically agnostic data structures;

• a more diverse set of etymological examples for the TEI guidelines;