Linguistic Representation of Finnish in the Medical Domain
Spoken Language Translation System
Marianne Santaholma, University of Geneva, TIM/ISSCO
Outline
- MedSLT system overview
- MedSLT Finnish language resources:
  • Corpora
  • Generation grammar
  • Lexicon
  • Interlingua to Finnish mapping rules
- Initial evaluation results
- Summary
MedSLT system (1)
- Open source medical domain SLT system
- Diagnosis tool for doctors
- One-way dialog
- Multilingual
- Coverage: medical sub-domains
- Architecture: based on general linguistic resources
[Figure: MedSLT architecture. Labels: Speech Platform Interface Process; Translation Server; Unification Grammar Database; Recognition Package; Nuance Voice Platform (recognition, playback); Application Specific Data; Regulus run-time system; Regulus compile-time component; Generation Grammar]
Finnish corpora (1)
- Headache and chest pain sub-domains
- Created by translating the original English corpora
- Serve as the primary source for deciding which structure rules and vocabulary to introduce into the Finnish language module
Finnish corpora (2)
- Concepts covered: frequency of pain, duration of pain, location of pain, etc.
- Examples:
  Do you have headaches in the morning?
    • In the evening?
  Is your headache stabbing?
    • Severe?
  Are your headaches caused by coffee?
    • By cheese?
FIN generation grammar (1)
- Specialized grammar for spoken language
- Reflects the specific text type and discourse of the domain
- 57 grammar rules
- Unification formalism
- Developed on the Regulus platform: https://sourceforge.net/projects/regulus/
FIN generation grammar (2)
- FIN grammar developed by manually adapting the Regulus general English grammar
- The Finnish structure rules are highly similar to their English counterparts
- In Finnish, more phenomena are resolved at the morphological level than in syntax
(Rayner et al., 2000. Spoken Language Translator)
FIN generation grammar (3)

English: 'How frequent are your headaches?'

  s:[sem= @fronting_sem(Adj, S), wh=y\/rel, wh=Wh, vform=VForm,
     inv=Inv, whmoved=y, takes_adv_type=none, gapsin=null, gapsout=null] -->
    adjp:[sem=Adj, wh=Wh, adjpos=pred, gapsin=null, gapsout=null],
    s:[sem=S, wh=n, vform=VForm, inv=Inv, whmoved=n, gapsin=adjp_gap, gapsout=null].

Finnish: 'Kuinka yleisiä päänsärkynne ovat?' (literally: *how frequent your_headaches are?)

  s:[sem= @fronting_sem(Adj, S), wh=y, inv=n, vform=inf,
     whmoved=y, takes_adv_type=none, gapsin=null, gapsout=null] -->
    adjp:[sem=Adj, wh=y, agr=Agr, adj_pos=pred, adj_case=Case, adj_degr=positive,
          gapsin=null, gapsout=null],
    s:[sem=S, wh=n, agr=Agr, vform=inf, inv=n, whmoved=n, gapsin=adjp_gap, gapsout=null].
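The Finnish rule differs from the English one mainly in that both daughters share the variable Agr, so the fronted adjective phrase and the clause must agree in number. The following is a minimal sketch of that idea in Python, not the Regulus implementation: the `Var` class, the `unify_features` helper, and the feature values given to the two phrases are illustrative assumptions, and only flat feature lists are handled.

```python
from typing import Dict, Optional


class Var:
    """A named unification variable such as Agr (hypothetical helper)."""

    def __init__(self, name: str) -> None:
        self.name = name


Bindings = Dict[str, object]


def resolve(value: object, bindings: Bindings) -> object:
    """Follow variable bindings until an atom or an unbound variable is reached."""
    while isinstance(value, Var) and value.name in bindings:
        value = bindings[value.name]
    return value


def unify_features(pattern: Dict[str, object], actual: Dict[str, object],
                   bindings: Bindings) -> Optional[Bindings]:
    """Unify two flat feature dicts; return extended bindings, or None on a clash."""
    result = dict(bindings)
    for feat, want in pattern.items():
        if feat not in actual:
            continue  # assumption: a missing feature is unconstrained
        want = resolve(want, result)
        have = resolve(actual[feat], result)
        if isinstance(want, Var):
            if not (isinstance(have, Var) and have.name == want.name):
                result[want.name] = have
        elif isinstance(have, Var):
            result[have.name] = want
        elif want != have:
            return None
    return result


# Daughter patterns, reduced to the features relevant for agreement.
adjp_pattern = {"wh": "y", "agr": Var("Agr"), "adj_degr": "positive"}
s_pattern = {"wh": "n", "agr": Var("Agr")}

# Assumed analyses: 'kuinka yleisiä' and 'päänsärkynne ovat' are both plural.
after_adjp = unify_features(adjp_pattern, {"wh": "y", "agr": "pl", "adj_degr": "positive"}, {})
agreeing = unify_features(s_pattern, {"wh": "n", "agr": "pl"}, after_adjp)
clashing = unify_features(s_pattern, {"wh": "n", "agr": "sg"}, after_adjp)
print(agreeing, clashing)
```

Once Agr is bound by the adjective phrase, a clause with a different number value fails to unify, which is what the shared variable in the rule enforces.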
Finnish Lexicon (1)
- Domain specific
- ~ 530 lexical entries
- Difficulty: enumeration of all word forms
- Example: lievittää, 'to relieve', question form, 3rd person singular, present:

  verb:[sem=[[event, lievittää], [tense, present]], vform=q_ko, agr=sg,
        subcat=trans, subj_n_case=nom, subj_sem_n_type=(cause\/activity),
        obj_sem_n_type=perception_body, obj_case=ptv,
        takes_adv_type=frequency] --> lievittääkö.
Finnish Lexicon (2)
- Use of macros in lexical entries

  macro(noun_perception_body([SgNom, PlNom, SgPtv, PlPtv], Sem),
        (noun:[sem=[Sem], sem_n_type=perception_body, agr=sg, case=nom] --> SgNom)).
  macro(noun_perception_body([SgNom, PlNom, SgPtv, PlPtv], Sem),
        (noun:[sem=[Sem], sem_n_type=perception_body, agr=pl, case=nom] --> PlNom)).
  macro(noun_perception_body([SgNom, PlNom, SgPtv, PlPtv], Sem),
        (noun:[sem=[Sem], sem_n_type=perception_body, agr=sg, case=ptv] --> SgPtv)).
  macro(noun_perception_body([SgNom, PlNom, SgPtv, PlPtv], Sem),
        (noun:[sem=[Sem], sem_n_type=perception_body, agr=pl, case=ptv] --> PlPtv)).

  @noun_perception_body([särky, säryt, särkyä, särkyjä], [symptom, särky]).
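To illustrate what the macro call saves, the sketch below expands one list of four surface forms into four separate lexical entries, one per number and case combination. It is a Python approximation written for this illustration, not Regulus code: the function `expand_noun_perception_body`, the `SLOTS` table, and the dictionary layout of an entry are assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical table mirroring the four macro clauses: one (agr, case) pair
# per surface form, in the order SgNom, PlNom, SgPtv, PlPtv.
SLOTS: List[Tuple[str, str]] = [
    ("sg", "nom"),
    ("pl", "nom"),
    ("sg", "ptv"),
    ("pl", "ptv"),
]


def expand_noun_perception_body(forms: List[str],
                                sem: List[str]) -> List[Dict[str, object]]:
    """Return one lexical entry per surface form of a perception_body noun."""
    if len(forms) != len(SLOTS):
        raise ValueError("expected four forms: SgNom, PlNom, SgPtv, PlPtv")
    return [
        {
            "cat": "noun",
            "sem": [sem],
            "sem_n_type": "perception_body",
            "agr": agr,
            "case": case,
            "surface": form,
        }
        for (agr, case), form in zip(SLOTS, forms)
    ]


entries = expand_noun_perception_body(
    ["särky", "säryt", "särkyä", "särkyjä"], ["symptom", "särky"]
)
for entry in entries:
    print(entry["agr"], entry["case"], entry["surface"])
```

Each new noun of this semantic type then needs a single line listing its forms, rather than four hand-written entries.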
Interlingua to FIN mapping
- MedSLT interlingua: interlingua_constant([<key>, <value>])
  Example: 'interlingua_constant([symptom, headache])'
- Interlingua mapping rules: transformation
  • Source -> Interlingua
  • Interlingua -> Target
- Two types of rules:
  • Simple: interlingua transfer_lexicon entries
  • Complex: interlingua transfer_rules
SOURCE -> INTERLINGUA -> TARGET

ENG: Does the red wine make your headache worse?
FIN: Pahentaako punaviini päänsärkyä?

Source:
  [[adj,worse], [cause,red_wine], [event,make_adj], [prep,subj],
   [secondary_symptom,headache], [spec,the_sing], [tense,present],
   [utterance_type,ynq], [voice,active]]

Interlingua:
  [[sc,when],
   [clause, [[utterance_type,dcl], [pronoun,you], [tense,present],
             [voice,active], [action,drink], [cause,red_wine]]],
   [event,become_worse], [symptom,headache], [tense,present],
   [utterance_type,ynq], [voice,active]]

Target:
  [[cause,punaviini], [event,pahentaa], [symptom,päänsärky],
   [tense,present], [utterance_type,ynq]]
Complex rule:
  transfer_rule([[sc,when],
                 [clause, [[utterance_type,dcl], [pronoun,you], [tense,present],
                           [voice,active], [action,drink], ECause]],
                 [event,become_worse], [voice,active]],
                [[event,pahentaa], @efin_cause(ECause)]).

Simple rule:
  transfer_lexicon([symptom, headache], [symptom, päänsärky]).
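The two rule types can be sketched in Python to show how the interlingua for the red wine example could be reduced to the Finnish target representation. This is an illustration under stated assumptions, not the MedSLT transfer engine: the complex rule is hand-coded as `apply_worsen_rule` instead of being matched generically, the `LEXICON` entry for red_wine stands in for the `@efin_cause` macro (whose definition is not shown in the slides), and elements with no lexicon entry are assumed to pass through unchanged.

```python
from typing import Any, List, Optional

Pair = List[Any]  # [key, value]; the value of 'clause' is itself a list of pairs

# Simple rules, modelled on transfer_lexicon([symptom, headache], [symptom, päänsärky]).
# The cause entry is an assumption standing in for @efin_cause.
LEXICON = {
    ("symptom", "headache"): ["symptom", "päänsärky"],
    ("cause", "red_wine"): ["cause", "punaviini"],
}

OUTER = [["sc", "when"], ["event", "become_worse"], ["voice", "active"]]
INNER = [["utterance_type", "dcl"], ["pronoun", "you"], ["tense", "present"],
         ["voice", "active"], ["action", "drink"]]


def apply_worsen_rule(interlingua: List[Pair]) -> Optional[List[Pair]]:
    """Complex rule: 'becomes worse when you drink X' -> [event, pahentaa] plus cause X."""
    clause = next((p for p in interlingua if p[0] == "clause"), None)
    if clause is None or not all(p in interlingua for p in OUTER):
        return None
    if not all(p in clause[1] for p in INNER):
        return None
    cause = next((p for p in clause[1] if p[0] == "cause"), None)
    if cause is None:
        return None
    # Keep everything the rule did not consume.
    rest = [p for p in interlingua if p is not clause and p not in OUTER]
    return rest + [["event", "pahentaa"], cause]


def transfer(interlingua: List[Pair]) -> List[Pair]:
    """Apply the complex rule if it matches, then the simple lexicon entries."""
    rewritten = apply_worsen_rule(interlingua) or interlingua
    out = [LEXICON.get((k, v), [k, v]) if isinstance(v, str) else [k, v]
           for k, v in rewritten]
    return sorted(out, key=lambda pair: pair[0])


interlingua = [
    ["sc", "when"],
    ["clause", [["utterance_type", "dcl"], ["pronoun", "you"], ["tense", "present"],
                ["voice", "active"], ["action", "drink"], ["cause", "red_wine"]]],
    ["event", "become_worse"], ["symptom", "headache"], ["tense", "present"],
    ["utterance_type", "ynq"], ["voice", "active"],
]
target = transfer(interlingua)
print(target)
```

The complex rule restructures the subordinate clause into a single event with a cause, while the lexicon entries handle one-to-one replacements such as headache to päänsärky.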
Evaluation (1)
- Evaluation of Eng-Fin translation performance on the headache sub-domain corpus of 870 utterances
- Comparison with Eng-Fre translation performance
- Evaluation in two phases:
  1. Judging of speech recognition: good vs. bad
  2. Judging of translations: good / acceptable / bad
Evaluation (2)

[Bar chart: translation judgments for FIN and FRE; vertical axis from 0 to 80]

        Good translation   Acceptable translation   Bad translation   No translation
  FIN   60                 4.4                      0.5               35
  FRE   75.8               19.2                     0.7               4.4
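A short calculation helps to read these figures. The sketch below assumes the values are percentages of utterances (each row sums to roughly 100) and derives two numbers that are not on the slide: the share of utterances with a good or acceptable translation, and the same share restricted to utterances for which any translation was produced. The names `RESULTS`, `usable_share` and `usable_when_translated` are illustrative.

```python
from typing import Dict

# Figures from the chart; assumed to be percentages of utterances.
RESULTS: Dict[str, Dict[str, float]] = {
    "FIN": {"good": 60.0, "acceptable": 4.4, "bad": 0.5, "none": 35.0},
    "FRE": {"good": 75.8, "acceptable": 19.2, "bad": 0.7, "none": 4.4},
}


def usable_share(lang: str) -> float:
    """Percentage of utterances judged good or acceptable."""
    row = RESULTS[lang]
    return round(row["good"] + row["acceptable"], 1)


def usable_when_translated(lang: str) -> float:
    """Percentage of good or acceptable judgments among produced translations."""
    row = RESULTS[lang]
    produced = row["good"] + row["acceptable"] + row["bad"]
    return round(100 * (row["good"] + row["acceptable"]) / produced, 1)


for lang in RESULTS:
    print(lang, usable_share(lang), usable_when_translated(lang))
```

Under that assumption, the gap between the two language pairs comes almost entirely from the "No translation" column: when a translation is produced, it is rarely judged bad for either language.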
Evaluation (3)
- Lexical gaps
- Example:
  "Does the pain radiate to the neck?" (in-coverage sentence)
  "Is the pain in the neck?" (out-of-coverage sentence)
- Finnish allative vs. adessive case: 'kaulalle' vs. 'kaulalla'
Summary
- The MedSLT Finnish language module was developed by partly adapting existing resources
- English and Finnish grammar rules are highly similar despite the differences between the languages
- Difficulty: the rich morphology of Finnish, which can be resolved to some degree by using macros in the lexicon
- Initial evaluation of translation performance
References
- MedSLT: http://sourceforge.net/projects/medslt/ and http://www.issco.unige.ch/projects/medslt
- Regulus: https://sourceforge.net/projects/regulus/ and http://www.issco.unige.ch/projects/regulus