An interactive environment for creating and validating syntactic rules Panagiotis Bouros*, Aggeliki Fotopoulou, Nicholas Glaros Institute for Language and Speech Processing (ILSP) {pbour, afotop, nglaros}@ilsp.gr

* Current affiliation is National and Kapodistrian University of Athens, Department of Informatics and Telecommunications

Outline

• Introduction
• Motivation
• Architecture
• Working Environment
• Functionality
• Real-World scenario
• Conclusion

Introduction (1)

• The challenge of checking free human text
  - Word-by-word approach
    - Efficient for automatic checking of spelling errors
    - Prominent in languages with poor morphology
  - Phrase-by-phrase approach
    - No misspelled words but still incorrect syntax, e.g. “I listens to the music.”
    - Rule-based syntactic analysis
    - Highly inflectional languages, e.g. Greek
• Need for advanced spelling checkers
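
The contrast between the two approaches can be illustrated with a minimal sketch. The toy lexicon, its feature tuples and both function names are assumptions made for this example only; they are not part of the presented system.

```python
# Toy lexicon: word form -> (part of speech, person, number).
# The entries are illustrative and deliberately simplified.
LEXICON = {
    "i": ("pronoun", 1, "sg"),
    "she": ("pronoun", 3, "sg"),
    "listen": ("verb", 1, "sg"),
    "listens": ("verb", 3, "sg"),
    "to": ("prep", None, None),
    "the": ("article", None, None),
    "music": ("noun", 3, "sg"),
}


def tokenize(sentence):
    return sentence.lower().rstrip(".").split()


def word_by_word_errors(sentence):
    """Word-by-word check: return the words missing from the lexicon."""
    return [w for w in tokenize(sentence) if w not in LEXICON]


def agreement_errors(sentence):
    """Phrase-level check: return adjacent (pronoun, verb) pairs whose
    person or number features differ."""
    words = tokenize(sentence)
    errors = []
    for left, right in zip(words, words[1:]):
        l, r = LEXICON.get(left), LEXICON.get(right)
        if l and r and l[0] == "pronoun" and r[0] == "verb" and l[1:] != r[1:]:
            errors.append((left, right))
    return errors


sentence = "I listens to the music."
spelling = word_by_word_errors(sentence)   # every word is in the lexicon
agreement = agreement_errors(sentence)     # the pronoun and verb disagree
```

Every word of the example sentence is spelled correctly, so only the phrase-level check reports a problem.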

Introduction (2)

• Building advanced spelling checkers
  - Statistical approaches
    - N-grams
    - Smoothing techniques
  - Syntactic analysis framework
    - Morphological lexicon
    - Set of syntactic rules
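
As a rough illustration of the statistical alternative, the sketch below scores word pairs with a bigram model and add-one (Laplace) smoothing. The two-sentence corpus and the function names are assumptions for the example; the slides do not say which n-gram order or smoothing technique a real checker would use.

```python
from collections import Counter


def train_bigrams(sentences):
    """Count unigrams and bigrams, with a start-of-sentence marker."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def smoothed_prob(prev, word, unigrams, bigrams):
    """Estimate P(word | prev) with add-one smoothing, so unseen pairs
    receive a small non-zero probability."""
    vocab_size = len(unigrams)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


corpus = ["i listen to the music", "she listens to the music"]
unigrams, bigrams = train_bigrams(corpus)
p_seen = smoothed_prob("i", "listen", unigrams, bigrams)
p_unseen = smoothed_prob("i", "listens", unigrams, bigrams)
```

The unseen pair "i listens" scores lower than the attested "i listen", which is how a statistical checker could flag it without explicit syntactic rules.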

Motivation

• Focus on syntactic analysis
• Support of ILSP’s advanced spelling checker (Symfonia)
• Interactive environment:
  - User-friendly for language specialists – no need for computer programming knowledge
  - Enables user to easily create, edit, view and test syntactic rules
    - Graphical tree representation
    - XML storing mechanism – targeted speller independent
    - Ready-to-execute targeted speller code
  - Supports monitoring and validation of syntactic rules application and interaction
    - Text corpora
    - Check all or a subset of syntactic rules
    - Identification and handling of possible conflicts
    - Generation of detailed reports with rich monitoring information

Architecture (1)

• Graphical Rule Creator
• Rule Handler
• Rules Kernel
• Lexicon
• Rules Kernel Monitor

Architecture (2)

[Diagram: Main Screen, Graphical Rule Creator, Rule Handler, Rules Kernel, Rules Kernel Monitor, Lexicon and XML rule files (<rule> <description/></rule>); three arrows labeled "uses" and one labeled "report"]
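
The XML storage of a rule might look roughly like the sketch below. Only the `<rule>` and `<description>` elements appear in the diagram; the `name` attribute, the `<lexi>` elements and the function `rule_to_xml` are hypothetical and serve only to make the example concrete.

```python
import xml.etree.ElementTree as ET


def rule_to_xml(name, description, lexis):
    """Serialize a rule to an XML string.

    <rule> and <description> come from the slide; the name attribute and
    the <lexi> elements are assumptions made for this sketch.
    """
    rule = ET.Element("rule", {"name": name})
    ET.SubElement(rule, "description").text = description
    for position, category in lexis:
        ET.SubElement(rule, "lexi", {"position": position, "category": category})
    return ET.tostring(rule, encoding="unicode")


xml_text = rule_to_xml(
    "decide_pio",
    "Choose the adverb reading",
    [("Lexi1", "article"), ("LexiX", "adverb")],
)
```

Keeping the rule in a neutral XML form like this is what would let it stay independent of any particular target speller.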

Working Environment

• Rules integrated into Rules Kernel
• Rule status
• Create rule, Edit rule, Remove rule, Disable rule, Enable rule, Export rule, Export Rules Kernel
• Monitor procedure

Create rule

• Focus on LexiX
• Specify properties
• Specify rule context using tree representation
  - LexiX valid grammatical characterizations
  - Specify lexi, i.e. grammatical characteristics of a word
  - Rule result
  - Restriction in specific words
  - Inheritance of grammatical characteristics from adjacent words
  - Alternative rule environments
  - Set of lexis

Edit rule

• Similar to rule’s creation
• XML rule file parsing -> filled tree representation
• User modifications on:
  - Rule properties
  - Rule context
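
The step from a parsed XML rule file to a filled tree can be sketched as follows. The rule schema (`name`, `<lexi>`, `position`, `category`) and the nested-dict tree format are assumptions for illustration; the actual file layout and tree widget are not described in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule file content; only <rule> and <description> are
# attested in the presentation.
RULE_XML = """<rule name="decide_pio">
  <description>Choose the adverb reading</description>
  <lexi position="Lexi1" category="article"/>
  <lexi position="LexiX" category="adverb"/>
</rule>"""


def to_tree(element):
    """Convert an XML element into a nested dict that a tree view could show."""
    return {
        "label": element.tag,
        "attributes": dict(element.attrib),
        "text": (element.text or "").strip(),
        "children": [to_tree(child) for child in element],
    }


tree = to_tree(ET.fromstring(RULE_XML))

# A user modification of a rule property, applied to the in-memory tree.
tree["children"][0]["text"] = "Choose the adverb reading of LexiX"
```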

More functionalities

• Remove rule
• Disable/Enable rule
• Export rule
  - To high level programming language
  - E-mail to targeted syntactic speller programmers
• Export Rules Kernel
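
One way to picture these operations is a small in-memory rule store. The classes `Rule` and `RulesKernel` and their methods below are hypothetical names for this sketch and do not reflect the tool's real implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    name: str
    enabled: bool = True


@dataclass
class RulesKernel:
    """Hypothetical container for the rules integrated into the kernel."""
    rules: dict = field(default_factory=dict)

    def add(self, rule):
        self.rules[rule.name] = rule

    def remove(self, name):
        del self.rules[name]

    def set_enabled(self, name, enabled):
        self.rules[name].enabled = enabled

    def active(self):
        """Names of the rules that would take part in a check."""
        return [r.name for r in self.rules.values() if r.enabled]


kernel = RulesKernel()
kernel.add(Rule("decide_pio"))
kernel.add(Rule("decide_poio"))
kernel.set_enabled("decide_poio", False)  # disabled rules stay stored
active_rules = kernel.active()
```

Disabling keeps the rule in the kernel so it can be enabled again later, whereas removing discards it.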

Monitor procedure (1)

• Generation and selection of rules
  - Optimized performance of spelling check engine
  - Consistent set of rules
• Need to check one or more rules against the others
  - Identify and minimize possible conflicts and insufficiencies
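
The slides do not define what counts as a conflict. Assuming, for illustration only, that a conflict means two rules proposing different corrections for the same word, a check of rules against each other could be sketched like this (`find_conflicts` and the decision tuples are hypothetical):

```python
def find_conflicts(decisions):
    """Group rule decisions by token position and keep the positions where
    rules disagree.

    decisions: list of (rule_name, token_index, suggestion) tuples.
    Returns {token_index: [(rule_name, suggestion), ...]}.
    """
    by_token = {}
    for rule, index, suggestion in decisions:
        by_token.setdefault(index, []).append((rule, suggestion))

    conflicts = {}
    for index, proposals in by_token.items():
        if len({suggestion for _, suggestion in proposals}) > 1:
            conflicts[index] = proposals
    return conflicts


decisions = [
    ("decide_pio", 1, "πιο"),
    ("decide_poio", 1, "ποιο"),
    ("decide_pio", 5, "πιο"),
]
conflicts = find_conflicts(decisions)
```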

Monitor procedure (2)

• Two kinds of checking
  - Interactive: when a spelling error occurs, the user picks one of the automatically generated spelling suggestions
  - Automatic: the system picks the first in the list of spelling suggestions by default
• But first of all, specify
  - Input text
  - Set of syntactic rules
  - Report
  - Document with erroneous sentences
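
The difference between the two checking modes comes down to who selects the suggestion. In the sketch below, the function `choose_suggestion` and its `pick` callback (standing in for the user's choice) are assumptions made for the example.

```python
def choose_suggestion(suggestions, mode="automatic", pick=None):
    """Select one spelling suggestion.

    automatic:   take the first suggestion in the list.
    interactive: ask the pick callback, which returns the chosen index.
    """
    if not suggestions:
        return None
    if mode == "automatic":
        return suggestions[0]
    if mode == "interactive":
        return suggestions[pick(suggestions)]
    raise ValueError(f"unknown mode: {mode}")


suggestions = ["πιο", "ποιο"]
auto_choice = choose_suggestion(suggestions)
user_choice = choose_suggestion(suggestions, mode="interactive", pick=lambda s: 1)
```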

Report structure

Real-World scenario

• Solve ambiguity “πιο” (more) – “ποιο” (which)
  - Same phonetic transcription /pjo/
  - Different grammatical category, adverb – pronoun
• Two syntactic rules needed
  - Decision “πιο”
  - Decision “ποιο”

Real-World scenario

• Rule environment: Lexi1 LexiX Lexi2
  - LexiX characterized by ambiguity “πιο” – “ποιο”
  - Lexi1 article
  - Lexi2 either an adjective or a noun or an adverb
• Then LexiX adverb – “πιο”

Real-World scenario

• Rule environment: LexiX Lexi1 Lexi2 Lexi3 Lexi4 Lexi5
  - Some or all of Lexi1, Lexi2, Lexi3, Lexi4 may be missing
  - LexiX characterized by ambiguity “πιο” – “ποιο”
  - Lexi1 article
  - Lexi2 adjective
  - Lexi3 noun
  - Lexi4 particle
  - Lexi5 verb
• Then LexiX pronoun – “ποιο”
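
The two rule environments for the “πιο” – “ποιο” ambiguity can be approximated in code. This is a sketch under several assumptions: tokens arrive already tagged with a grammatical category, the adverb rule is tried first, and the function name `decide` and the tag strings are invented for the example. The real rules are built graphically and stored as XML.

```python
AMBIGUOUS = {"πιο", "ποιο"}
OPTIONAL_SLOTS = ["article", "adjective", "noun", "particle"]  # Lexi1..Lexi4


def decide(tokens, i):
    """Return "πιο", "ποιο", or None if neither rule applies.

    tokens: list of (word, category) pairs; i: index of LexiX.
    """
    if tokens[i][0] not in AMBIGUOUS:
        return None

    # Environment Lexi1 LexiX Lexi2: article, then adjective/noun/adverb.
    if 0 < i < len(tokens) - 1:
        before, after = tokens[i - 1][1], tokens[i + 1][1]
        if before == "article" and after in {"adjective", "noun", "adverb"}:
            return "πιο"  # LexiX is an adverb

    # Environment LexiX Lexi1..Lexi5: optional slots in order, then a verb.
    j = i + 1
    for slot in OPTIONAL_SLOTS:
        if j < len(tokens) and tokens[j][1] == slot:
            j += 1
    if j < len(tokens) and tokens[j][1] == "verb":
        return "ποιο"  # LexiX is a pronoun
    return None


# "το ποιο όμορφο σπίτι" should read "το πιο όμορφο σπίτι" (the most beautiful house).
adverb_case = [("το", "article"), ("ποιο", "ambiguous"),
               ("όμορφο", "adjective"), ("σπίτι", "noun")]
# "πιο βιβλίο θα διαβάσεις" should read "ποιο βιβλίο θα διαβάσεις" (which book will you read).
pronoun_case = [("πιο", "ambiguous"), ("βιβλίο", "noun"),
                ("θα", "particle"), ("διαβάσεις", "verb")]

first = decide(adverb_case, 1)
second = decide(pronoun_case, 0)
```

Each optional slot is matched at most once and in the stated order, which mirrors the note that some or all of Lexi1 to Lexi4 may be missing.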

Conclusion

• Focus on syntactic analysis
• Support of ILSP’s advanced spelling checker (Symfonia)
• Interactive user-friendly environment for
  - Fast generation of syntactic rules
    - Create, edit, view
  - Real time monitoring and validation of their application in existing text corpora

Questions (?)

Backup slides

Specify rule properties

Rule Graphical Tree Representation

Specify lexi’s grammatical characteristics

Specify LexiX correspondence

Monitor settings

Interactive checking procedure

Rule Graphical Tree Representation (Real-World scenario)