Arabic Syntactic Trees

Zdeněk Žabokrtský, Otakar Smrž

Center for Computational Linguistics, Faculty of Mathematics and Physics, Charles University in Prague

from Constituency to Dependency

April 15, 2003 Arabic Syntactic Trees: from Constituency to Dependency 2

Motivation & Background

- Linguistic Data Consortium Arabic Treebank
  - Constituent-syntax bracketing
  - ~100k words published
  - Modification from English to Arabic
- Prague Arabic Dependency Treebank
  - Dependency approach to syntax
  - ~50k words in progress
  - Pre-step to tectogrammatical description
- Motivation: co-operation and resource exchange
- Our goal: transform the data from one annotation scheme to the other

Constituency vs. Dependency

Constituency (Linguistic Data Consortium, University of Pennsylvania)
- Non-terminal nodes + text tokens
- Constituent labeling on non-terminals
- Slots and traces

Dependency (CCL & IFAL & ICL, Charles University in Prague)
- Sentence root node + text tokens
- Analytical function for every tree node
- Government and roles

Model Arabic Phrase I

- Trace of the antecedent subject
- Compound function of the head of the clause: outer and inner perspectives
- Free word-order compliant

Outline of the Transformation

1. Build temporary dependency tree
   - Contraction of the input phrase-structure tree
   - Uniquely determined by the head selection function
   - Implementation: simple recursive procedure

2. Create analytical tree topology
   - Post-processing (corrections) of the temporary dependency tree, e.g., substituting traces with their coindexed fillers
   - Re-arrangement of special complex constructs

3. Assign analytical functions
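
Step 1 can be illustrated with a minimal Python sketch of the recursive contraction. This is not the authors' implementation: the `Node` class, the `toy_select_head` function, and the English toy phrase are all assumptions made for illustration, and traces are ignored. Each constituent is collapsed onto the lexical head of its head child, and the heads of the remaining children become its dependents.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    label: str                       # phrase label for non-terminals, POS tag for tokens
    word: Optional[str] = None       # filled in only for text tokens
    children: List["Node"] = field(default_factory=list)


def toy_select_head(node: Node) -> int:
    """Hypothetical head selection: a PREP child wins, otherwise the rightmost child."""
    for i, child in enumerate(node.children):
        if child.label == "PREP":
            return i
    return len(node.children) - 1


def contract(
    node: Node,
    select_head: Callable[[Node], int],
    edges: List[Tuple[Optional[str], Optional[str]]],
) -> Node:
    """Return the lexical head of `node`, recording (dependent, governor) pairs in `edges`."""
    if not node.children:            # a text token is its own head
        return node
    heads = [contract(child, select_head, edges) for child in node.children]
    governor = heads[select_head(node)]
    for head in heads:
        if head is not governor:     # every non-head child hangs under the governor
            edges.append((head.word, governor.word))
    return governor


# Toy phrase-structure tree: (PP (PREP on) (NP (DET the) (NOUN bus)))
tree = Node("PP", children=[
    Node("PREP", word="on"),
    Node("NP", children=[Node("DET", word="the"), Node("NOUN", word="bus")]),
])
edges: List[Tuple[Optional[str], Optional[str]]] = []
root = contract(tree, toy_select_head, edges)
print(root.word, edges)
```

Because the head choice is the only decision made at each constituent, the resulting temporary tree is fully determined by the head selection function, as the outline states.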

Head Selection Function

- For each constituent, select the head constituent among its children
- Based on (ordered) handcrafted rules
- Examples:
  - If there is a node with tag=PREP among the children, then it is the head
  - If there is a node with phrase_label=VP among the children, then it is the head
  - ... etc ...
- If nothing was selected by the rules, then the rightmost child is selected
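
The ordered rules and the rightmost-child fallback might be sketched as follows. Only the two example rules from the slide are included; the dictionary representation of children, the name `select_head`, and the choice of the leftmost match when several children satisfy one rule are assumptions, since the slides do not specify them.

```python
from typing import Callable, Dict, List

Child = Dict[str, str]  # hypothetical representation, e.g. {"tag": "PREP"} or {"phrase_label": "VP"}

# Ordered rules: an earlier rule takes priority over a later one.
HEAD_RULES: List[Callable[[Child], bool]] = [
    lambda child: child.get("tag") == "PREP",
    lambda child: child.get("phrase_label") == "VP",
]


def select_head(children: List[Child]) -> int:
    """Return the index of the head child of a non-empty constituent."""
    for rule in HEAD_RULES:
        for i, child in enumerate(children):
            if rule(child):
                return i              # assumption: leftmost matching child wins
    return len(children) - 1          # fallback from the slide: rightmost child


pp_head = select_head([{"tag": "PREP"}, {"phrase_label": "NP"}])
s_head = select_head([{"phrase_label": "VP"}, {"phrase_label": "NP-SBJ"}])
np_head = select_head([{"tag": "DET"}, {"tag": "ADJ"}, {"tag": "NOUN"}])
print(pp_head, s_head, np_head)
```

Keeping the rules in a list makes their priority explicit: reordering the list changes which child wins when more than one rule could apply.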

Analytical Function Assignment

- Based on (ordered) handcrafted rules and lexical lists
- Completes the process, does not override previous assignments
- Examples:
  - phrase_label=NP-SBJ -> afun=Sb
  - lemma=wa- -> afun=Coord
  - pos_tag=CONJ -> afun=AuxC
  - ... etc ...
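
A minimal sketch of how such an ordered rule table could be applied is given below. The three rules are the examples from the slide; the dictionary node representation, the function name `assign_afuns`, and the preset `Pred` value in the toy data are assumptions, and lexical lists are left out.

```python
from typing import Dict, List, Tuple

# (attribute, required value, analytical function), checked in this order.
AFUN_RULES: List[Tuple[str, str, str]] = [
    ("phrase_label", "NP-SBJ", "Sb"),
    ("lemma", "wa-", "Coord"),
    ("pos_tag", "CONJ", "AuxC"),
]


def assign_afuns(nodes: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Fill in missing afuns in place; existing assignments are never overridden."""
    for node in nodes:
        if node.get("afun"):
            continue                  # assigned in an earlier step, keep it
        for attribute, value, afun in AFUN_RULES:
            if node.get(attribute) == value:
                node["afun"] = afun
                break                 # first matching rule wins
    return nodes


nodes = assign_afuns([
    {"phrase_label": "NP-SBJ"},
    {"lemma": "wa-", "pos_tag": "CONJ"},   # matches two rules, the earlier one applies
    {"pos_tag": "CONJ"},
    {"pos_tag": "CONJ", "afun": "Pred"},   # hypothetical earlier assignment
    {"pos_tag": "NOUN"},                   # no rule applies, stays unlabeled
])
afuns = [node.get("afun") for node in nodes]
print(afuns)
```

The second toy node shows why the ordering matters: a conjunction with lemma wa- is labeled Coord before the more general CONJ rule can label it AuxC.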

Model Arabic Phrase II

- Sister-like co-ordination
- Conjunction of co-ordination
- Status constructus

Model Arabic Phrase III

- Non-expressed subject (?)
- Complex modality constructs
- Principal discrepancies between descriptions, both in topology and labeling

Model Arabic Sentence

Wa lam yakun mina ’s-sahli `alay hi muwāğahatu kāmīrāti ’t-tilfizyūni wa `adasāti ’l-muşawwirīna wa huwa yaş`adu ’l-bāşa.

It was not easy for him to face the television cameras and the lenses of photographers as he was getting on the bus.

Constituency Annotation

Dependency Annotation

Evaluation & Conclusion

- Implementation still in progress, fine-tuning needed
- 10,000 words manually annotated in both styles
- ~60% of dependencies correctly aimed
- 2nd Prague Penn Arabic Treebanking Workshop, May 2003 in Prague
- Transfer from dependency to constituency?
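
The slides do not define how the share of correctly aimed dependencies was computed. One plausible reading, shown here only as an assumption, is the fraction of tokens whose governor in the transformed tree matches the governor in the manual dependency annotation. The function name and the toy governor indices below are made up for illustration and are not the treebank data.

```python
from typing import List


def attachment_accuracy(gold_heads: List[int], predicted_heads: List[int]) -> float:
    """Fraction of tokens whose predicted governor index equals the manually annotated one."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("both annotations must cover the same tokens")
    if not gold_heads:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_heads, predicted_heads))
    return correct / len(gold_heads)


# Toy example: governor index per token, 0 standing for the sentence root node.
gold = [0, 1, 1, 3, 1]
predicted = [0, 1, 3, 3, 2]
score = attachment_accuracy(gold, predicted)   # 3 of 5 governors agree
print(f"{score:.0%}")
```

This measure looks only at tree topology; agreement of analytical functions would have to be counted separately.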

Related Work

- New tool for assignment of analytical functions
  - Based on machine learning (C5-trained decision trees)
  - Error rate 17% (supposing the topology of the tree is correct)
- First experiments with Arabic dependency parser
- Incorporated into the process of annotation of Prague Arabic Dependency Treebank