Evaluating an ‘off-the-shelf’
POS-tagger on Early Modern German text
Silke Scheible, Richard Jason Whitt, Martin Durrell, and Paul Bennett
The GerManC project
School of Languages, Linguistics, and Cultures
University of Manchester (UK)
Overview
• Motivation
• The GerManC corpus
• POS-tagger and tagset
• Challenges
• Results
Motivation
• Goal:
  – POS-tagged version of GerManC corpus
• Problems:
  – No specialised tagger available for Early Modern German (EMG)
  – Limited funds: manual annotation not feasible for whole corpus
• Question:
  – How well does an ‘off-the-shelf’ tagger for modern German perform on Early Modern German data?
Motivation
• Tagger evaluation requires gold standard data
• Idea:
  – Develop gold-standard subcorpus of GerManC
  – Use subcorpus to test and adapt modern NLP tools
  – Create historical text processing pipeline
• Results useful for other small humanities-based projects wishing to add POS annotations to EMG data
The GerManC corpus
• Purpose: Studies of development and standardisation of German language
• Texts published between 1650 and 1800
• Sample corpus (2,000 words per text)
• Total corpus size: ca. 1 million words
• Aims to be “representative”
• Eight genres
  – Orally-oriented: Dramas, Newspapers, Letters, Sermons
  – Print-oriented: Narrative prose, Humanities texts, Science & medicine texts, Legal texts
• Three periods
  – 1650-1700
  – 1700-1750
  – 1750-1800
• Five regions
  – North German
  – West Central German
  – East Central German
  – West Upper German
  – East Upper German
• Three 2,000-word files per genre/period/region
• Total size: ca. 1 million words
Gold-standard subcorpus: GerManC-GS
• One 2,000-word file per genre and period from North German region → 24 files
• > 50,000 tokens
• Annotated by two historical linguists
• Gold standard POS tags, lemmas, and normalised word forms
POS-tagger
• TreeTagger (Schmid, 1994)
• Statistical, decision tree-based POS tagger
• Parameter file for modern German supplied with the tagger
• Trained on German newspaper corpus
• STTS tagset
STTS-EMG
1. PIAT (merged with PIDAT): Indefinite determiner, as in ‘viele solche Bemerkungen’ (‘many such remarks’)
2. NA: Adjectives used as nouns, as in ‘der Gesandte’ (‘the ambassador’)
3. PAVREL: Pronominal adverb used as relative, as in ‘die Puppe, damit sie spielt’ (‘the doll with which she plays’)
4. PTKREL: Indeclinable relative particle, as in ‘die Fälle, so aus Schwachheit entstehen’ (‘the cases which arise from weakness’)
5. PWAVREL: Interrogative adverb used as relative, as in ‘der Zaun, worüber sie springt’ (‘the fence over which she jumps’)
6. PWREL: Interrogative pronoun used as relative, as in ‘etwas, was er sieht’ (‘something which he sees’)
POS-tagging in GerManC-GS
• New categories account for 2% of all tokens
• Inter-annotator agreement (IAA) on POS-tagging task: 91.6%
Challenges: Tokenisation issues
• Clitics:
  – hastu → has|tu: hast du (‘have you’)
  – wirstu → wirs|tu: wirst du (‘will you’)
• Multi-word tokens:
  – obgleich/KOUS vs. ob/KOUS gleich/ADV (‘even though’)
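The clitic examples above can be pictured as a token-splitting step that runs before tagging. The following is only a minimal sketch: the regular expression and the function name `split_clitic` are assumptions made for illustration, not the GerManC tokeniser, and a pattern this simple would also split any other word that happens to end in "-stu".

```python
import re

# Hypothetical heuristic (not the project's rule set): a form ending in
# "-stu" is treated as a verb followed by the enclitic pronoun "tu" (= "du").
CLITIC_TU = re.compile(r"^(\w+s)(tu)$", re.IGNORECASE)


def split_clitic(token):
    """Split a token such as 'hastu' into ['has', 'tu']; leave others intact."""
    match = CLITIC_TU.match(token)
    if match is None:
        return [token]
    return [match.group(1), match.group(2)]


for word in ["hastu", "wirstu", "kommt"]:
    print(word, "->", "|".join(split_clitic(word)))
```

The split forms mirror the has|tu and wirs|tu segmentation shown above; mapping them on to the modern forms hast and du would be a separate normalisation step.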
Challenges: Spelling variation
• Spelling not standardised:
  – Comet → Komet
  – auff → auf
  – nachdeme → nachdem
  – ko�mpt → kommt
  – Bothenbrodt → Botenbrot
  – differiret → differiert
  – beßer → besser
  – kehme → käme
  – trucken → trockenen
  – gepressett → gepreßt
  – büxen → Büchsen
• All spelling variants in GerManC-GS normalised to a modern standard
  → Assess what effect spelling variation has on the performance of automatic tools
  → Help improve automated processing?
• Important for:
  – Automatic tools (POS tagger!)
  – Accurate corpus search
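One way to picture the normalisation layer is as a lookup from original to modernised word forms, applied before the text is handed to the tagger. The sketch below is an assumption for illustration only: in GerManC-GS the normalised forms were added by the annotators, and the table here contains just some of the example pairs from the spelling-variation list. The names `NORMALISATION` and `normalise` are hypothetical.

```python
# Original -> normalised pairs taken from the spelling-variation examples.
# A real normalisation layer would need far more entries (or rules).
NORMALISATION = {
    "Comet": "Komet",
    "auff": "auf",
    "nachdeme": "nachdem",
    "Bothenbrodt": "Botenbrot",
    "differiret": "differiert",
    "beßer": "besser",
    "kehme": "käme",
    "büxen": "Büchsen",
}


def normalise(tokens, table=NORMALISATION):
    """Return the normalised tokens and a flag per token marking a change."""
    normalised = [table.get(token, token) for token in tokens]
    changed = [new != old for new, old in zip(normalised, tokens)]
    return normalised, changed


# Illustrative token sequence, not a sentence from the corpus.
tokens = ["nachdeme", "der", "Comet", "auff", "kehme"]
normalised, changed = normalise(tokens)
print(normalised)
print("proportion normalised:", sum(changed) / len(tokens))
```

The per-token flags give the proportion of normalised word tokens for a text, which is the kind of quantity that can be tracked across publication dates.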
[Figure: Proportion of normalised word tokens plotted against time]
Questions
• What is the “off-the-shelf” performance of the TreeTagger on historical data from the EMG period?
• Can the results be improved by running the tool on normalised data?
Results
TreeTagger accuracy on original vs. normalised input:

            Original data   Normalised data
Accuracy    69.6%           79.7%
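Accuracy figures of this kind are the share of tokens whose predicted tag matches the gold-standard tag. A minimal sketch of that computation follows; the tag sequences are invented toy data using STTS tag names, not GerManC-GS output, and the function name `accuracy` is an assumption.

```python
def accuracy(gold_tags, predicted_tags):
    """Proportion of tokens whose predicted tag equals the gold tag."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted sequences differ in length")
    if not gold_tags:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)


# Toy data (illustrative only): the same five tokens tagged twice,
# once from the original spelling and once from the normalised spelling.
gold = ["ART", "NN", "VVFIN", "APPR", "NN"]
tagged_original = ["ART", "NE", "VVFIN", "ADV", "NN"]
tagged_normalised = ["ART", "NN", "VVFIN", "ADV", "NN"]

acc_original = accuracy(gold, tagged_original)
acc_normalised = accuracy(gold, tagged_normalised)
gain_points = (acc_normalised - acc_original) * 100

print(f"original: {acc_original:.1%}, normalised: {acc_normalised:.1%}")
print(f"gain: {gain_points:.1f} percentage points")
```

Applied to the reported figures, the same subtraction gives 79.7 − 69.6 = 10.1 percentage points.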
Improvement through normalisation over time
[Figure: Tagger performance plotted against publication date]
Effects of spelling normalisation on POS tagger performance
[Figure: For normalised tokens: effect of using original (O)/normalised (N) input on tagger accuracy. +: correctly tagged; -: incorrectly tagged]
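The O/N breakdown can be read as a four-way count over those tokens that were normalised: tagged correctly either way (O+/N+), fixed by normalisation (O-/N+), broken by it (O+/N-), or wrong either way (O-/N-). The sketch below shows one way to compute such counts. The function name and the toy tag sequences are assumptions for illustration and do not reproduce the project's figures.

```python
from collections import Counter


def normalisation_effect(gold, tagged_original, tagged_normalised, was_normalised):
    """Count O+/O- versus N+/N- outcomes, restricted to normalised tokens."""
    counts = Counter()
    rows = zip(gold, tagged_original, tagged_normalised, was_normalised)
    for gold_tag, orig_tag, norm_tag, changed in rows:
        if not changed:
            continue  # tokens left unchanged cannot show a normalisation effect
        key = ("O+" if orig_tag == gold_tag else "O-") + "/" + (
            "N+" if norm_tag == gold_tag else "N-"
        )
        counts[key] += 1
    return counts


# Toy data (illustrative only); the last token was not normalised.
gold = ["NN", "APPR", "VVFIN", "ADJD"]
tagged_original = ["NE", "APPR", "NN", "ADJD"]
tagged_normalised = ["NN", "APPR", "NN", "ADV"]
was_normalised = [True, True, True, False]

effect = normalisation_effect(gold, tagged_original, tagged_normalised, was_normalised)
for outcome in ("O+/N+", "O-/N+", "O+/N-", "O-/N-"):
    print(outcome, effect[outcome])
```

Only the O-/N+ cell represents normalisations that actually help the tagger, which is why the size of that cell relative to the others matters for judging the normalisation layer.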
Comparison with “modern” results
• Performance of TreeTagger on modern data: ca. 97% (Schmid, 1995)
• Current results seem low
• But:
  – Modern accuracy figure: evaluation of tagger on the text type it was developed on (newspaper text)
  – IAA higher for modern German (98.6%)
Conclusion
• Substantial amount of manual post-editing required
• Normalisation layer can improve results by about 10 percentage points (69.6% → 79.7%), but so far only half of all annotations have a positive effect
Future work
• Adapt normalisation scheme to account for more cases
• Automate normalisation (Jurish, 2010)
• Retrain state-of-the-art POS taggers → Evaluation?
• Provide detailed information about annotation quality to research community
Thank you!
Martin.Durrell@manchester.ac.uk
Paul.Bennett@manchester.ac.uk
Silke.Scheible@manchester.ac.uk
Richard.Whitt@manchester.ac.uk
http://tinyurl.com/germanc