A Comparison of NER Tools w.r.t. a Domain-Specific Vocabulary

Timm Heuss (1,2), Bernhard Humm (2), Christian Henninger (2), Thomas Rippl (2)
(1) University of Plymouth, UK
(2) University of Applied Sciences Darmstadt
Agenda
1. Motivation
2. Evaluation Procedure
3. NER Tools
4. Evaluation Results
5. Conclusions
Motivation
About Our Research Project „Innovative Wissensvermittlung" (Innovative Knowledge Transfer)
• We are researching new and enhanced ways of searching and displaying the online collections of galleries, libraries, archives and museums (GLAMs)
• Primary Use Cases:
  § Museum: Städel Museum
  § Library: University and State Library Darmstadt
• The project setup allows us to do research in realistic environments
[Diagram: an exhibit description is linked via an authority file into the LOD Cloud. Example text: „Tiere als Bildmotive hat den im Ersten Weltkrieg vor Verdun gefallenen Marc, dem Mitbegründer der Künstlervereinigung ‚Blauer Reiter' ..." (Animals as pictorial motifs [occupied] Marc, co-founder of the artists' group „Blauer Reiter", who fell near Verdun in the First World War ...)]
Motivation
• Demanded: automatic or semi-automatic Named Entity Recognition (NER) solutions
  § Use Case Museum (Customer View): speeding up a high-effort, manual task
  § Use Case Mediaplatform (Provider View): interconnecting content into the LOD Cloud
Bridging the Semantic Gap
[Diagram: terms from the German exhibit text (Tiere, Marc, Verdun) are mapped onto vocabulary concepts such as Living being, Animal, Dog, Cat]
Motivation
• Gangemi [1] compared NER tools using common texts and a general-purpose vocabulary
• Hooland [2] compared NER tools using domain-specific texts and a general-purpose vocabulary
• This work compares NER tools using domain-specific texts and a domain-specific vocabulary
Position To Related Work

                                Common Vocabulary   Domain-Specific Vocabulary
  Common Input Texts            Gangemi [1]         –
  Domain-Specific Input Texts   Hooland [2]         This Work
Motivation
Which tool approach is suitable for a domain-specific scenario?
Three Classes of Approaches
[Diagram: three tool classes ordered by increasing implementation & integration effort and increasing control on results: As-a-Service Tools → Locally Installed Platforms → Custom NLP Pipelines]
Evaluation Procedure
The Principle Process
[Diagram: an unstructured input text plus an authority file are fed as input into the NER pipeline, which outputs keywords]
Evaluation under realistic conditions:
• German input texts only
• Different writing styles
• Ancient, domain-specific use of language in the input texts
• Domain-specific, fine-grained vocabulary
• „IconClass as Linked Open Data" is not connected to the LOD cloud
• IconClass has syntactic and semantic issues
• IconClass concepts are hard to find, so we start with IconClass keywords
Evaluation Procedure
The Process In This Scenario / Challenges
[Diagram: a German description from Europeana [3] is fed into the NER pipeline together with IconClass as the authority file; the output is a set of IconClass keywords. Example description: „Maria in Vorderansicht, in ganzer Figur, in rothem Kleide und dunkelblauem Mantel, auf einem Throne ..." (Mary in frontal view, full figure, in a red dress and dark blue cloak, on a throne ...)]
Source: http://www.europeana.eu/portal/record/08501/A609BC38097F41D26C51AF0C411769F36A976118.html
Evaluation Procedure
Anatomy of IconClass
• Notations carry preferred labels, e.g. 11D3281 = „Christ as Woman"
• The hierarchy is encoded in the notation itself: 11D3281 has the broader concepts 11D328 → 11D32 → 11D3 → 11D; conversely, narrower concepts extend the notation
• Concepts are indexed with subjects such as Woman, Religion, Beard, Adult
• Keys append bracketed modifiers to a notation, e.g. 11D3281(+5)
• Doubling a letter modifies the concept: 11DD3281 (from 11D3281, „Christ as Woman") denotes „Christ Beardless"
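Because the hierarchy is encoded purely in the notation string, broader concepts can be derived without any lookup. A minimal sketch of this (our illustration, not part of the original slides; the strip-one-character rule follows the example above and ignores edge cases of the full IconClass system):

```python
import re

def broader_chain(notation: str) -> list[str]:
    """Derive broader IconClass notations from the prefix structure,
    e.g. 11D3281 -> [11D328, 11D32, 11D3, 11D]."""
    # Remove a trailing key such as "(+5)" before walking up the tree.
    base = re.sub(r"\(\+\d+\)$", "", notation)
    chain = []
    while len(base) > 3:  # stop at the top-level notation of this example, "11D"
        base = base[:-1]
        chain.append(base)
    return chain

print(broader_chain("11D3281(+5)"))  # ['11D328', '11D32', '11D3', '11D']
```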
Evaluation Procedure
Evaluation In Three Steps
1. Prepare the input text
   • Produce a reference translation (for the tools that do not support German)
2. Invoke the NER tool
   • Fire a tool-specific request (usually REST)
   • Parse the tool-specific results
3. Compare with the gold standard
   • Map results to the given vocabulary
   • Count true positives, false positives and false negatives
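As a rough illustration of steps 2 and 3, a minimal evaluation harness might look as follows. This is our sketch only: the endpoint, payload and response shape are hypothetical placeholders, since every tool in the comparison has its own API.

```python
import requests

def evaluate(tool_url: str, text: str, gold_keywords: set[str]):
    """Invoke a (hypothetical) REST-based NER tool and score its
    output against the gold-standard keywords."""
    # Step 2: fire the request and parse the tool-specific result.
    resp = requests.post(tool_url, json={"text": text}, timeout=30)
    resp.raise_for_status()
    found = {e.lower() for e in resp.json().get("entities", [])}

    # Step 3: compare with the gold standard.
    gold = {k.lower() for k in gold_keywords}
    tp = len(found & gold)   # true positives
    fp = len(found - gold)   # false positives
    fn = len(gold - found)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```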
NER Tools
Tool Selection, Based On Related Work
• We took the best NER performers (>48% F1) from Gangemi [1]
• ... and added a UIMA-based approach, the „Customized Pipeline", representing a third class of tools

  As-a-Service   Local Platforms   NLP-Pipeline
  AlchemyAPI     AIDA              UIMA-based (Customized Pipeline)
  CiceroLite     FOX               GATE-based
  FRED           Stanbol
  NERD
  Open Calais
  Wikimeta
  Zemanta
Evaluation Results
Average Scores (English)
• Recall (quantity): the higher, the more relevant items have been retrieved
• Precision (quality): the higher, the more of the retrieved items are relevant
• (Balanced) F1 score: the harmonic mean of precision and recall

[Bar chart, English input, 0–100% scale:
             F1     Recall
  Average    27%    23%
  Best       61%    78%
  Worst       3%     2%
 A threshold marks the point beyond which results are usable.]
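For reference, these scores follow the standard definitions, with TP, FP and FN counted as in step 3 of the evaluation procedure: P = TP / (TP + FP), R = TP / (TP + FN), and F1 = 2·P·R / (P + R).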
Evaluation Results
F1-Impacts of Domain-Specification

                                Common Vocabulary                 Domain-Specific Vocabulary
  Common Input Texts            Gangemi [1]: ø F1 = 68%, σ = 9%
  Domain-Specific Input Texts   Hooland [2]: ø F1 = 49%, σ = 11%  This Work: ø F1 = 26%, σ = 18%

Moving from common to domain-specific input texts costs about 19 percentage points of F1 (Gangemi → Hooland); additionally moving to a domain-specific vocabulary costs another 23 points (Hooland → This Work).
Evaluation Results
Recall-Impacts of Domain-Specification

                                Common Vocabulary                Domain-Specific Vocabulary
  Common Input Texts            Gangemi [1]: ø R = 62%, σ = 14%
  Domain-Specific Input Texts   Hooland [2]: ø R = 37%, σ = 10%  This Work: ø R = 19%, σ = 17%

The same two steps cost about 25 percentage points of recall (Gangemi → Hooland) and another 18 points (Hooland → This Work).
Evaluation Results
Individual Tool Performances, English
[Bar chart, 0–100%: recall and F1 per tool for English input, covering Stanbol (default), Open Calais, Zemanta, Wikimeta, AIDA, FOX, FRED, AlchemyAPI, CiceroLite, NERD, Stanbol (IconClass) and the Customized Pipeline; a threshold marks the point beyond which results are usable]
Evaluation Results
Individual Tool Performances, German
[Bar chart, 0–100%: recall and F1 per tool for German input, covering AIDA, FOX, FRED, Open Calais, Wikimeta, Zemanta, Stanbol (default), CiceroLite, NERD, AlchemyAPI, Stanbol (IconClass) and the Customized Pipeline; a threshold marks the point beyond which results are usable]
Evaluation Results
Individual Tools, Multilanguage Comparison
[Bar chart, 0–100%: F1 with English vs. German input for Stanbol (default), CiceroLite, NERD, AlchemyAPI, Stanbol (IconClass) and the Customized Pipeline]
Evaluation Results
Involved Tool Classes
[Same bar chart as before, annotated: the best three performers (NERD, Stanbol (IconClass), Customized Pipeline) represent all three involved tool classes: As-a-Service, Locally Installed Platform, and NLP Pipeline]
Evaluation Results
Possibility of Loading Custom Vocabularies
[Same bar chart, annotated: the best two performers are the tools capable of loading custom vocabularies, either configured (Stanbol (IconClass)) or programmed (Customized Pipeline). A second chart relates the scores to the integration effort involved (low / medium / high)]
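To make „loading a custom vocabulary" concrete: in the configured case, the essential ingredient is a lookup structure from vocabulary labels to concepts. A minimal sketch, assuming a local RDF/SKOS dump of the vocabulary named iconclass.ttl (a hypothetical file name; this is not the actual Stanbol configuration mechanism):

```python
from rdflib import Graph
from rdflib.namespace import SKOS

# Load a (hypothetical) local SKOS dump of the vocabulary.
g = Graph()
g.parse("iconclass.ttl", format="turtle")

# Index concepts by their preferred labels for entity lookup.
label_to_concept = {
    str(label).lower(): concept
    for concept, label in g.subject_objects(SKOS.prefLabel)
}

def lookup(entity: str):
    """Map a recognized entity string to a vocabulary concept, if any."""
    return label_to_concept.get(entity.lower())
```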
Evaluation Results
Distribution of Individual Results, Best Three Performers
[Venn diagram, where 100% = all keywords in the gold standard: Stanbol (IconClass) finds 46% (medium effort), the Customized Pipeline 51% (medium effort), and NERD 44% (low effort); the partial overlaps between the three result sets show that the tools find partly different keywords]
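Since each of the three tools covers only part of the gold standard, their outputs can be combined. A trivial union sketch, illustrating the complementarity observation (our illustration, not the project's actual combination logic):

```python
def combine(results_by_tool: dict[str, set[str]]) -> set[str]:
    """Union of the keyword sets returned by several NER tools; with
    partly disjoint result sets this covers more of the gold standard
    than any single tool."""
    combined: set[str] = set()
    for keywords in results_by_tool.values():
        combined |= keywords
    return combined
```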
Conclusions
Lessons Learned
• NER tools in domain-specific scenarios perform significantly worse than in common scenarios
• Positive key factors:
  § Use of English input texts (even machine translations still work better than native German)
  § The given vocabulary can be loaded (programmed or configured) into the NER tool
  § Different tool approaches seem to be complementary
Conclusions
Future Work
• We will pursue a combinatory approach of custom NLP components, Stanbol and NERD, wired together with UIMA
• There is still work to do towards finding proper IconClass concepts:
  § Exploiting NLP metadata
  § Knowledge engineering
Some rights reserved. This work is published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. Commercial distribution of this work requires prior written permission of the author. Non-commercial distribution is permitted. Derived work is permitted with some limitations. See http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode for the full license statement.
Thanks for your attention!
Do you have any questions?
Timm.Heuss@h-da.de

This project (HA project no. 322/12-12) is funded in the framework of Hessen ModellProjekte, financed with funds of LOEWE – Landes-Offensive zur Entwicklung Wissenschaftlich-ökonomischer Exzellenz, Förderlinie 3: KMU-Verbundvorhaben (State Offensive for the Development of Scientific and Economic Excellence).
Sources
[1] A. Gangemi. A Comparison of Knowledge Extraction Tools for the Semantic Web. In P. Cimiano, O. Corcho, V. Presutti, L. Hollink, and S. Rudolph, editors, The Semantic Web: Semantics and Big Data, number 7882 in Lecture Notes in Computer Science, pages 351–366. Springer Berlin Heidelberg, Jan. 2013.
[2] S. van Hooland, M. De Wilde, R. Verborgh, T. Steiner, and R. Van de Walle. Exploring entity recognition and disambiguation for cultural heritage collections. Literary and Linguistic Computing, page fqt067, Nov. 2013.
[3] Europeana: http://www.europeana.eu/portal/record/08501/A609BC38097F41D26C51AF0C411769F36A976118.html