A Comparison of NER Tools w.r.t. a Domain-Specific Vocabulary

Timm Heuss (1,2), Bernhard Humm (2), Christian Henninger (2), Thomas Rippl (2)
(1) University of Plymouth, UK
(2) University of Applied Sciences Darmstadt
Agenda
1. Motivation
2. Evaluation Procedure
3. NER Tools
4. Evaluation Results
5. Conclusions
Motivation
About Our Research Project „Innovative Wissensvermittlung" (Innovative Knowledge Transfer)
• We are researching new and enhanced ways of searching and displaying the online collections of galleries, libraries, archives and museums (GLAMs)
• Primary Use Cases:
  § Museum: Städel Museum
  § Library: University and State Library Darmstadt
• The project setup allows us to do research in realistic environments
[Diagram: an exhibit description is linked via an authority file into the LOD Cloud. Example text: „Tiere als Bildmotive hat den im Ersten Weltkrieg vor Verdun gefallenen Marc, dem Mitbegründer der Künstlervereinigung ‚Blauer Reiter' ..." (Animals as pictorial motifs [occupied] Marc, co-founder of the artists' group „Blauer Reiter", who fell near Verdun in the First World War ...)]
Motivation
• Demanded: automatic or semi-automatic Named Entity Recognition (NER) solutions
  § Use Case Museum (Customer View): speeding up a high-effort, manual task
  § Use Case Mediaplatform (Provider View): interconnecting content into the LOD Cloud
Bridging the Semantic Gap
[Diagram: terms from the German exhibit text (Tiere, Marc, Verdun) are mapped onto vocabulary concepts such as Living being, Animal, Dog, Cat]
Motivation
• Gangemi [1] compared NER tools using common texts and a general-purpose vocabulary
• Hooland [2] compared NER tools using domain-specific texts and a general-purpose vocabulary
• This work compares NER tools using domain-specific texts and a domain-specific vocabulary
Position To Related Work

                                Common Vocabulary   Domain-Specific Vocabulary
  Common Input Texts            Gangemi [1]         –
  Domain-Specific Input Texts   Hooland [2]         This Work
Motivation
Which tool approach is suitable for a domain-specific scenario?
Three Classes of Approaches
[Diagram: three tool classes ordered by increasing implementation & integration effort and increasing control on results: As-a-Service Tools → Locally Installed Platforms → Custom NLP Pipelines]
Evaluation Procedure
The Principle Process
[Diagram: an unstructured input text plus an authority file are fed as input into the NER pipeline, which outputs keywords]
Evaluation under realistic conditions:
• German input texts only
• Different writing styles
• Ancient, domain-specific use of language in the input texts
• Domain-specific, fine-grained vocabulary
• „IconClass as Linked Open Data" is not connected to the LOD cloud
• IconClass has syntactic and semantic issues
• IconClass concepts are hard to find, so we start with IconClass keywords
Evaluation Procedure
The Process In This Scenario / Challenges
[Diagram: a German description from Europeana [3] is fed into the NER pipeline together with IconClass as the authority file; the output is a set of IconClass keywords. Example description: „Maria in Vorderansicht, in ganzer Figur, in rothem Kleide und dunkelblauem Mantel, auf einem Throne ..." (Mary in frontal view, full figure, in a red dress and dark blue cloak, on a throne ...)]
Source: http://www.europeana.eu/portal/record/08501/A609BC38097F41D26C51AF0C411769F36A976118.html
Evaluation Procedure
Anatomy of IconClass
• Notations carry preferred labels, e.g. 11D3281 = „Christ as Woman"
• The hierarchy is encoded in the notation itself: 11D3281 has the broader concepts 11D328 → 11D32 → 11D3 → 11D; conversely, narrower concepts extend the notation
• Concepts are indexed with subjects such as Woman, Religion, Beard, Adult
• Keys append bracketed modifiers to a notation, e.g. 11D3281(+5)
• Doubling a letter modifies the concept: 11DD3281 (from 11D3281, „Christ as Woman") denotes „Christ Beardless"
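Because the hierarchy is encoded purely in the notation string, broader concepts can be derived without any lookup. A minimal sketch of this (our illustration, not part of the original slides; the strip-one-character rule follows the example above and ignores edge cases of the full IconClass system):

```python
import re

def broader_chain(notation: str) -> list[str]:
    """Derive broader IconClass notations from the prefix structure,
    e.g. 11D3281 -> [11D328, 11D32, 11D3, 11D]."""
    # Remove a trailing key such as "(+5)" before walking up the tree.
    base = re.sub(r"\(\+\d+\)$", "", notation)
    chain = []
    while len(base) > 3:  # stop at the top-level notation of this example, "11D"
        base = base[:-1]
        chain.append(base)
    return chain

print(broader_chain("11D3281(+5)"))  # ['11D328', '11D32', '11D3', '11D']
```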
Evaluation Procedure
Evaluation In Three Steps
1. Prepare the input text
   • Produce a reference translation (for the tools that do not support German)
2. Invoke the NER tool
   • Fire a tool-specific request (usually REST)
   • Parse the tool-specific results
3. Compare with the gold standard
   • Map results to the given vocabulary
   • Count true positives, false positives and false negatives
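As a rough illustration of steps 2 and 3, a minimal evaluation harness might look as follows. This is our sketch only: the endpoint, payload and response shape are hypothetical placeholders, since every tool in the comparison has its own API.

```python
import requests

def evaluate(tool_url: str, text: str, gold_keywords: set[str]):
    """Invoke a (hypothetical) REST-based NER tool and score its
    output against the gold-standard keywords."""
    # Step 2: fire the request and parse the tool-specific result.
    resp = requests.post(tool_url, json={"text": text}, timeout=30)
    resp.raise_for_status()
    found = {e.lower() for e in resp.json().get("entities", [])}

    # Step 3: compare with the gold standard.
    gold = {k.lower() for k in gold_keywords}
    tp = len(found & gold)   # true positives
    fp = len(found - gold)   # false positives
    fn = len(gold - found)   # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```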
NER Tools
Tool Selection, Based On Related Work
• We took the best NER performers (>48% F1) from Gangemi [1]
• ... and added a UIMA-based approach, the „Customized Pipeline", representing a third class of tools

  As-a-Service   Local Platforms   NLP-Pipeline
  AlchemyAPI     AIDA              UIMA-based (Customized Pipeline)
  CiceroLite     FOX               GATE-based
  FRED           Stanbol
  NERD
  Open Calais
  Wikimeta
  Zemanta
Evaluation Results
Average Scores (English)
• Recall (quantity): the higher, the more relevant items have been retrieved
• Precision (quality): the higher, the more of the retrieved items are relevant
• (Balanced) F1 score: the harmonic mean of precision and recall

[Bar chart, English input, 0–100% scale:
             F1     Recall
  Average    27%    23%
  Best       61%    78%
  Worst       3%     2%
 A threshold marks the point beyond which results are usable.]
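For reference, these scores follow the standard definitions, with TP, FP and FN counted as in step 3 of the evaluation procedure: P = TP / (TP + FP), R = TP / (TP + FN), and F1 = 2·P·R / (P + R).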
Evaluation Results
F1-Impacts of Domain-Specification

                                Common Vocabulary                 Domain-Specific Vocabulary
  Common Input Texts            Gangemi [1]: ø F1 = 68%, σ = 9%
  Domain-Specific Input Texts   Hooland [2]: ø F1 = 49%, σ = 11%  This Work: ø F1 = 26%, σ = 18%

Moving from common to domain-specific input texts costs about 19 percentage points of F1 (Gangemi → Hooland); additionally moving to a domain-specific vocabulary costs another 23 points (Hooland → This Work).
Evaluation Results
Recall-Impacts of Domain-Specification

                                Common Vocabulary                Domain-Specific Vocabulary
  Common Input Texts            Gangemi [1]: ø R = 62%, σ = 14%
  Domain-Specific Input Texts   Hooland [2]: ø R = 37%, σ = 10%  This Work: ø R = 19%, σ = 17%

The same two steps cost about 25 percentage points of recall (Gangemi → Hooland) and another 18 points (Hooland → This Work).
Evaluation Results
Individual Tool Performances, English
[Bar chart, 0–100%: recall and F1 per tool for English input, covering Stanbol (default), Open Calais, Zemanta, Wikimeta, AIDA, FOX, FRED, AlchemyAPI, CiceroLite, NERD, Stanbol (IconClass) and the Customized Pipeline; a threshold marks the point beyond which results are usable]
Evaluation Results
Individual Tool Performances, German
[Bar chart, 0–100%: recall and F1 per tool for German input, covering AIDA, FOX, FRED, Open Calais, Wikimeta, Zemanta, Stanbol (default), CiceroLite, NERD, AlchemyAPI, Stanbol (IconClass) and the Customized Pipeline; a threshold marks the point beyond which results are usable]
Evaluation Results
Individual Tools, Multilanguage Comparison
[Bar chart, 0–100%: F1 with English vs. German input for Stanbol (default), CiceroLite, NERD, AlchemyAPI, Stanbol (IconClass) and the Customized Pipeline]
Evaluation Results
Involved Tool Classes
[Same bar chart as before, annotated: the best three performers (NERD, Stanbol (IconClass), Customized Pipeline) represent all three involved tool classes: As-a-Service, Locally Installed Platform, and NLP Pipeline]
Evaluation Results
Possibility of Loading Custom Vocabularies
[Same bar chart, annotated: the best two performers are the tools capable of loading custom vocabularies, either configured (Stanbol (IconClass)) or programmed (Customized Pipeline). A second chart relates the scores to the integration effort involved (low / medium / high)]
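To make „loading a custom vocabulary" concrete: in the configured case, the essential ingredient is a lookup structure from vocabulary labels to concepts. A minimal sketch, assuming a local RDF/SKOS dump of the vocabulary named iconclass.ttl (a hypothetical file name; this is not the actual Stanbol configuration mechanism):

```python
from rdflib import Graph
from rdflib.namespace import SKOS

# Load a (hypothetical) local SKOS dump of the vocabulary.
g = Graph()
g.parse("iconclass.ttl", format="turtle")

# Index concepts by their preferred labels for entity lookup.
label_to_concept = {
    str(label).lower(): concept
    for concept, label in g.subject_objects(SKOS.prefLabel)
}

def lookup(entity: str):
    """Map a recognized entity string to a vocabulary concept, if any."""
    return label_to_concept.get(entity.lower())
```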
Evaluation Results
Distribution of Individual Results, Best Three Performers
[Venn diagram, where 100% = all keywords in the gold standard: Stanbol (IconClass) finds 46% (medium effort), the Customized Pipeline 51% (medium effort), and NERD 44% (low effort); the partial overlaps between the three result sets show that the tools find partly different keywords]
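Since each of the three tools covers only part of the gold standard, their outputs can be combined. A trivial union sketch, illustrating the complementarity observation (our illustration, not the project's actual combination logic):

```python
def combine(results_by_tool: dict[str, set[str]]) -> set[str]:
    """Union of the keyword sets returned by several NER tools; with
    partly disjoint result sets this covers more of the gold standard
    than any single tool."""
    combined: set[str] = set()
    for keywords in results_by_tool.values():
        combined |= keywords
    return combined
```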
Conclusions
Lessons Learned
• NER tools in domain-specific scenarios perform significantly worse than in common scenarios
• Positive key factors:
  § Use of English input texts (even machine translations still work better than native German)
  § The given vocabulary can be loaded (programmed or configured) into the NER tool
  § Different tool approaches seem to be complementary
Conclusions
Future Work
• We will pursue a combinatory approach of custom NLP components, Stanbol and NERD, wired together with UIMA
• There is still work to do towards finding proper IconClass concepts:
  § Exploiting NLP metadata
  § Knowledge engineering
Some rights reserved. This work is published under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. Commercial distribution of this work requires prior written permission of the author. Non-commercial distribution is permitted. Derived work is permitted with some limitations. See http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode for the full license statement.
Thanks for your attention!
Do you have any questions?
Timm.Heuss@h-da.de

This project (HA project no. 322/12-12) is funded in the framework of Hessen ModellProjekte, financed with funds of LOEWE – Landes-Offensive zur Entwicklung Wissenschaftlich-ökonomischer Exzellenz, Förderlinie 3: KMU-Verbundvorhaben (State Offensive for the Development of Scientific and Economic Excellence).
Sources
[1] A. Gangemi. A Comparison of Knowledge Extraction Tools for the Semantic Web. In P. Cimiano, O. Corcho, V. Presutti, L. Hollink, and S. Rudolph, editors, The Semantic Web: Semantics and Big Data, number 7882 in Lecture Notes in Computer Science, pages 351–366. Springer Berlin Heidelberg, Jan. 2013.
[2] S. van Hooland, M. De Wilde, R. Verborgh, T. Steiner, and R. Van de Walle. Exploring entity recognition and disambiguation for cultural heritage collections. Literary and Linguistic Computing, page fqt067, Nov. 2013.
[3] Europeana: http://www.europeana.eu/portal/record/08501/A609BC38097F41D26C51AF0C411769F36A976118.html