Evaluating Named Entity Recognition and Disambiguation in News and Tweets

EVALUATING NAMED ENTITY RECOGNITION AND DISAMBIGUATION IN NEWS AND TWEETS

Giuseppe Rizzo Università degli studi di Torino Marieke van Erp VU University Amsterdam Raphaël Troncy EURECOM

EVALUATING NER & NED• NER typically an NLP task (MUC, CoNLL, ACE)

• NED took flight with availability of large structured resources (Wikipedia, DBpedia, Freebase)

• Tools for NER and NED have started popping up outside regular research outlets (TextRazor, DBpedia Spotlight, AlchemyAPI)

• Unclear how well these tools perform

THIS WORK

• Evaluation & comparison of 10 out-of-the-box NER and NED tools through NERD API as well as a combination of the tools in NERD-ML

• Two types of data: Newswire & Tweets

• http://nerd.eurecom.fr

• Ontology, REST API & UI

• Uniform access to 12 different extractors/linkers: AlchemyAPI, DBpedia Spotlight, Extractiv, Lupedia, OpenCalais, Saplo, SemiTags, TextRazor, THD, Wikimeta, Yahoo! Content Analysis, Zemanta

NERD-ML

• The aim of NERD-ML is to combine the knowledge of the different extractors into a better named entity recogniser

• Uses NERD predictions, Stanford NER & extra features

• Naive Bayes, k-NN, SMO

• CoNLL 2003 English NER with AIDA CoNLL-YAGO links to Wikipedia (5,648 NEs/4,485 links in test set)

• Making Sense of Microposts 2013 (MSM’13) for NER on Twitter domain + 62 randomly selected tweets from Ritter et al.’s corpus with links to DBpedia resources (MSM: 1,538 NEs/Ritter : 177 links in test set)

PER LOC ORG MISC OVERALL

Precision

RESULTS NER NEWSWIRE

Precision F1AlchemyAPIDBpedia SpotlightExtractivLupediaOpenCalaisSaploTextrazorYahooWikimetaZemantaStanford NERNERD-ML Run01NERD-ML Run02NERD-ML Run03Upper Limit

Recall

RESULTS NER MSM

Precision

Precision Recall F1 AlchemyAPIDBpedia SpotlightExtractivLupediaOpenCalaisSaploTextrazorWikimetaZemantaRitter et al.Stanford NERNERD-ML Run01NERD-ML Run02NERD-ML Run03Upper Limit

RESULTS NED

AlchemyAPI DBpedia Spotlight Extractiv Lupedia Textrazor Yahoo Zemanta

AIDA-YAGO 70.63 26.93 51.31 57.98 49.21 0.0 35.58

TWEETS 53.85 25.13 74.07 65.38 58.14 76.00 48.57

DISCUSSION

• Still a ways to go, but for certain classes NER is getting close to really good results

• MISC class is (and probably always will be?) hard

• Bigger datasets needed (for tweets and NED)

• NED task can use standardisation

THANK YOU FOR LISTENING

• Try out our code at: https://github.com/giusepperizzo/nerdml

ACKNOWLEDGEMENTS

This research is funded through the LinkedTV and NewsReader projects, both funded by the European Union’s 7th Framework Programme grants GA 287911 and ICT-316404).

Evaluating Named Entity Recognition and Disambiguation in News and Tweets

Technology

DataEngConf: Concrete Named Entity Disambiguation

Ontology-Driven Automatic Entity Disambiguation in Unstructured Text Jed Hassell

Distributed Entity Disambiguation with Per-Mention Learning

Entity Disambiguation

Analysis Name Entity Disambiguation Using Mining Evidence … · i Analysis Name Entity Disambiguation Using Mining Evidence Method Adelya Astari, Moch. Arif Bijaksana, Arie Ardiyanti

Named Entity Recognition in Tweets: TwitterNLP Ludymila Lobo Twitter NLP Ludymila Lobo

Medical Entity Disambiguation Using Graph Neural ... - arXiv

5. Taxonomy induction, entity disambiguation, coreference ......5. Taxonomy induction, entity disambiguation, coreference resolution Simon Razniewski Winter semester 2019/20 1 Announcements

Wissen im Web - KIT · Knowledge for Intelligence • entity recognition & disambiguation • understanding natural language & speech • knowledge services & reasoning for semantic

NLP AND IR METHODS FOR HANDLING …aplace4places.github.io/presentations/Bruno.pdfNAMED ENTITY DISAMBIGUATION RESOURCES • AIDA/YAGO • Babelfy (entity linking and word sense disambiguation)

Clustering Image Search Results by Entity Disambiguationkzhu/papers/kzhu-image.pdf · Clustering Image Search Results by Entity Disambiguation Kaiqi Zhao, ... descriptive tags for

AIDArabic+ Named Entity Disambiguation for Arabic Textpeople.mpi-inf.mpg.de/~gadelrab/downloads/Mohamed... · AIDArabic+ Named Entity Disambiguation for Arabic Text Masterarbeit im

PyCon 2013 - Experiments in data mining, entity disambiguation and how to think data-structures for designing beautiful algorithms

Entity Linking meets Word Sense Disambiguation: a Uniﬁed ...moro/MoroRaganatoNavigli_TACL2014.pdf · Entity Linking meets Word Sense Disambiguation: a Uniﬁed Approach Andrea Moro,

Semantic Parsing and Beyond to Create a …valeriobasile.github.io/presentations/dusseldorf2018.pdfWord Sense Disambiguation Entity Linking Discourse Representation Structure DBPedia

Semantic based entity retrieval and disambiguation system

An Entity Disambiguation Approach Based on Wikipedia and ... · research on entity extraction in entity linking can be classified into two approaches, one is dictionary-based method,

Large scale entity extraction and probabilistic record ...•Large scale data integration • Probabilistic linking •Entity disambiguation and resolution SALT •Batch oriented Big

Entity Linking meets Word Sense Disambiguation: a Unied ...navigli/pubs/TACL_2014_Babelfy.pdf · Entity Linking meets Word Sense Disambiguation: a Unied Approach Andrea Moro, Alessandro

Named Entity Recognition In Tweets: An Experimental Study