A Distant Learning Approach for Extracting Hypernym Relations
from Wikipedia Disambiguation Pages
Mouna Kamel (a,b), Cassia Trojahn (b), Adel Ghamnia (b,c),
Nathalie Aussenac-Gilles (b), Cécile Fabre (c)
(a) Université de Perpignan Via Domitia, France
(b) IRIT, CNRS, Université de Toulouse, France
(c) CLLE, équipe ERSS, Université de Toulouse, France
07/09/2017 Extracting hypernym relations
Outline
• Goal and Context
• Background
– Distant Supervised Learning Approach
– Classification
• Application to Wikipedia Disambiguation Pages
• Conclusion and Perspectives
Yet another study about hypernym relations
• Hypernym relations
– Class / sub-class or entity / class
– backbone of semantic resources
• Extracting hypernym relations from natural language
– Corpus specificities: domain granularity, corpus genre, language, explicitness of the text structure, etc.
– The intended aim: linguistic study, text annotation, KB population
– The targeted resource: thesaurus, lexical or heavy-weight ontology
• SemPedia project http://www.irit.fr/Sempedia
– to enrich DBPedia for French with hypernym relations
– DBPedia in French is 20 000 times poorer than DBPedia in English
– Semantic resources for French are scarce
Goal and Context
Different ways to express hypernym relations in the same corpus
Background
– Different ways to express hypernym relations:
• well-written text: expressed through syntax and the lexicon
• poorly-written text: expressed through the layout
– machine learning approaches: use a variety of linguistic clues (syntactic, semantic, lexical, visual, structural or distributional)
– supervised learning: better results, but requires manual annotation
– distant supervised learning: free of manual annotation; relies on a semantic resource; fully automatic process
Distant Supervised Learning
• Hypothesis: “if two entities participate in a relation, all sentences that mention these two entities express that relation” (Mintz et al., 2009)
• Although this hypothesis seems too strong, it makes sense when the knowledge base used to annotate the corpus is derived from the corpus itself (Riedel et al., 2010)
• The training examples are automatically collected using a knowledge base:
– for every pair of entities linked (resp. not linked) in the knowledge base and appearing together within a sentence, a positive (resp. negative) learning example is built.
Connect(relation, entities), is-a(sentence, text-unit)
– sentence → learning features → feature vector for that entity pair
– the set of feature vectors feeds a multi-class logistic regression classifier
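The collection of training examples described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `build_examples`, the whitespace tokenisation and the toy knowledge base are all hypothetical, and a pair counts as linked if the knowledge base contains it in either direction.

```python
from itertools import combinations


def build_examples(sentences, kb_pairs, kb_terms):
    """Label co-occurring term pairs using a knowledge base (distant supervision).

    sentences: list of raw sentences (toy whitespace tokenisation, an assumption)
    kb_pairs:  set of (hyponym, hypernym) pairs linked in the knowledge base
    kb_terms:  set of terms known to the knowledge base
    """
    examples = []
    for sentence in sentences:
        tokens = sentence.lower().rstrip(".").split()
        # Keep only the terms the knowledge base knows about.
        terms = sorted({t for t in tokens if t in kb_terms})
        for t1, t2 in combinations(terms, 2):
            # Linked pair -> positive example, unlinked pair -> negative example.
            linked = (t1, t2) in kb_pairs or (t2, t1) in kb_pairs
            examples.append((t1, t2, sentence, "pos" if linked else "neg"))
    return examples


toy_kb_terms = {"chat", "animal", "jardin"}
toy_kb_pairs = {("chat", "animal")}
toy_sentences = ["Le chat est un animal.", "Le chat dort dans le jardin."]
toy_examples = build_examples(toy_sentences, toy_kb_pairs, toy_kb_terms)
```

In a real setting each labelled sentence would then be converted into a feature vector before being passed to the classifier.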
Classification
• Binary classification task (isA or not-isA classes)
• Maximum Entropy classifier (Max-Ent):
– relevant when the conditional independence of the features cannot be assured (in NLP, words obviously are not independent in their use).
– allows the management of a great number of features.
The probability that individual x (here a relation) belongs to class y
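For a Max-Ent classifier this probability takes the standard log-linear form P(y | x) = exp(Σ_i w_{y,i} f_i(x)) / Σ_{y'} exp(Σ_i w_{y',i} f_i(x)). The sketch below computes it for the two classes isA and not-isA; the feature name and the weights are made up for illustration and are not taken from the trained model.

```python
import math


def maxent_probabilities(features, weights, classes):
    """Return P(y | x) for each class y under a log-linear (Max-Ent) model.

    features: dict mapping feature name -> value for the instance x
    weights:  dict mapping class -> {feature name -> weight}
    """
    scores = {
        y: sum(weights[y].get(name, 0.0) * value for name, value in features.items())
        for y in classes
    }
    # Subtract the maximum score before exponentiating, for numerical stability.
    max_score = max(scores.values())
    exp_scores = {y: math.exp(s - max_score) for y, s in scores.items()}
    total = sum(exp_scores.values())
    return {y: e / total for y, e in exp_scores.items()}


# Hypothetical weights: one lemma feature that favours the isA class.
toy_weights = {"isA": {"lemma_between=être": 2.0}, "not-isA": {}}
toy_probs = maxent_probabilities({"lemma_between=être": 1.0}, toy_weights, ["isA", "not-isA"])
```

With two classes this reduces to binary logistic regression, which is why the two names are used interchangeably for this task.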
Wikipedia Disambiguation Pages
Application to Wikipedia Disambiguation Pages
• Corpora
– Reference corpus: 20 pages; manual annotation (entities and relations linking entities)
– Training corpus: all remaining French disambiguation pages (5904 pages)
• Semantic resource: BabelNet (www.babel.org)
– very large multilingual semantic network with about 14 million entries (Babel synsets)
– connects concepts and named entities with semantic relations
– rich in hypernym relations
• Features
Processing chain
• Corpus (Wikipedia disambiguation pages)
• Preprocessing (TTG; gazetteer of BabelNet terms) → annotated corpus
• Term pair extraction → (<T_1^1, T_1^2>, sent_1), (<T_2^1, T_2^2>, sent_2), (<T_3^1, T_3^2>, sent_3), …
• Feature vector building with the semantic resource BabelNet
– positive examples: { <T_i^1, T_i^2>, sent_i, <feature_i^1, …, feature_i^p>, pos }_i
– negative examples: { <T_j^1, T_j^2>, sent_j, <feature_j^1, …, feature_j^p>, neg }_j
• Training set (4000 +, 4000 -) → binary logistic regression (MaxEnt) → learning model
• Test set (2000 +, 2000 -) → evaluation (precision, recall, F-measure)
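The evaluation step at the end of the chain relies on precision, recall and F-measure. A minimal sketch of how these can be computed from gold and predicted labels follows; the function name and the toy labels are illustrative assumptions, not the evaluation code used in the study.

```python
def precision_recall_f1(gold, predicted, positive="pos"):
    """Compute precision, recall and F-measure for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


toy_gold = ["pos", "pos", "pos", "neg", "neg"]
toy_pred = ["pos", "pos", "neg", "pos", "neg"]
toy_scores = precision_recall_f1(toy_gold, toy_pred)
```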
Application to Wikipedia Disambiguation Pages
Evaluation of the model (test set)
Application to Wikipedia Disambiguation Pages
• Evaluation on the reference corpus
– 688 true positive examples and 278 true negative examples
– Comparison between 2 baselines and 2 models
• Baseline1: generic lexico-syntactic patterns for French
• Baseline2: generic patterns AND ad-hoc patterns for the disambiguation pages
• Model_POSL: trained with vectors composed of POS and lemma features
• Model_AllFeatures: trained with vectors composed of all features
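To give an idea of what a generic lexico-syntactic pattern looks like, here is a sketch of a single "X est un/une Y" pattern written as a regular expression. It is a hypothetical example, not one of the patterns actually used in Baseline1 or Baseline2, and it only handles single-word terms at the start of a sentence.

```python
import re

# Hypothetical pattern: optional determiner, hyponym, "est un/une", hypernym.
IS_A_PATTERN = re.compile(
    r"^(?:(?:les|le|la|l')\s*)?(?P<hypo>[\w-]+)\s+est\s+(?:un|une)\s+(?P<hyper>[\w-]+)",
    re.IGNORECASE,
)


def match_is_a(sentence):
    """Return (hyponym, hypernym) if the sentence matches the pattern, else None."""
    m = IS_A_PATTERN.search(sentence)
    if m is None:
        return None
    return m.group("hypo").lower(), m.group("hyper").lower()


toy_match = match_is_a("Le chat est un animal.")
toy_no_match = match_is_a("Le chat dort dans le jardin.")
```

Each new way of expressing the relation needs its own pattern, which is the development cost that the learned models avoid.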
Application to Wikipedia Disambiguation Pages - discussion
– Number of true positive hypernym relations per type of hypernym expression
– Quantitative gain: machine learning identifies more examples at no development cost, and gives a systematic, less empirical approach.
– Impact of the way relations are expressed:
• ML performs as well as patterns on well-written text
• Ad-hoc patterns perform (a little) better on poorly-written text
• ML can identify all forms of relation expressions (current patterns are unable to identify relations with head modifiers)
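Application to Wikipedia Disambiguation Pages
• Examples
– correctly identified by ML
– would require additional ad-hoc patterns (extra cost)
(1) Louis Babel, prêtre-missionnaire oblat et explorateur du Nouveau-Québec (1826-1912).
[Louis Babel, Oblate missionary priest and explorer of Nouveau-Québec (1826-1912).]
<Louis Babel, prêtre-missionnaire oblat>
<Louis Babel, explorateur du Nouveau-Québec>
(2) La fontaine a aussi désigné le “vaisseau de cuivre ou de quelque autre métal, où l’on garde de l’eau dans les maisons”, et encore le robinet de cuivre par où coule l’eau d’une fontaine, ou le vin d’un tonneau, ou quelque autre liqueur que ce soit.
[The fountain has also designated the “vessel of copper or some other metal in which water is kept in houses”, and also the copper tap through which flows the water of a fountain, the wine of a cask, or any other liquid whatsoever.]
<fontaine, robinet de cuivre>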
Conclusion and perspectives
• Distant learning using lexical and grammatical features
– identifies different ways of expressing relations
– including most of those identified by patterns
• Future work
– investigate additional features such as semantic, distributional or layout features
– Combine learning and patterns
– Train a model on the whole set of Wikipedia pages.