Upload
others
View
1
Download
0
Embed Size (px)
Citation preview
December 7, 2013
Joint and Pipeline Probabilistic Modelsfor Fine-Grained Sentiment Analysis
Roman Klinger and Philipp CimianoSemantic Computing Group, CIT-EC, Bielefeld University, Germany
SENTIRE Workshop at IEEE ICDM, Dallas, TX, USA
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
What is this talk about?
Aspect subjective
The battery life of this camera is too short.Aspect
Aspect subjective
The battery life of this camera is too short.Aspect
Roman Klinger and Philipp Cimiano 2 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Outline
1 Introduction
2 Probabilistic model for subjective term and target identification
3 Evaluation of different pipeline orders
4 Joint Model
5 Evaluation of the Pipeline vs. Joint Model
6 Summary and Discussion
Roman Klinger and Philipp Cimiano 3 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Introduction
Sentiment Analysis/Opinion Mining
Often modelled as classification or segmentation task
Fine-Grained Opinion Mining:
Involves prediction of aspect/target, subjective terms, polarity,
relations
Our previous work: Developed model to analyze:
Given subjective phrases ⇒ impact on target prediction
Given targets ⇒ impact on subjective phrase prediction
Both with perfect and realistic prior knowledge
Contribution of this paper:
Present a flexible model which takes into account
inter-dependencies
Roman Klinger and Philipp Cimiano 4 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Previous and Related Work
Extracting subjective phrases:
B. Yang et al. (2012). ``Extracting opinion expressions with
semi-Markov conditional random fields''. In: EMNLP-CoNLL
Given perfect subjective phrases, predict targets:
N. Jakob et al. (2010). ``Extracting opinion targets in a single- and
cross-domain setting with conditional random fields''. In: EMNLP
ILP approach
B. Yang et al. (2013). ``Joint Inference for Fine-grained Opinion
Extraction''. In: ACL.Our work:..
.
Real-world setting, predict all entities
Relational structure in multiple directions
Flexible, easy to augment
Roman Klinger and Philipp Cimiano 5 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Outline
1 Introduction
2 Probabilistic model for subjective term and target identification
3 Evaluation of different pipeline orders
4 Joint Model
5 Evaluation of the Pipeline vs. Joint Model
6 Summary and Discussion
Roman Klinger and Philipp Cimiano 6 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Factor Graphs
A Factor Graph is a bipartite graph over factors and variables
Factor Ψi computes a scalar
over all variables
Let x⃗ be observed variables, y⃗
output variables
Common definition:
Ψi(⃗xi, y⃗i) =
exp
(∑k
θkifki(⃗xi, y⃗i)
)(parameters θki and sufficient statistics fki(·))
Probability distribution:
p(⃗y|⃗x) = 1
Z(⃗x)
∏i
Ψi(⃗xi, y⃗i)
Ψ1
Ψ2 Ψ3
x2x1
y1 y2 y3 y4
~x = (x1, x2)T
~y = (y1, y2, y3, y4)T
Roman Klinger and Philipp Cimiano 6 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Templates for Factor Graphs
Probability distribution
p(⃗y|⃗x) = 1
Z(⃗x)
∏i
exp
(∑k
θkifki(⃗xi, y⃗i)
)
A Factor Template Tj consists of
parameters θjk and statistic functions fjksome description of variables yielding tupels (⃗xj, y⃗j)
Roman Klinger and Philipp Cimiano 7 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Templates for Factor Graphs
x3
x4
y1
x1 x2
A A B
B
T1: same value and y1
T2: different value and y1
Parameters θjk, feature functions fjk are shared across tupels
p(⃗y|⃗x) = 1
Z(⃗x)
∏Tj∈T
∏(⃗xi ,⃗yi)∈Ti
exp
(∑k
θkjfkj(⃗xi, y⃗i)
)Examples for descriptions:
Markov Logic Networks (Richardson et al., 2006)
Imperatively defined factor graphs (McCallum et al., 2009)
Roman Klinger and Philipp Cimiano 8 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Variable Definition
Extraction of aspects and subjective phrases as segmentation
Application of a semi-Markov-like model
Implementation in FACTORIE (McCallum et al., 2009)
Aspect subjective
The battery life of this camera is too short.Aspect
Roman Klinger and Philipp Cimiano 9 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Templates
Aspect subjective
The battery life of this camera is too short.Aspect
Single-Span-Template
lower-case string, POS, and both
Combined with IOB-like-prefixes
Sequence of POS tags
Roman Klinger and Philipp Cimiano 10 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Templates
Aspect subjective
The battery life of this camera is too short.Aspect
Inter-Span-Template (partially inspired by Jakob et al., 2010)
Does the target span contain the noun that is closest to the
subjective phrase?
Are there spans of both types in the sentence?
Is there a one-edge dependency relation between
subjective phrase and target?
Single-Span features only if one of those holds!
Roman Klinger and Philipp Cimiano 11 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Learning and Inference
Inference: Metropolis Hastings sampling
(a Markov Chain Monte Carlo method)
Learning: Sample Rank (Wick et al., 2011)
.Objective Function..
.
f(t) = maxg∈s
o(t, g)
|g|− α · p(t, g) ,
t is a span, g is a gold span
o(t, g) is length of overlap
p(t, g) number of 'outside' tokens
Roman Klinger and Philipp Cimiano 12 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Outline
1 Introduction
2 Probabilistic model for subjective term and target identification
3 Evaluation of different pipeline orders
4 Joint Model
5 Evaluation of the Pipeline vs. Joint Model
6 Summary and Discussion
Roman Klinger and Philipp Cimiano 13 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Cameras (Kessler et al., 2010)
Given subjective terms, how good is target prediction?
Predicting subjective terms, how good is target prediction?
Given target terms, how good is subjective prediction?
Predicting targets terms, how good is subjective prediction?
0
0.2
0.4
0.6
0.8
1
pred. S. → T. pred. T. → S. Gold S. → T. Gold T. → S. Jakob 2010
F 1
Target-F1 PartialSubjective-F1 Partial
Target-F1Subjective-F1
0.53 0.44
0.65 0.61 0.65 0.71
0.48
0.32
0.58
1.00
0.50 0.54 0.60
1.00
0.65
1.00
Roman Klinger and Philipp Cimiano 13 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Cars (Kessler et al., 2010)
0
0.2
0.4
0.6
0.8
1
pred. S. → T. pred. T. → S. Gold S. → T. Gold T. → S. Jakob 2010
F 1
Target-F1 PartialSubjective-F1 Partial
Target-F1Subjective-F1
0.51 0.43
0.62 0.64 0.69
0.74
0.43 0.33
0.55
1.00
0.50 0.56
0.66
1.00
0.70
1.00
Roman Klinger and Philipp Cimiano 14 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Twitter (Spina et al., 2012)
0
0.2
0.4
0.6
0.8
1
pred. S. → T. pred. T. → S. Gold S. → T. Gold T. → S.
F 1
Target-F1 PartialSubjective-F1 Partial
Target-F1Subjective-F1
0.42 0.32
0.67
0.40 0.41
0.60
0.26
0.13
0.40
1.00
0.22 0.28
1.00
0.35
Roman Klinger and Philipp Cimiano 15 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Outline
1 Introduction
2 Probabilistic model for subjective term and target identification
3 Evaluation of different pipeline orders
4 Joint Model
5 Evaluation of the Pipeline vs. Joint Model
6 Summary and Discussion
Roman Klinger and Philipp Cimiano 16 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Idea for Joint Model
Model relation explicitly
Aspect subjectiveThe battery life of this camera is too short.
Aspect
target
Sentence
Features in three templates
Single Span
Inter Span
Relation
(new: similar to inter span, but measuring another variable)
Roman Klinger and Philipp Cimiano 16 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Sampler
.Pipeline..
.
Propose spans, span changes
Propose adding relations for each aspect-subjective pair
.Joint..
.
Propose subjective phrases
Propose aspects as targets of each subjective phrase
Propose span changes, removing relations
Roman Klinger and Philipp Cimiano 17 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Objective functions
.Pipeline..
.
Spans as before (f(t))
Relations accuracy-based
.Joint..
.
Relation:
h(su, ta) = max(su∗,ta∗)∈rel∗
−1 if o(su, su∗) = 0 or o(ta, ta∗) = 0
12(o(su, su∗) + o(ta, ta∗)) else
Span:
g(t) = βf(t) +∑
(su,ta)∈rel(t)
h(su, ta)
Roman Klinger and Philipp Cimiano 18 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Pipeline vs. joint
Joint ModelPipeline Model
Subjectives
Aspects
Iteration
Training Prediction
Relations
Subjectives
Aspects
Relations
0
1
2
3
Training/Prediction
Subjectives
Aspects
Relations
Single span
Inter span
Relation
Templates Iteration
1
Roman Klinger and Philipp Cimiano 19 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Results
0
0.2
0.4
0.6
0.8
1
Pipeline Joint
F 1
Camera
0
0.2
0.4
0.6
0.8
1
Pipeline Joint
F 1
Car
Aspect PartialSubjective Partial
Relation PartialAspect
SubjectiveRelation
Roman Klinger and Philipp Cimiano 20 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Summary
Joint Modelling has positive impact
Clearly observable for aspects
Slight to moderate drop for subjective phrase and relation
Easy do adapt to other characteristics
(opinion holder, polarity, dependencies, etc.)
Roman Klinger and Philipp Cimiano 21 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Bibliography I
Jakob, N. et al. (2010). ``Extracting opinion targets in a single- and
cross-domain setting with conditional random fields''. In: EMNLP.
Kessler, J. S. et al. (2010). ``The 2010 ICWSM JDPA Sentment Corpus for the
Automotive Domain''. In: ICWSM-DWC 2010.
McCallum, A. et al. (2009). ``FACTORIE: Probabilistic Programming via
Imperatively Defined Factor Graphs''. In: NIPS.
Richardson, M. et al. (2006). ``Markov logic networks''. In: Machine
Learning 62.1-2, pp. 107–136. ISSN: 0885-6125.
Spina, D. et al. (2012). ``A Corpus for Entity Profiling in Microblog Posts''.
In: LREC Workshop on Information Access Technologies for Online
Reputation Management.
Roman Klinger and Philipp Cimiano 22 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
. .Introduction . . . . . . .Probabilistic Model . . .Evaluations 1 . . . .Joint Model .Evaluations 2 .Summary
Bibliography II
Wick, M. et al. (2011). ``SampleRank: Training factor graphs with atomic
gradients''. In: ICML.
Yang, B. et al. (2012). ``Extracting opinion expressions with semi-Markov
conditional random fields''. In: EMNLP-CoNLL.
— (2013). ``Joint Inference for Fine-grained Opinion Extraction''. In: ACL.
Roman Klinger and Philipp Cimiano 23 / 24
Technische Fakultät · AG Semantic ComputingUniversität Bielefeld
Thank you!