Entity Identification on Microblogs by CRF Model with Adaptive Dependency
Dept. of Social Informatics,
Kyoto University, Japan
Jun-Li Lu Makoto P. Kato Takehiro Yamamoto Katsumi Tanaka
@2015 IEEE/WIC/ACM International Conference on Web Intelligence (WI2015)
Outline
• Entity identification
• How an entity is mentioned
• Method
  • Feature
  • Conditional Random Field (CRF) model
  • Adaptive dependency
• Experiment results & conclusion
Problem definition: Entity identification on microblogs
Example microblog: "Jacoby is leaving for the rival and betrays Red Sox; Yankees seems aiming for championship."
• Given a mention in a microblog, find the entity it maps to.
How is an entity mentioned?
[Figure: the microblog "Our baseball team is the rival to Yankees" linked to each entity's article ("… is a professional baseball team"). The mention "Our baseball team" points to Boston Red Sox by attribute, "rival" is a relationship between Boston Red Sox and New York Yankees, and "Yankees" points to New York Yankees by name.]
• Direct-reference
  • Name: the mention is a partial or full name of an entity
• Indirect-reference
  • Attribute: the mention describes an entity
  • Relationship: the mention is the relationship between two entities
  • Metaphor: the mention contains another entity's name but maps to a different entity
Related work
• Two sub-tasks: Named Entity Recognition (NER) and Named Entity Disambiguation (NED)
• NER and NED considered jointly [TKDE2015, WWW2014]
• Mining additional context for NED beyond the knowledge base (KB) [KDD2013]
• Well-written documents vs. short, noisy microblogs [WWW2014]
• Efficient prediction algorithm [WSDM2015]
=> Past work focused on direct-reference
Our contribution
• Survey of indirect-reference
  • Indirect-reference was not infrequent in microblogs
• Novel features for indirect-reference
  • Topic-specific translation, "entity-known-as" pattern, …
• An efficient model that considers dependency between entities
  • Predicting entities together with a CRF model
  • Getting proper dependency among entities
Presenting flow
• Introduction to entity identification
• Feature: how to measure entities?
• CRF model with adaptive dependency: how to predict entities?
• Experiment results
Previous-work features: how to measure entities?
1. Keyword
2. Context similarity
3. Entities' correlation
4. Mention entity's name
5. Occurrence frequency
6. User interest
[Figure: the six features illustrated for the mention "yankees" in a microblog ("the Yankees is the rival to …"), with New York Yankees and Boston Red Sox as candidate entities. Annotations in the figure: Jaccard index, bag-of-words similarity, prob.(yankees), match, # of found documents, # of found cases, writer's recent microblogs.]
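The slide names the Jaccard index and bag-of-words similarity as measures without giving their exact form. A minimal sketch of both, assuming whitespace tokenization and cosine similarity over term counts (these choices and the function names are illustrative assumptions, not taken from the talk):

```python
from collections import Counter
from math import sqrt


def jaccard_index(a: set, b: set) -> float:
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bag_of_words_cosine(text_a: str, text_b: str) -> float:
    """Cosine similarity between lowercase, whitespace-tokenized count vectors."""
    va = Counter(text_a.lower().split())
    vb = Counter(text_b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0
```

For instance, the Jaccard index could compare the sets of documents in which two candidate entities occur, and the cosine score could compare a microblog with a candidate entity's article; which feature uses which measure is not stated on the slide.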
For indirect-reference: topic-specific translation
• Gets the microblog's meaning from topic knowledge; effective when the microblog is abstract
• How we did it:
[Figure: pipeline from microblog to topic to translation. The microblog "…the player is leaving for…" is combined with microblog-related data (news, responses, or the writer's past microblogs), whose top proper nouns include "New York Yankees" and "Jacoby Ellsbury". Candidate topics include "baseball" and "soccer"; the top terms in a topic come from related Wikipedia documents (e.g., "pitcher", "outfielder", "shortstop", …). Translation by semantic similarity then maps "player" to "outfielder" rather than "goalkeeper".]
For indirect-reference: pattern
• Effective when the mention is a common noun, e.g., it gives no hint of the entity's name
• Pattern 1: entity-known-as
  • mention + known-as phrase, e.g., "pinstripes" + "known as" matches "New York Yankees … known as … pinstripes"
• Pattern 2: entity-performing-action
  • action, e.g., "hit" matches "Jacoby Ellsbury … hit"
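The entity-known-as pattern can be illustrated with a small regular-expression check. This is a sketch under assumptions: the function name, the single-sentence restriction, and the fixed phrase "known as" are illustrative choices, not the authors' implementation.

```python
import re


def matches_known_as(article_text: str, entity_name: str, mention: str) -> bool:
    """True if entity_name is followed by 'known as' and then the mention,
    with no period in between (a rough same-sentence check)."""
    pattern = (
        re.escape(entity_name)
        + r"[^.]*?\bknown as\b[^.]*?\b"
        + re.escape(mention)
        + r"\b"
    )
    return re.search(pattern, article_text, flags=re.IGNORECASE) is not None
```

A call such as `matches_known_as(article, "New York Yankees", "pinstripes")` would then contribute one binary feature for that candidate entity.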
Conditional Random Field (CRF) model
• Predicts multiple entities together through proper dependency
• Linear + non-sequential CRF:
  • Linear: keeps prediction tractable, with time linear in $n$: $O(nc^2)$
    • With a cyclic CRF, time is exponential: $O(c^n)$
  • Non-sequential: allows proper dependency among entities
$n$: # of mentions per microblog; $c$: # of candidates per mention
[Figure: probability of $Y_2$ with and without dependency between $Y_1$ and $Y_2$.]
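To see where the $O(nc^2)$ bound comes from, here is a minimal max-sum (Viterbi) sketch for the simplest acyclic case, a chain of mentions. The talk uses a non-sequential acyclic structure; the chain, the score tables, and the function name are simplifying assumptions for illustration.

```python
def viterbi_chain(unary, pairwise):
    """Return (best label sequence, its score) on a chain.

    unary[i][a]       : score of candidate a for mention i
    pairwise[i][a][b] : score of candidate a at mention i with b at mention i + 1
    """
    best = list(unary[0])      # best[a]: best score of a prefix ending in candidate a
    back = []                  # back[i][b]: best predecessor of b at mention i + 1
    for i in range(1, len(unary)):
        new_best, pointers = [], []
        for b, u in enumerate(unary[i]):
            # Scan all c predecessors for each of c candidates: c^2 work per mention.
            a = max(range(len(best)), key=lambda p: best[p] + pairwise[i - 1][p][b])
            new_best.append(best[a] + pairwise[i - 1][a][b] + u)
            pointers.append(a)
        best = new_best
        back.append(pointers)
    # Trace back from the best final candidate.
    last = max(range(len(best)), key=lambda p: best[p])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1], max(best)
```

The same dynamic program extends to any cycle-free graph by passing messages from the leaves to a root, which is why avoiding cycles keeps the cost at $n$ steps of $c^2$ work each.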
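Adaptive dependency
• Builds proper dependency among entities
• Based on entities' correlation
CRF model of adaptive dependency
• Pick the pair $(i, j)$ with the maximum adaptive dependency that does not create a cycle
[Figure: CRF graph over mentions $X_1, \dots, X_5$ and entities $Y_1, \dots, Y_5$ with $c$ candidates each; labels in the figure: "baseball", "singing", "to make high", "to make low".]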
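The rule "pick the pair with maximum dependency that does not create a cycle" reads like a greedy maximum-spanning-forest construction (Kruskal's algorithm). A minimal sketch with a union-find structure follows; the `scores` input, the function name, and the absence of a score threshold are assumptions, since the slide does not give these details.

```python
def select_dependencies(n, scores):
    """Greedily pick cycle-free links between n entity nodes.

    scores: dict mapping a pair (i, j) to its dependency strength.
    Returns the chosen links, strongest first.
    """
    parent = list(range(n))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    chosen = []
    for (i, j), _ in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:  # i and j are not yet connected, so (i, j) adds no cycle
            parent[ri] = rj
            chosen.append((i, j))
    return chosen
```

The resulting links form the set $L$ of connections, and because the graph is a forest, exact prediction stays linear in the number of mentions.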
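Prediction probability
• CRF model
$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{z(\mathbf{x})} \exp\Big( \sum_{i} \Big[ \sum_{f \in F_\alpha} w_f f(y_i) + \sum_{f \in F_\beta} w_f f(x_i, y_i) \Big] + \sum_{(i,j) \in L} \sum_{f \in F_\gamma} w_f f(y_i, y_j) \Big)$$
• CRF model with adaptive dependency
$$p(\mathbf{y} \mid \mathbf{x}) = \frac{1}{z(\mathbf{x})} \exp\Big( \sum_{i} \Big[ \sum_{f \in F_\alpha} w_f f(y_i) + \sum_{f \in F_\beta} w_f f(x_i, y_i) \Big] + \sum_{l=(i,j) \in L} \sum_{f \in F_\gamma} \delta(l)\, w_f f(y_i, y_j) \Big)$$
where
• $\mathbf{y} = (y_1, \dots, y_n)$: a set of entities; $\mathbf{x} = (x_1, \dots, x_n)$: a set of mentions; $z(\mathbf{x})$: normalization; $w_f$: weight of feature $f$
• $F_\alpha$: a set of features of an entity
• $F_\beta$: a set of features of an entity and a mention
• $F_\gamma$: a set of features of two entities
• $L$: a set of connections between $Y_i$, $1 \le i \le n$
• $\delta(l)$, adaptive dependency: top-k value of $\sum_{f \in F_\beta} w_f f(x_i, y_i) + \sum_{f \in F_\beta} w_f f(x_j, y_j) + \sum_{f \in F_\gamma} w_f f(y_i, y_j)$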
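For a very small example, the probability above can be checked by brute-force enumeration of all candidate assignments. In this sketch the per-mention table `unary` stands for the summed $F_\alpha$ and $F_\beta$ terms, `pairwise` for the summed $F_\gamma$ terms on each link in $L$, and `delta` for precomputed $\delta(l)$ values; all three tables and the function name are hypothetical inputs, and how $\delta(l)$ is derived from the top-k values is not reproduced here.

```python
from itertools import product
from math import exp


def crf_probability(y, unary, pairwise, delta=None):
    """p(y | x) by brute-force enumeration (feasible only for tiny examples).

    unary[i][a]            : summed weighted features of candidate a for mention i
    pairwise[(i, j)][a][b] : summed weighted pairwise features for link (i, j)
    delta[(i, j)]          : optional per-link scaling; missing links default to 1.0
    """
    delta = delta or {}

    def score(labels):
        s = sum(unary[i][a] for i, a in enumerate(labels))
        for (i, j), table in pairwise.items():
            s += delta.get((i, j), 1.0) * table[labels[i]][labels[j]]
        return s

    # z(x): sum of exp(score) over every joint assignment, c^n terms in total.
    z = sum(exp(score(labels)) for labels in product(*(range(len(u)) for u in unary)))
    return exp(score(y)) / z
```

The enumeration costs $c^n$ evaluations, which is the exponential behaviour the acyclic structure is meant to avoid; the function is useful only as a reference for checking a faster implementation.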
Experiment outline
• Microblog annotation
• Candidate entity generation
• Performance
  • Overall: features + model
  • Feature comparison
  • CRF model with adaptive dependency
Microblog annotation
• Credible ground-truth: 3 annotators on 500 random tweets from Twitter (2014/10)
• Annotation result:
=> Multiple mentions in a microblog (2.61 per tweet)
=> Indirect-reference was not infrequent (indirect:direct≈2:3)
Twitter tag      # tweets   # mentions   # direct-ref.   # indirect-ref.
#Yankees         86         228          153             108
#Obama           92         227          167             87
#Ebola           97         241          151             156
#Nobel           94         287          228             124
#Islam           92         219          151             95
Mean per tweet   -          2.61         1.84            1.24
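The per-tweet means and the indirect:direct ratio can be recomputed from the five rows of the table. The short script below only repeats that arithmetic; the variable names are illustrative.

```python
# (tweets, mentions, direct-ref., indirect-ref.) per Twitter tag, copied from the table
rows = {
    "#Yankees": (86, 228, 153, 108),
    "#Obama": (92, 227, 167, 87),
    "#Ebola": (97, 241, 151, 156),
    "#Nobel": (94, 287, 228, 124),
    "#Islam": (92, 219, 151, 95),
}

tweets = sum(r[0] for r in rows.values())
# Column totals divided by the total number of tweets, rounded to two decimals.
means = [round(sum(r[k] for r in rows.values()) / tweets, 2) for k in (1, 2, 3)]
indirect_to_direct = sum(r[3] for r in rows.values()) / sum(r[2] for r in rows.values())

print(tweets, means)                 # 461 [2.61, 1.84, 1.24]
print(round(indirect_to_direct, 2))  # 0.67
```

The five listed tags cover 461 tweets, and the ratio of about 0.67 matches the reported indirect:direct ≈ 2:3.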
Candidate entity generation
• Direct reference: mention is partial or full name of entity
• Indirect reference: mention is included in entity’s main page in Wikipedia
[Figure: percentage of ground-truth entities found among the candidates (%, y-axis, ticks 30 to 90) against the size of the top candidate entities (x-axis, 10 to 2000), plotted for direct reference and for indirect reference.]
=> Weak for indirect-reference
Baseline method
• Baseline-model: sequence-rank one-by-one
  • $\arg\max_{e \in C} p(y_i = e \mid y_1, \dots, y_{i-1}, y_{i+1}, \dots, y_n)$
  • $C$: candidates for $y_i$
• Baseline-features: Context similarity, Entities' correlation, Mention entity's name, User interest, Occurrence frequency, Keyword
• Our features: Topic-specific translation, Pattern, Writing behavior
[Figure: entities $Y_1, \dots, Y_5$ predicted one by one, labeled with their ranking order (1st to 5th).]
Overall performance
• Our CRF model (or all features) was always better
=> CRF model works regardless of features
=> Multiple features are required
$\mathrm{MRR} = \frac{1}{q} \sum_{i=1}^{q} \frac{1}{\mathrm{rank}_i}$, where $q$: # of tests; $\mathrm{rank}_i$: rank position of the ground-truth entity in test $i$
[Figure: bar chart of MRR (+SEM), y-axis 0 to 0.7, comparing Our CRF with Baseline-model, each using All-feature (including ours) and Baseline-feature; two comparisons are marked **.]
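The MRR definition on this slide translates directly into a few lines of code. The function name and the input format (a list of 1-based ranks of the ground-truth entity, one per test) are assumptions for illustration.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all tests; ranks are 1-based positions."""
    if not ranks:
        raise ValueError("at least one test case is required")
    return sum(1.0 / r for r in ranks) / len(ranks)
```

For example, ground-truth entities ranked 1st, 2nd, and 4th in three tests give (1 + 1/2 + 1/4) / 3, roughly 0.58.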
Feature comparison
• Our features were effective for indirect-reference
[Figure: two bar charts of MRR (+SEM) per feature, distinguishing our features from baseline features.
MRR for indirect-reference (axis 0 to 0.5), features in chart order: Topic-specific translation (Eq. 1a-b), Occurrence frequency, Entities' correlation, Topic-specific translation (Eq. 1c-f), Keyword, Context similarity, Pattern, Writing behavior, Mention entity's name, User interest.
MRR for direct-reference (axis 0 to 0.8), features in chart order: Occurrence frequency, Mention entity's name, Topic-specific translation (Eq. 1a-b), Entities' correlation, Topic-specific translation (Eq. 1c-f), Pattern, Context similarity, Writing behavior, Keyword, User interest.]
Effect of CRF model with adaptive dependency
• Our adaptive dependency was a little worse than the best
  • but note that our complexity is linear
[Figure: bar chart of MRR (+SEM), y-axis 0 to 0.7, for five dependency structures: Fully connected, Adaptive, Occurrence (appearing) order, Random, No dependency. Complexity labels: $O(c^n)$ for fully connected, $O(nc)$ for no dependency, and $O(nc^2)$ for the other three.]
Conclusion
• Contribution:
  • Surveyed indirect-reference on microblogs
  • Effective features for indirect-reference
  • Accurate and efficient: CRF model with adaptive dependency
• Finding:
  • Candidate generation was weak for indirect-reference
  • Limited performance for some of the novel features
  • Multiple features were required when direct and indirect references are mixed
• Thank you for listening