21
104 Microsoft Research β€’ Automatic highlighting β€’ Contextual entity search

Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

104Microsoft Research

β€’ Automatic highlighting

β€’ Contextual entity search

Page 2: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

105Microsoft Research

The Einstein Theory of Relativity

Page 3: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

106Microsoft Research

The Einstein Theory of Relativity

Page 4: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

107Microsoft Research

The Einstein Theory of Relativity

Page 5: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

108Microsoft Research

EntityThe Einstein Theory of Relativity

Page 6: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

109Microsoft Research

ContextThe Einstein Theory of RelativityEntity

Page 7: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

110Microsoft Research

The Einstein Theory of Relativity ContextThe Einstein Theory of RelativityEntity

Page 8: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

111Microsoft Research

Key phrase

ContextEntity page

(reference doc)

Tasks X (source text) Y (target text)

Automatic highlighting Doc in reading Key phrases to be highlighted

Contextual entity search Key phrase and context Entity and its corresponding (wiki) page

Page 9: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

112Microsoft Research

Key phrase

ContextEntity page

(reference doc)

Tasks X (source text) Y (target text)

Automatic highlighting Doc in reading Key phrases to be highlighted

Contextual entity search Key phrase and context Entity and its corresponding (wiki) page

Page 10: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

113Microsoft Research

ray of light

Ray of Light (Experiment)

Ray of Light (Song)

The Einstein Theory of Relativity

Page 11: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

114Microsoft Research

ray of light

Ray of Light (Experiment)

Ray of Light (Song)

The Einstein Theory of Relativity

Page 12: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

115Microsoft Research

β€’ Two interestingness tasks for recommendation

β€’ Modeling interestingness via DSSM

β€’ Training data acquisition

Page 13: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

116Microsoft Research

𝑃 𝐻

…

I spent a lot of time finding music that was motivating and

that I'd also want to listen to through my phone. I could

find none. None! I wound up downloading three Metallica

songs, a Judas Priest song and one from Bush.…

http://runningmoron.blogspot.in/

β€’ (text in 𝑃, anchor text of 𝐻)

𝑃

𝐻

Page 14: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

117Microsoft Research

𝐻 𝑃′

…

I spent a lot of time finding music that was motivating and

that I'd also want to listen to through my phone. I could

find none. None! I wound up downloading three Metallica

songs, a Judas Priest song and one from Bush.…

http://runningmoron.blogspot.in/

β€’ (anchor text of 𝐻 & surrounding words, text in 𝑃′)

http://en.wikipedia.org/wiki/Bush_(band)

Page 15: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

118Microsoft Research

Page 16: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

119Microsoft Research

β€’ Random: Random baseline

β€’ Basic Feat: Boosted decision tree learner with document features, such

as anchor position, freq. of anchor, anchor density, etc.

0.041

0.215

0.062

0.253

0

0.2

0.4

0.6

Random Basic Feat

NDCG@1 NDCG@5

Page 17: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

120Microsoft Research

β€’ + LDA Vec: Basic + Topic model (LDA) vectors [Gamon+ 2013]

β€’ + Wiki Cat: Basic + Wikipedia categories (do not apply to general documents)

β€’ + DSSM Vec: Basic + DSSM vectors

0.041

0.215

0.345

0.505 0.554

0.062

0.253

0.3800.475 0.524

0

0.2

0.4

0.6

Random Basic Feat + LDA Vec + Wiki Cat + DSSM

Vec

NDCG@1 NDCG@5

Page 18: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

121Microsoft Research

source

target

Page 19: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

122Microsoft Research

β€’ BM25: The classical document model in IR [Robertson+ 1994]

β€’ BLTM: Bilingual Topic Model [Gao+ 2011]

0.041

0.215

0.062

0.253

0

0.2

0.4

0.6

0.8

BM25 BLTM

NDCG@1 AUC

Page 20: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

123Microsoft Research

β€’ DSSM-bow: DSSM without convolutional layer and max-pooling structure

β€’ DSSM outperforms classic doc model and state-of-the-art topic model

0.041

0.215 0.223 0.259

0.062

0.253

0.699 0.711

0

0.2

0.4

0.6

0.8

BM25 BLTM DSSM-bow DSSM

NDCG@1 AUC

Page 21: Automatic highlighting Contextual entity searchxye/papers_and_ppts/ppts/entities...Microsoft Research 116 𝐻 … I spent a lot of time finding music that was motivating and that

124Microsoft Research

β€’ Extract labeled pairs from Web browsing logs