Context-Sensitive Information Retrieval Using Implicit Feedback

Xuehua Shen: Department of Computer Science, University of Illinois at Urbana-Champaign
Bin Tan: Department of Computer Science, University of Illinois at Urbana-Champaign
ChengXiang Zhai: Department of Computer Science, University of Illinois at Urbana-Champaign

Presented by Chia-Hao Lee




Outline

• Introduction
• Problem Definition
• Language Models for Context-Sensitive Information Retrieval
  – Basic retrieval model
  – Fixed Coefficient Interpolation (FixInt)
  – Bayesian Interpolation (BayesInt)
  – Online Bayesian Updating (OnlineUp)
  – Batch Bayesian Updating (BatchUp)
• Experiments
• Conclusions and Future Work


Introduction

• In most existing information retrieval models, the retrieval problem is treated as involving one single query and a set of documents.

• From a single query, however, the retrieval system can get only very limited clues about the user’s information need.

• An optimal retrieval system thus should try to exploit as much additional context information as possible to improve retrieval accuracy, whenever it is available.


Introduction

• There are many kinds of context that we can exploit.

• Relevance feedback is known to be effective for improving retrieval accuracy.

• However, relevance feedback requires that a user explicitly provides feedback information, such as specifying the category of the information need or marking a subset of retrieved documents as relevant documents.


Introduction

• A major advantage of implicit feedback is that we can improve the retrieval accuracy without requiring any user effort.

• For example, if the current query is “java”, then without any extra information it is impossible to know whether the user means the Java programming language or the island of Java in Indonesia.


Problem Definition

• There are two kinds of context information we can use for implicit feedback:
  – Short-term context
  – Long-term context

• Short-term context is the immediate surrounding information which throws light on a user’s current information need in a single session.

• A session can be considered as a period consisting of all interactions for the same information need.


Problem Definition

• In a single search session, a user may interact with the search system several times. During interactions, the user would continuously modify the query.

• Therefore, for the current query $Q_k$, there is a query history $H_Q = (Q_1, \ldots, Q_{k-1})$ associated with it, which consists of the preceding queries given by the same user in the current session.

• Indeed, our work has shown that the short-term query history is useful for improving retrieval accuracy.


Problem Definition

• A user would presumably also click on some of the retrieved documents to view them.

• We refer to the data associated with these actions as the clickthrough history $H_C = (C_1, \ldots, C_{k-1})$.

• The clickthrough data may include the title, summary, and perhaps also the content and location of the clicked document.

• Our work has shown positive results using similar clickthrough information.


Language models for context-sensitive information retrieval

• We propose to use statistical language models to model a user’s information need and develop four specific context-sensitive language models to incorporate context information into a basic retrieval model.

• 1. Basic retrieval model

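We compute the KL divergence $D(\theta_Q \,\|\, \theta_D)$ between a query language model $\theta_Q$ and a document language model $\theta_D$, which serves as the score of the document.

One advantage of this approach is that we can naturally incorporate the search context as additional evidence to improve our estimate of the query language model.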
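As a rough illustration of scoring documents by the divergence between a query model and a document model, the following Python sketch ranks two toy documents. The helper names (`mle_model`, `smoothed_doc_model`, `kl_divergence`), the whitespace tokenizer, and the linear interpolation with a collection model (weight `lam`) are assumptions made for this example; the slides do not say how the document model is smoothed.

```python
import math
from collections import Counter

def mle_model(text):
    """Maximum-likelihood unigram model p(w | text) over whitespace tokens."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def smoothed_doc_model(doc, collection_model, lam=0.5):
    """Document model mixed with a collection model (smoothing is an assumption)."""
    doc_model = mle_model(doc)
    return {w: (1 - lam) * doc_model.get(w, 0.0) + lam * p
            for w, p in collection_model.items()}

def kl_divergence(query_model, doc_model):
    """D(theta_Q || theta_D); assumes every query word is in the collection vocabulary."""
    return sum(p * math.log(p / doc_model[w])
               for w, p in query_model.items() if p > 0)

docs = ["java programming language tutorial",
        "java island indonesia travel guide"]
collection_model = mle_model(" ".join(docs))
query_model = mle_model("java programming")

scores = [kl_divergence(query_model, smoothed_doc_model(d, collection_model))
          for d in docs]
# A smaller divergence means the document model is closer to the query model.
ranking = sorted(range(len(docs)), key=lambda i: scores[i])
```

Because the score is a divergence, documents are sorted in ascending order; replacing `query_model` with a context-sensitive estimate changes the ranking without touching the scoring function.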


Language models for context-sensitive information retrieval

• Our task is to estimate a context query model, which we denote by $p(w \mid \theta_k)$, based on the current query $Q_k$, as well as the query history $H_Q$ and the clickthrough history $H_C$.

• We will use $c(w, X)$ to denote the count of word $w$ in text $X$, which could be either a query or a clicked document’s summary or any other text.

• We will use $|X|$ to denote the length of text $X$, that is, the total number of words in $X$.


Language models for context-sensitive information retrieval

• 2. Fixed Coefficient Interpolation (FixInt)

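Our first idea is to summarize the query history $H_Q$ with a unigram language model $p(w \mid H_Q)$ and the clickthrough history $H_C$ with another unigram language model $p(w \mid H_C)$.

$$p(w \mid Q_i) = \frac{c(w, Q_i)}{|Q_i|}, \qquad p(w \mid H_Q) = \frac{1}{k-1} \sum_{i=1}^{k-1} p(w \mid Q_i)$$

$$p(w \mid C_i) = \frac{c(w, C_i)}{|C_i|}, \qquad p(w \mid H_C) = \frac{1}{k-1} \sum_{i=1}^{k-1} p(w \mid C_i)$$

$$p(w \mid H) = \beta \, p(w \mid H_C) + (1 - \beta) \, p(w \mid H_Q)$$

$$p(w \mid \theta_k) = \alpha \, p(w \mid Q_k) + (1 - \alpha) \, p(w \mid H)$$

$$p(w \mid \theta_k) = \alpha \, p(w \mid Q_k) + (1 - \alpha) \left[ \beta \, p(w \mid H_C) + (1 - \beta) \, p(w \mid H_Q) \right]$$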
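The FixInt interpolation can be sketched in a few lines of Python. The function names, the whitespace tokenizer, and the toy session below are assumptions for illustration only, and the sketch assumes a non-empty query history and clickthrough history (k > 1).

```python
from collections import Counter

def unigram(text):
    """p(w | X) = c(w, X) / |X| over whitespace tokens."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def average_models(models):
    """Uniform average of the per-item models: (1/(k-1)) * sum_i p(w | X_i)."""
    vocab = set().union(*models)
    return {w: sum(m.get(w, 0.0) for m in models) / len(models) for w in vocab}

def mix(model_a, model_b, weight_a):
    """weight_a * model_a + (1 - weight_a) * model_b."""
    vocab = set(model_a) | set(model_b)
    return {w: weight_a * model_a.get(w, 0.0)
               + (1 - weight_a) * model_b.get(w, 0.0) for w in vocab}

def fixint(current_query, past_queries, clicked_summaries, alpha, beta):
    """Context query model p(w | theta_k) with fixed coefficients alpha and beta."""
    p_hq = average_models([unigram(q) for q in past_queries])
    p_hc = average_models([unigram(c) for c in clicked_summaries])
    p_h = mix(p_hc, p_hq, beta)                    # beta weights the clickthrough model
    return mix(unigram(current_query), p_h, alpha)  # alpha weights the current query

theta = fixint("java",
               past_queries=["programming language tutorial"],
               clicked_summaries=["java programming language guide"],
               alpha=0.5, beta=0.5)
```

In this toy session the ambiguous query "java" picks up probability mass for "programming" and "language" from the history, which is the effect the interpolation is meant to produce.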


Language models for context-sensitive information retrieval

• 3. Bayesian Interpolation (BayesInt)

One possible problem with the FixInt approach is that the coefficients, especially α, are fixed across all queries.

If our current query $Q_k$ is very long, we should trust the current query more, whereas if $Q_k$ has just one word, it may be beneficial to put more weight on the history.

To capture this intuition, we treat $p(w \mid H_Q)$ and $p(w \mid H_C)$ as Dirichlet priors and $Q_k$ as the observed data, and estimate a context query model using a Bayesian estimator.


Language models for context-sensitive information retrieval

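The estimated model is given by

$$p(w \mid \theta_k) = \frac{c(w, Q_k) + \mu \, p(w \mid H_Q) + \nu \, p(w \mid H_C)}{|Q_k| + \mu + \nu}$$

$$p(w \mid \theta_k) = \frac{|Q_k|}{|Q_k| + \mu + \nu} \, p(w \mid Q_k) + \frac{\mu + \nu}{|Q_k| + \mu + \nu} \left[ \frac{\mu}{\mu + \nu} \, p(w \mid H_Q) + \frac{\nu}{\mu + \nu} \, p(w \mid H_C) \right]$$

$\mu$: the prior sample size for $p(w \mid H_Q)$

$\nu$: the prior sample size for $p(w \mid H_C)$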
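A minimal Python sketch of the BayesInt estimate follows, assuming the two history models are already available as word-to-probability dictionaries. The function names, the toy history models, and the values of `mu` and `nu` are arbitrary choices for this example, not settings taken from the slides.

```python
from collections import Counter

def unigram(text):
    """p(w | X) = c(w, X) / |X| over whitespace tokens."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def bayesint(current_query, p_hq, p_hc, mu, nu):
    """(c(w, Q_k) + mu * p(w|H_Q) + nu * p(w|H_C)) / (|Q_k| + mu + nu)."""
    counts = Counter(current_query.lower().split())
    query_len = sum(counts.values())
    vocab = set(counts) | set(p_hq) | set(p_hc)
    denom = query_len + mu + nu
    return {w: (counts.get(w, 0) + mu * p_hq.get(w, 0.0) + nu * p_hc.get(w, 0.0)) / denom
            for w in vocab}

# Toy history models (hypothetical data).
p_hq = unigram("programming language")
p_hc = unigram("programming tutorial")

short = bayesint("java", p_hq, p_hc, mu=1.0, nu=1.0)
long_ = bayesint("java virtual machine garbage collection", p_hq, p_hc, mu=1.0, nu=1.0)
```

The history word "programming" gets probability 1/3 under the one-word query but only 1/7 under the five-word query: the effective weight on the history, (mu + nu) / (|Q_k| + mu + nu), shrinks as the current query gets longer, which is the behavior FixInt cannot provide with a fixed α.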


Language models for context-sensitive information retrieval

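• 4. Online Bayesian Updating (OnlineUp)

4.1 Bayesian updating

Let $p(w \mid \phi)$ be our current query model and $T$ be a new piece of text evidence observed. To update the query model based on $T$, we use $\phi$ to define a Dirichlet prior parameterized as

$$\mathrm{Dir}\left( \mu_T \, p(w_1 \mid \phi), \ldots, \mu_T \, p(w_N \mid \phi) \right)$$

where $\mu_T$ is the equivalent sample size of the prior. With such a conjugate prior, the predictive distribution of $\phi$ (equivalently, the mean of its posterior distribution) is given by

$$p(w \mid \phi') = \frac{c(w, T) + \mu_T \, p(w \mid \phi)}{|T| + \mu_T}$$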
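A single Bayesian updating step with a Dirichlet prior can be sketched as follows. The function name `bayes_update`, the whitespace tokenizer, and the toy numbers are assumptions for illustration; `prior_size` plays the role of the equivalent sample size of the prior.

```python
from collections import Counter

def bayes_update(prior_model, text, prior_size):
    """Posterior-mean update: (c(w, T) + mu_T * p(w | phi)) / (|T| + mu_T).

    prior_model: dict mapping words to p(w | phi), assumed to sum to 1.
    text:        the newly observed evidence T (a query or a clicked summary).
    prior_size:  equivalent sample size mu_T of the Dirichlet prior (> 0).
    """
    counts = Counter(text.lower().split())
    length = sum(counts.values())
    vocab = set(prior_model) | set(counts)
    return {w: (counts.get(w, 0) + prior_size * prior_model.get(w, 0.0))
               / (length + prior_size)
            for w in vocab}

prior = {"java": 0.5, "island": 0.5}
updated = bayes_update(prior, "java programming", prior_size=2.0)
# java: (1 + 2*0.5)/4, island: (0 + 2*0.5)/4, programming: (1 + 0)/4
```

A larger `prior_size` keeps the updated model closer to the prior, while a smaller one lets the new text dominate.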


Language models for context-sensitive information retrieval

• 4.2 Sequential query model updating

We use whatever information is available before the first query to define a prior on the query model, which is denoted by $\phi'_0$.

After we observe the first query $Q_1$, we can update the query model based on the newly observed data $Q_1$.

The updated query model $\phi_1$ can then be used for ranking documents in response to $Q_1$. As the user views some documents, the displayed summary text for such documents, $C_1$, can serve as new data for us to further update the query model to obtain $\phi'_1$.


Language models for context-sensitive information retrieval

We see two types of updating:

(1) updating based on a new query $Q_i$

(2) updating based on a new clicked summary $C_i$

Thus we have the following updating equations:

$$p(w \mid \phi_i) = \frac{c(w, Q_i) + \mu_i \, p(w \mid \phi'_{i-1})}{|Q_i| + \mu_i}$$

$$p(w \mid \phi'_i) = \frac{c(w, C_i) + \nu_i \, p(w \mid \phi_i)}{|C_i| + \nu_i}$$

$\mu_i$: the equivalent sample size for the prior when updating the model based on a query

$\nu_i$: the equivalent sample size for the prior when updating the model based on a clicked summary
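The alternating query and clickthrough updates of OnlineUp can be sketched as a loop. This is an illustrative sketch only: the function names, the uniform initial model, and the use of constant `mu` and `nu` for every step (instead of per-step values) are assumptions for the example.

```python
from collections import Counter

def bayes_update(prior_model, text, prior_size):
    """(c(w, T) + size * p(w | prior)) / (|T| + size)."""
    counts = Counter(text.lower().split())
    length = sum(counts.values())
    vocab = set(prior_model) | set(counts)
    return {w: (counts.get(w, 0) + prior_size * prior_model.get(w, 0.0))
               / (length + prior_size)
            for w in vocab}

def online_up(initial_model, queries, clicked_summaries, mu, nu):
    """Return the model for the last query, given Q_1..Q_k and C_1..C_{k-1}."""
    phi_prime = initial_model                      # prior before the first query
    phi = initial_model
    for i, query in enumerate(queries):
        phi = bayes_update(phi_prime, query, mu)   # update on the new query Q_i
        if i < len(clicked_summaries):             # update on the clicked summary C_i
            phi_prime = bayes_update(phi, clicked_summaries[i], nu)
        else:
            phi_prime = phi
    return phi

vocab = ["java", "programming", "island", "language"]
phi0 = {w: 1.0 / len(vocab) for w in vocab}        # hypothetical uniform prior
theta = online_up(phi0,
                  queries=["java", "java language"],
                  clicked_summaries=["java programming language"],
                  mu=2.0, nu=5.0)
```

Here the clicked summary pushes mass toward "programming" and away from "island", even though neither word appears in the current query.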


Language models for context-sensitive information retrieval

• 5. Batch Bayesian updating (BatchUp)

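BatchUp first updates the query model using the queries only, and then applies all of the clickthrough data together in a single batch. The updating equations are as follows.

$$p(w \mid \phi_i) = \frac{c(w, Q_i) + \mu_i \, p(w \mid \phi_{i-1})}{|Q_i| + \mu_i}$$

$$p(w \mid \psi_i) = \frac{\sum_{j=1}^{i-1} c(w, C_j) + \nu_i \, p(w \mid \phi_i)}{\sum_{j=1}^{i-1} |C_j| + \nu_i}$$

$\mu_i$: the same interpretation as in OnlineUp

$\nu_i$: indicates to what extent we want to trust the clicked summaries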
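A rough Python sketch of the batch variant follows: queries are folded in one at a time, and the pooled counts of all clicked summaries are applied once at the end. The function names, the uniform initial model, and the constant `mu` and `nu` are assumptions for this example.

```python
from collections import Counter

def bayes_update(prior_model, text, prior_size):
    """(c(w, T) + size * p(w | prior)) / (|T| + size)."""
    counts = Counter(text.lower().split())
    length = sum(counts.values())
    vocab = set(prior_model) | set(counts)
    return {w: (counts.get(w, 0) + prior_size * prior_model.get(w, 0.0))
               / (length + prior_size)
            for w in vocab}

def batch_up(initial_model, queries, clicked_summaries, mu, nu):
    """Update on Q_1..Q_k first, then on the pooled summaries C_1..C_{k-1}."""
    phi = initial_model
    for query in queries:
        phi = bayes_update(phi, query, mu)
    # Concatenating the summaries yields sum_j c(w, C_j) and sum_j |C_j|.
    pooled = " ".join(clicked_summaries)
    if not pooled.strip():
        return phi                                 # no clickthrough data yet
    return bayes_update(phi, pooled, nu)

vocab = ["java", "programming", "island", "language"]
phi0 = {w: 1.0 / len(vocab) for w in vocab}        # hypothetical uniform prior
psi = batch_up(phi0,
               queries=["java", "java language"],
               clicked_summaries=["java programming language"],
               mu=2.0, nu=5.0)
```

Because the clickthrough counts enter only in the final step, their influence is not diluted by later query updates, and `nu` alone controls how much they are trusted relative to the query-based model.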


Experiments


Conclusions

• In this paper, we have explored how to exploit implicit feedback information, including query history and clickthrough history within the same search session, to improve information retrieval performance.

• Experimental results show that using implicit feedback, especially clickthrough history, can substantially improve retrieval performance without requiring any additional user effort.