
Page 1:

Relevance-Based Language Models

Victor Lavrenko and W. Bruce Croft
Department of Computer Science
University of Massachusetts, Amherst, MA 01003
SIGIR 2001

Presented by Yi-Ting

Page 2:

Outline

Introduction
Related Work: Classical Probabilistic Model, Language Modeling Approaches
Relevance Model
Experimental Results
Conclusions

Page 3:

Introduction

This work marks a departure from the traditional models of relevance, prompted by the emergence of language modeling frameworks in information retrieval.

It introduces a technique for constructing a relevance model from the query alone, and compares the resulting query-based relevance models to relevance models constructed with training data.

Page 4:

Related Work

Classical Probabilistic Models

Probability ranking principle: rank documents by the ratio P(D|R)/P(D|N).
Generative language models estimate the word probabilities P(w|R) and P(w|N).
Binary Independence Model (multiple-Bernoulli language models):

$$\frac{P(D|R)}{P(D|N)} = \prod_{w \in D} \frac{P(w|R)}{P(w|N)} \prod_{w \notin D} \frac{1 - P(w|R)}{1 - P(w|N)}$$

2-Poisson Model

Page 5:

Related Work

Language Modeling Approaches

These approaches view documents themselves as models, and queries as strings of text randomly sampled from those models.

Ponte and Croft use a multiple-Bernoulli model:

$$P(Q \mid M_D) = \prod_{w \in Q} P(w \mid M_D) \prod_{w \notin Q} \left(1 - P(w \mid M_D)\right)$$

Miller et al. and Song and Croft use a multinomial model:

$$P(Q \mid M_D) = \prod_{w} P(w \mid M_D)^{\,q_w}$$

Berger and Lafferty view the query Q as a potential translation of the document D, and use powerful estimation techniques.
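To make the query-likelihood baseline concrete, here is a minimal Python sketch of the multinomial variant. The linear smoothing against a collection model P(w|G) anticipates the Estimation Details slide; the parameter lam=0.6 and all names are illustrative assumptions, not the authors' implementation.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, coll_counts, coll_len, lam=0.6):
    """log P(Q|M_D) under the multinomial model, with the linearly
    smoothed document model P(w|M_D) = lam*tf(w,D)/|D| + (1-lam)*P(w|G)."""
    tf = Counter(doc)
    dlen = len(doc)
    score = 0.0
    for q in query:
        p_doc = tf[q] / dlen if dlen else 0.0
        p_coll = coll_counts.get(q, 0) / coll_len
        p = lam * p_doc + (1 - lam) * p_coll
        score += math.log(p) if p > 0 else math.log(1e-12)  # floor for safety
    return score
```

Documents would then be ranked by this log-likelihood in descending order; this is the baseline the paper compares against.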

Page 6:

Relevance Model

A relevance model is a mechanism that determines the probability P(w|R) of observing a word w in the documents relevant to a given information need.

The paper presents a formally justified way of estimating the relevance model when no training data is available.

Ranking with a Relevance Model:

$$\frac{P(D|R)}{P(D|N)} \approx \prod_{w \in D} \frac{P(w|R)}{P(w|N)}$$
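Once P(w|R) is estimated, ranking reduces to a sum of log ratios over each document's distinct words. In the sketch below, the non-relevant distribution P(w|N) would typically be approximated by collection statistics; the names and the eps floor are assumptions.

```python
import math

def rank_by_relevance_model(docs, p_rel, p_nonrel, eps=1e-12):
    """Score each document by sum over its distinct words of
    log(P(w|R) / P(w|N)), i.e. the log of the product above.
    docs: {doc_id: list of terms}; p_rel, p_nonrel: word -> probability."""
    scores = {}
    for doc_id, terms in docs.items():
        scores[doc_id] = sum(
            math.log(p_rel.get(w, eps) / p_nonrel.get(w, eps))
            for w in set(terms))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```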

Page 7:

Relevance Model

We do not assume that the query is a sample from any specific document model. Instead, we assume that both the query and the relevant documents are samples from an unknown relevance model R.

Page 8:

Relevance Model

Let Q = q_1...q_k. We estimate P(w|R) by the conditional probability of observing w given that we just observed q_1...q_k:

$$P(w|R) \approx P(w \mid q_1 \dots q_k) = \frac{P(w,\, q_1 \dots q_k)}{P(q_1 \dots q_k)}$$

Page 9:

Relevance Model

Method 1: i.i.d. sampling. Assume that the query words q_1...q_k and the word w in relevant documents are sampled identically and independently from a unigram distribution M_R.

$$P(w,\, q_1 \dots q_k) = \sum_{M \in \mathcal{M}} P(M)\, P(w,\, q_1 \dots q_k \mid M)$$

$$P(w,\, q_1 \dots q_k \mid M) = P(w \mid M) \prod_{i=1}^{k} P(q_i \mid M)$$

$$P(w,\, q_1 \dots q_k) = \sum_{M \in \mathcal{M}} P(M)\, P(w \mid M) \prod_{i=1}^{k} P(q_i \mid M)$$
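A compact Python sketch of Method 1 follows. Per the paper, the universe of models M can be taken as the smoothed unigram models of the top-retrieved documents; the uniform prior P(M) and the 1e-10 probability floor are illustrative assumptions.

```python
def relevance_model_iid(query, doc_models, vocab, prior=None):
    """Method 1 (i.i.d. sampling):
    P(w, q1..qk) = sum_M P(M) P(w|M) prod_i P(qi|M),
    normalized over the vocabulary to give P(w|R).
    doc_models: list of dicts mapping word -> smoothed P(w|M)."""
    n = len(doc_models)
    prior = prior or [1.0 / n] * n  # uniform P(M) over models (assumed)
    # P(q1..qk | M) for each model; this factor does not depend on w
    q_liks = []
    for model in doc_models:
        lik = 1.0
        for q in query:
            lik *= model.get(q, 1e-10)
        q_liks.append(lik)
    joint = {w: sum(p * lik * m.get(w, 1e-10)
                    for p, lik, m in zip(prior, q_liks, doc_models))
             for w in vocab}
    z = sum(joint.values())  # P(q1..qk) = sum_w P(w, q1..qk)
    return {w: pw / z for w, pw in joint.items()}
```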

Page 10:

Relevance Model

Method 2: conditional sampling. Fix a value of w according to some prior P(w); then pick a distribution M_i according to P(M_i|w), and sample the query word q_i from M_i with probability P(q_i|M_i).

Assume the query words q_1...q_k to be independent of each other, but keep their dependence on w:

$$P(w,\, q_1 \dots q_k) = P(w) \prod_{i=1}^{k} P(q_i \mid w)$$

$$P(q_i \mid w) = \sum_{M_i \in \mathcal{M}} P(M_i \mid w)\, P(q_i \mid M_i)$$

$$P(w,\, q_1 \dots q_k) = P(w) \prod_{i=1}^{k} \sum_{M_i \in \mathcal{M}} P(M_i \mid w)\, P(q_i \mid M_i)$$
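A matching sketch of Method 2, under the same assumptions as the Method 1 sketch (uniform prior over document models, small probability floor; names are illustrative). P(M|w) is expanded by Bayes' rule, as on the Estimation Details slide.

```python
def relevance_model_conditional(query, doc_models, vocab, prior=None):
    """Method 2 (conditional sampling):
    P(w, q1..qk) = P(w) * prod_i sum_M P(M|w) P(qi|M),
    with P(M|w) = P(w|M) P(M) / P(w) by Bayes' rule."""
    n = len(doc_models)
    prior = prior or [1.0 / n] * n  # uniform P(M) (assumed)
    # P(w) = sum_M P(w|M) P(M)
    p_w = {w: sum(p * m.get(w, 1e-10) for p, m in zip(prior, doc_models))
           for w in vocab}
    joint = {}
    for w in vocab:
        value = p_w[w]
        for q in query:
            # sum_M P(M|w) P(q|M) = (1/P(w)) * sum_M P(M) P(w|M) P(q|M)
            value *= sum(p * m.get(w, 1e-10) * m.get(q, 1e-10)
                         for p, m in zip(prior, doc_models)) / p_w[w]
        joint[w] = value
    z = sum(joint.values())
    return {w: pw / z for w, pw in joint.items()}
```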

Page 11:

Relevance Model

Method 2 performs slightly better in terms of retrieval effectiveness and tracking errors.

Page 12:

Relevance Model

Estimation Details

Set the query prior P(q_1...q_k) in equation (6) to be:

$$P(q_1 \dots q_k) = \sum_{w} P(w,\, q_1 \dots q_k)$$

Set P(w) to be:

$$P(w) = \sum_{M \in \mathcal{M}} P(w \mid M)\, P(M)$$

Set P(w|M_D) to be a smoothed maximum-likelihood estimate, where P(w|G) is the collection model:

$$P(w \mid M_D) = \lambda\, \frac{tf(w,D)}{\sum_{v} tf(v,D)} + (1-\lambda)\, P(w \mid G)$$

Set P(M_i|w) by Bayes' rule:

$$P(M_i \mid w) = \frac{P(w \mid M_i)\, P(M_i)}{P(w)}$$
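The smoothed document model above is straightforward to implement; a minimal sketch, assuming a default lambda of 0.6 (the actual value is a tuned parameter) and illustrative names:

```python
from collections import Counter

def smoothed_doc_model(doc, coll_counts, coll_len, lam=0.6):
    """Return a function computing
    P(w|M_D) = lam * tf(w,D) / sum_v tf(v,D) + (1-lam) * P(w|G).
    Unseen words still receive mass from the collection model P(w|G)."""
    tf = Counter(doc)
    dlen = max(len(doc), 1)  # guard against empty documents
    def p(w):
        return lam * tf[w] / dlen + (1 - lam) * coll_counts.get(w, 0) / coll_len
    return p
```

Returning a closure rather than a dict keeps the collection-model fallback for words that never occur in D, which the relevance-model estimates above rely on.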

Page 13:

Experimental Results

Model cross-entropy

Cross-entropy is an information-theoretic measure of the distance between two distributions, measured in bits.

Both models exhibit their lowest cross-entropy around 50 documents; however, the model estimated with Method 2 achieves lower absolute cross-entropy.
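For reference, the measure itself can be computed as below. This is a generic sketch of cross-entropy, not the paper's evaluation code; the eps floor and the choice of reference distribution (e.g., a model built from truly relevant documents) are assumptions.

```python
import math

def cross_entropy(p_true, p_model, eps=1e-12):
    """H(p_true, p_model) = -sum_w p_true(w) * log2(p_model(w)), in bits.
    p_true: reference distribution; p_model: estimated relevance model."""
    return -sum(p * math.log2(max(p_model.get(w, 0.0), eps))
                for w, p in p_true.items() if p > 0)
```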

Page 14:

Experimental Results

Page 15:

Experimental Results

TREC ad-hoc retrieval

Two query sets: TREC title queries 101-150 and 151-200.
Baseline: the performance of the language modeling approach, where documents are ranked by their probability of generating the query.
Evaluation measures: average precision, the Wilcoxon test, and R-precision.
Relevance models are an attractive choice in high-precision applications.

Page 16:

Experimental Results

Page 17:
Page 18:

Experimental Results

TDT topic tracking

The tracking task in TDT is evaluated using the Detection Error Tradeoff (DET) curve: the tradeoff between misses and false alarms.
The tracking system was run on a modified version of the task.
It was compared against TDT tracking systems trained with 1 to 4 examples.

Page 19:

Experimental Results

Page 20:

Conclusions

Our technique is also relatively insensitive to the number of documents used in “expansion”.

Our model is simple to implement and does not require any training data.

There are a number of interesting directions for further investigation of relevance models.

Our method produces accurate models of relevance, which leads to significantly improved retrieval performance.

We demonstrated that unsupervised relevance models can be competitive with supervised topic models in TDT.