
Language Modeling for Information Retrieval

Manoj Kumar Chinnakotla

KReSIT, IIT Bombay

Language Technologies for the Web, Mar 2006

Manoj Kumar Chinnakotla Language Modeling for Information Retrieval

Outline

1 Introduction

2 The Classical Approach

3 The Language Modeling Approach

4 Smoothing Techniques

5 Relation with Classical Approach

Probabilistic Models

The Central Problem in IR

Is this Document Relevant?

Probabilistic Models

Model uncertainties in the problem well

Example

Is this term relevant?

Is this document relevant?

The Random Variables

Relevance (R): $R \in \{0, 1\}$

Documents (D): $D \in \{D_1, D_2, \ldots, D_N\}$

Query (Q): $Q \in \{\text{All Possible Queries}\}$

A Term ($A_i$): $A_i \in \{0, 1\}$

The Ranking Function

Rank documents based on Posterior Probability of Relevance

$$\mathrm{Score}(D, Q) = P(R = 1 \mid D, Q) \tag{1}$$

Ranking using the following log-odds ratio is equivalent

$$\mathrm{Score}(D, Q) = \log \frac{P(R = 1 \mid D, Q)}{P(R = 0 \mid D, Q)} \tag{2}$$
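The equivalence holds because the log-odds is a strictly increasing function of the posterior, so sorting by either score gives the same order. A minimal sketch, assuming hypothetical posterior values for four documents (the names `posteriors` and `log_odds` are illustrative, not from the slides):

```python
import math

def log_odds(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

# Hypothetical posteriors P(R=1 | D, Q); any values in (0, 1) would do.
posteriors = {"D1": 0.80, "D2": 0.35, "D3": 0.60, "D4": 0.05}

# Rank once by the posterior (Equation 1) and once by the log-odds (Equation 2).
by_posterior = sorted(posteriors, key=lambda d: posteriors[d], reverse=True)
by_log_odds = sorted(posteriors, key=lambda d: log_odds(posteriors[d]), reverse=True)

print(by_posterior)
print(by_log_odds)
```

Both lists come out in the same order, which is all a ranking function needs.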

Probabilistic Ranking Principle

Due to Robertson [6]

Central theorem for Probabilistic IR

Theorem

Ranking documents using the log-odds ratio of posterior probability of relevance is optimal with respect to various retrieval measures (like Average Precision).

The Classical Approach

Due to Robertson-Sparck Jones [7]

Generative Model of Relevance

Rank documents based on the following log-odds ratio

$$\mathrm{Score}(D, Q) = \log \frac{P(D \mid Q, R = 1)}{P(D \mid Q, R = 0)} \tag{3}$$

For query Q, most of collection C is irrelevant

$$P(\cdot \mid Q, R = 0) \approx P(\cdot \mid Q, C) \tag{4}$$
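The step from the posterior log-odds to the generative log-odds is not spelled out on the slide. One way to see it, sketched here with Bayes' rule applied separately to the numerator and the denominator (the factor $P(D \mid Q)$ cancels):

```latex
\begin{align*}
\log \frac{P(R = 1 \mid D, Q)}{P(R = 0 \mid D, Q)}
  &= \log \frac{P(D \mid Q, R = 1)\, P(R = 1 \mid Q)}{P(D \mid Q, R = 0)\, P(R = 0 \mid Q)} \\
  &= \underbrace{\log \frac{P(D \mid Q, R = 1)}{P(D \mid Q, R = 0)}}_{\text{generative log-odds}}
   + \underbrace{\log \frac{P(R = 1 \mid Q)}{P(R = 0 \mid Q)}}_{\text{same for every } D}
\end{align*}
```

The second term depends only on the query, so dropping it does not change the ranking of documents for that query.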

Binary Independence Retrieval Model

Assuming term independence, we have

$$\mathrm{Score}(D, Q) = \sum_{A_i \in D} \log \frac{P(A_i \mid Q, R = 1)}{P(A_i \mid Q, C)} \tag{5}$$

Popularly known as the “Binary Independence Retrieval (BIR)” Model

Need to estimate Relevance Distribution $P(\cdot \mid Q, R = 1)$
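A minimal sketch of the BIR sum over the terms present in a document. The probability tables `p_rel` and `p_coll` are hypothetical numbers chosen for illustration, and skipping terms that have no estimate is an assumption of this sketch rather than something the slides specify.

```python
import math

def bir_score(doc_terms, p_rel, p_coll):
    """Sum log(P(A_i | Q, R=1) / P(A_i | Q, C)) over distinct terms in the document."""
    score = 0.0
    for term in set(doc_terms):
        # Assumption: terms without an estimate contribute nothing to the score.
        if term in p_rel and term in p_coll:
            score += math.log(p_rel[term] / p_coll[term])
    return score

# Hypothetical estimates of P(A_i | Q, R=1) and P(A_i | Q, C).
p_rel = {"language": 0.6, "model": 0.5, "retrieval": 0.7}
p_coll = {"language": 0.1, "model": 0.2, "retrieval": 0.05}

docs = {
    "D1": ["language", "model", "retrieval"],
    "D2": ["language", "weather"],
    "D3": ["weather", "forecast"],
}

ranking = sorted(docs, key=lambda d: bir_score(docs[d], p_rel, p_coll), reverse=True)
print(ranking)
```

Terms that are much more likely under relevance than in the collection as a whole contribute the largest positive weights.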

Are we back to the Original Problem?

Estimating $P(\cdot \mid Q, R = 1)$ is equivalent to solving the original problem!

Challenge - No sample relevant documents available initially

Current Approaches

Choose some initial estimates for $P(w \mid Q, R = 1)$

Iteratively assume top k documents retrieved are relevant

Update estimates

Accuracy depends on initial separation achieved
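The three steps above can be sketched as a small loop. Everything specific here is an assumption made for illustration: the initial estimate of 0.5, the add-0.5 smoothing, the toy documents, and the function name `estimate_relevance_model` do not come from the slides.

```python
import math

def estimate_relevance_model(docs, query, k=2, iterations=3):
    """Iteratively re-estimate P(w | Q, R=1) from the current top-k documents."""
    n = len(docs)
    doc_sets = {name: set(terms) for name, terms in docs.items()}
    # Collection estimate P(w | Q, C): smoothed fraction of documents containing w.
    p_coll = {w: (sum(w in s for s in doc_sets.values()) + 0.5) / (n + 1.0)
              for w in query}
    # Step 1: choose an initial estimate for P(w | Q, R=1) (assumed value).
    p_rel = {w: 0.5 for w in query}

    def score(name):
        # BIR-style log-odds sum over query terms present in the document.
        return sum(math.log(p_rel[w] / p_coll[w])
                   for w in query if w in doc_sets[name])

    top = []
    for _ in range(iterations):
        # Step 2: assume the top-k documents under the current estimates are relevant.
        top = sorted(doc_sets, key=score, reverse=True)[:k]
        # Step 3: update the estimates from those documents (add-0.5 smoothing).
        p_rel = {w: (sum(w in doc_sets[d] for d in top) + 0.5) / (k + 1.0)
                 for w in query}
    return p_rel, top

docs = {
    "D1": ["language", "model", "retrieval"],
    "D2": ["language", "model", "smoothing"],
    "D3": ["weather", "forecast", "model"],
    "D4": ["cricket", "score"],
}
p_rel, top = estimate_relevance_model(docs, ["language", "retrieval"])
print(top, p_rel)
```

In the first pass D2, D3 and D4 all tie at a score of zero, so which one enters the top k is decided by input order alone; this is the kind of weak initial separation the slide warns about.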

The Language Modeling Approach

Basic Idea (Ponte and Croft [5])

Assuming document D is relevant, what is the likelihood of the user choosing the current query Q to retrieve D?

Model the language of each document as a distribution over words (Unigram)

Individual document distributions $P(w \mid D)$ called “Language Models”

Rank documents based on posterior probability of document given query

$$P(D \mid Q) \propto \underbrace{P(Q \mid D)}_{\text{Query Likelihood}} \cdot \overbrace{P(D)}^{\text{Document Prior}} \tag{6}$$

The normalizer $P(Q)$ is the same for every document, so it does not affect the ranking.

A Shift in Paradigm

Some Immediate Benefits

Allows integration of document importance through Document Prior $P(D)$

Document Prior could be estimated from Link Analysis algorithms (PageRank, HITS)

Ease of Estimation - Document size usually larger than the query

Document Language Models $P(w \mid D)$ could be pre-computed at index time

Assuming uniform document priors, the Query Likelihood Ranking Function is given by

$$\mathrm{Score}(D, Q) = \prod_{w \in Q} P(w \mid D) \tag{7}$$
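A minimal sketch of query-likelihood scoring under uniform priors, taking the product over the words of the query Q and working in log space to avoid underflow. The unsmoothed maximum likelihood model and the toy documents are assumptions for illustration; `mle_model` and `query_log_likelihood` are hypothetical helper names.

```python
import math
from collections import Counter

def mle_model(doc_terms):
    """Unsmoothed unigram model: relative frequency of each word in the document."""
    counts = Counter(doc_terms)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def query_log_likelihood(query_terms, model):
    """log of prod_{w in Q} P(w | D); -inf if any query word has zero probability."""
    log_p = 0.0
    for w in query_terms:
        p = model.get(w, 0.0)
        if p == 0.0:
            return float("-inf")
        log_p += math.log(p)
    return log_p

d1 = mle_model(["language", "model", "language", "retrieval"])
d2 = mle_model(["language", "weather", "weather", "weather"])
query = ["language", "retrieval"]

score_d1 = query_log_likelihood(query, d1)
score_d2 = query_log_likelihood(query, d2)
print(score_d1, score_d2)
```

The second document never mentions "retrieval", so its likelihood collapses to zero even though it matches the other query word.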

Smoothing Techniques

Motivation

The Maximum Likelihood Estimator (MLE) for $P(w \mid D)$ is given by

$$P_{ml}(w \mid D) = \frac{c(w; D)}{\sum_{w' \in D} c(w'; D)} \tag{8}$$

Since document length is limited, the MLE $P_{ml}$

Assigns zero probability to words not observed in D

Has high variance

Solution - Smoothing the MLE using collection model $P(w \mid C)$

Example

Jelinek-Mercer Smoothing

$$P_{\lambda}(w \mid D) = \lambda P_{ml}(w \mid D) + (1 - \lambda) P(w \mid C) \tag{9}$$
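A minimal sketch of the interpolation in Equation (9), assuming the collection model is itself a maximum likelihood estimate over all tokens in the collection. The value `lam=0.5` and the toy token lists are hypothetical; the slides do not suggest a value for the mixing weight.

```python
from collections import Counter

def jelinek_mercer(doc_terms, collection_terms, lam=0.5):
    """Return a function w -> lam * P_ml(w | D) + (1 - lam) * P(w | C)."""
    doc_counts = Counter(doc_terms)
    coll_counts = Counter(collection_terms)
    doc_len = sum(doc_counts.values())
    coll_len = sum(coll_counts.values())

    def prob(w):
        p_ml = doc_counts[w] / doc_len      # Counter returns 0 for unseen words
        p_coll = coll_counts[w] / coll_len  # collection model P(w | C)
        return lam * p_ml + (1.0 - lam) * p_coll

    return prob

doc = ["language", "model", "language", "retrieval"]
collection = doc + ["weather", "forecast", "weather", "model"]

p = jelinek_mercer(doc, collection, lam=0.5)
print(p("language"), p("weather"))
```

A word absent from the document, such as "weather" here, now receives a small non-zero probability from the collection model instead of zero.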

Modeling of Relevance

Relation with Classical Approach

Figure: Two different factorizations of the same joint $P(D, Q \mid R)$

Two Approaches Equivalent

LM Approach makes additional assumptions

Justification for Assumptions

For a given document D, a language model is actually a model of the queries to which the document is relevant.
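The figure itself did not survive extraction. By the chain rule, a joint of this form can be split in the two ways below; labeling the first as the classical (document generation) view and the second as the language modeling (query generation) view is an assumption based on the title of reference [2].

```latex
\begin{align*}
P(D, Q \mid R) &= P(D \mid Q, R)\, P(Q \mid R) && \text{(document generation)} \\
               &= P(Q \mid D, R)\, P(D \mid R) && \text{(query generation)}
\end{align*}
```

Both lines describe the same joint distribution, which is the sense in which the two approaches are equivalent before any further assumptions are made.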

Where is Relevance?

Notion of relevance assumed implicitly in the model

This is a problem while handling “Relevance Feedback”

Solution - Query Models or Relevance Models [3, 4]

Relevance Model or Query Model - Distribution encoding the information need

Assume query Q to be a sample from Relevance Model $\theta_R$

New Ranking Function - Divergence Based

$$\mathrm{Score}(D) = KL(D \,\|\, \theta_R) = \sum_{w} P(w \mid D) \cdot \log \frac{P(w \mid D)}{P(w \mid \theta_R)} \tag{10}$$
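A minimal sketch of the divergence in Equation (10). It assumes both distributions are already smoothed so that the relevance model is non-zero wherever the document model is, and it assumes that a smaller divergence indicates a closer match, which the slide does not state explicitly. The distributions are hypothetical.

```python
import math

def kl_divergence(p_doc, p_rel):
    """KL(D || theta_R) = sum_w P(w | D) * log(P(w | D) / P(w | theta_R))."""
    total = 0.0
    for w, p in p_doc.items():
        if p > 0.0:  # terms with P(w | D) = 0 contribute nothing
            total += p * math.log(p / p_rel[w])
    return total

# Hypothetical relevance model and two smoothed document models.
theta_r = {"language": 0.5, "retrieval": 0.3, "weather": 0.2}
doc_models = {
    "D1": {"language": 0.5, "retrieval": 0.3, "weather": 0.2},
    "D2": {"language": 0.1, "retrieval": 0.1, "weather": 0.8},
}

# Assumption: rank by ascending divergence from the relevance model.
ranking = sorted(doc_models, key=lambda d: kl_divergence(doc_models[d], theta_r))
print(ranking)
```

A document whose language model matches the relevance model exactly has divergence zero, and the score grows as the two distributions drift apart.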

Implications for the LM Approach

Problem of Retrieval $\Rightarrow$ Estimating Two Distributions

Relevance Model $P(w \mid \theta_R)$

Document Language Models $P(w \mid D)$

Offers natural way to incorporate “Relevance Feedback”

Given relevant documents, update Relevance Model $\theta_R$
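One simple way such an update could look is a linear interpolation between the current relevance model and the pooled word frequencies of the feedback documents. This is a sketch under that assumption, not the estimation method of [3, 4]; the weight `alpha` and the function name `update_relevance_model` are hypothetical.

```python
from collections import Counter

def update_relevance_model(p_rel, relevant_docs, alpha=0.5):
    """Mix the current relevance model with the pooled MLE of the feedback documents."""
    counts = Counter(w for doc in relevant_docs for w in doc)
    total = sum(counts.values())
    if total == 0:
        return dict(p_rel)  # no feedback text: keep the model unchanged
    vocab = set(p_rel) | set(counts)
    return {
        w: (1.0 - alpha) * p_rel.get(w, 0.0) + alpha * counts[w] / total
        for w in vocab
    }

# Hypothetical starting model built from the query alone.
theta_r = {"language": 0.5, "retrieval": 0.5}
feedback = [["language", "model"], ["retrieval", "model"]]

updated = update_relevance_model(theta_r, feedback)
print(updated)
```

The word "model" was not in the original query model, yet it gains probability mass because it appears in both feedback documents.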

References

[1] CROFT, W. B., AND LAFFERTY, J. Language Modeling for Information Retrieval. Kluwer Academic Publishers, 2003.

[2] LAFFERTY, J., AND ZHAI, C. Probabilistic Relevance Models Based on Document and Query Generation. In Language Modeling for Information Retrieval (2003), vol. 13, Kluwer International Series on IR, pp. 1–10.

[3] LAFFERTY, J., AND ZHAI, C. Document Language Models, Query Models, and Risk Minimization for Information Retrieval. In SIGIR ’01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval (New York, NY, USA, 2001), ACM Press, pp. 111–119.

[4] LAVRENKO, V., AND CROFT, W. B. Relevance Based Language Models. In SIGIR ’01: Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval (New York, NY, USA, 2001), ACM Press, pp. 120–127.

[5] PONTE, J. M., AND CROFT, W. B. A Language Modeling Approach to Information Retrieval. In SIGIR ’98: Proceedings of the ACM SIGIR conference on Research and Development in Information Retrieval (1998), pp. 275–281.

[6] ROBERTSON, S. E. The Probability Ranking Principle in IR. Readings in Information Retrieval (1997), 281–286.

[7] ROBERTSON, S. E., AND SPARCK JONES, K. Relevance Weighting of Search Terms. Journal of the American Society for Information Science 27 (1976), 129–146.

[8] BAEZA-YATES, R., AND RIBEIRO-NETO, B. Modern Information Retrieval. Pearson Education, 2005.

Thank You
