Page 1

Analyzing Retrieval Models using Retrievability Measurement

Shariq Bashir

Supervisor: ao. Univ. Prof. Dr. Andreas Rauber

Institute of Software Engineering and Interactive Systems

Vienna University of Technology

[email protected]

http://www.ifs.tuwien.ac.at/~bashir/

Page 2

Outline

Introduction to Retrievability (Findability) Measure

Setup for Experiments

Findability Scoring Functions

Relationship between Findability and Query Characteristics

Relationship between Findability and Document Features

Relationship between Findability and Effectiveness Measures

Page 3

Introduction

Retrieval Systems are used for searching information

Rely on retrieval models for ranking documents

How to select the best Retrieval Model?

Evaluate Retrieval Models

State of the Art
– Effectiveness Analysis, or
– Efficiency (Speed/Memory)

Page 4

Effectiveness Measures

Effectiveness measures (Precision, Recall, MAP) depend upon
– Few topics
– Few judged documents

Suitable for precision-oriented retrieval tasks

Less suitable for recall-oriented retrieval tasks (e.g., patent or legal retrieval)

Page 5

Findability Measure

Considers all documents

The goal is to maximize the findability of documents

Under a Retrieval Model with higher findability, documents are easier to find than under a Retrieval Model with lower findability

Applications
– Offers another measure for comparing Retrieval Models
– Identifies the subsets of documents that are hard or easy to find

Page 6

Findability Measure

Factors that affect Findability

1. User Query
– [Query = Data Mining books] vs. [Query = Han Kamber books]
• for searching the book “Data Mining Concepts and Techniques”

2. The maximum number of top links/docs checked

3. The ranking strategy of Retrieval Models

Page 7

Retrievability Measure

[Leif Azzopardi and Vishwa Vinay, CIKM 2008] Given a collection D of documents and a query set Q, the retrievability of a document d ∈ D is
r(d) = Σ_{q ∈ Q} f(k_dq, c)
– k_dq = rank of d in the result set of query q ∈ Q
– c = the point in the rank list where the user stops
– f(k_dq, c) = 1 if k_dq ≤ c, and 0 otherwise

Gini-Coefficient: summarizes the findability scores into a single retrieval-bias value (a sketch follows)
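To make the measure concrete, here is a minimal Python sketch (not taken from the thesis) of the cumulative retrievability score and of the Gini coefficient over those scores; `run_query` is a hypothetical stand-in for the retrieval model, returning a ranked list of document ids.

```python
from typing import Callable, Dict, Iterable, List

def retrievability(docs: Iterable[str], queries: Iterable[str],
                   run_query: Callable[[str], List[str]], c: int) -> Dict[str, int]:
    """r(d): number of queries that rank d within the top-c results."""
    r = {d: 0 for d in docs}
    for q in queries:
        for rank, d in enumerate(run_query(q), start=1):  # ranked result list
            if rank > c:                                   # user stops at cutoff c
                break
            if d in r:
                r[d] += 1                                  # f(k_dq, c) = 1
    return r

def gini(scores: List[float]) -> float:
    """Gini coefficient of the findability scores (0 = no bias, 1 = maximal bias)."""
    xs = sorted(scores)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    return sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1)) / (n * total)
```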

Page 8

Outline

Introduction to Findability Measure

Setup for Experiments

Retrievability Scoring Functions

Relationship between Findability and Query Characteristics

Relationship between Findability and Document Features

Relationship between Findability and Effectiveness Measures

Page 9

Setup for Experiments

Collections
1. TREC Chemical Retrieval Track Collection 2009 (TREC-CRT)
2. USPTO Patent Collections
• USPC Class 433 (Dentistry) (DentPat)
• USPC Class 422 (Chemical apparatus and process disinfecting, deodorizing, preserving, or sterilizing) (ChemAppPat)
3. Austrian News Dataset (ATNews)

TREC-CRT and ATNews are more skewed; the USPTO collections are less skewed

Page 10

Setup for Experiments

Retrieval Models
– Standard Retrieval Models
• TFIDF, NormTFIDF, BM25, SMART
– Language Models
• Jelinek-Mercer Smoothing (JM), Dirichlet Smoothing (DirS), Two-Stage Smoothing (TwoStage), Absolute Discounting Smoothing (AbsDis)

Query Generation
– All sections of patent documents
– Terms with document frequency (df) > 25% removed
– All 3- and 4-term combinations (a generation sketch follows)
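A minimal sketch of this query-generation step, under the assumption that `df` holds per-term document frequencies and `n_docs` the collection size (these names are hypothetical); note that the full combination set grows combinatorially, as the exhaustive setup implies.

```python
from itertools import combinations
from typing import Dict, List, Set

def generate_queries(doc_terms: List[str], df: Dict[str, int],
                     n_docs: int, max_df_ratio: float = 0.25) -> Set[str]:
    """All 3- and 4-term combinations of a document's terms, after dropping
    terms whose document frequency exceeds max_df_ratio of the collection."""
    terms = sorted({t for t in doc_terms if df.get(t, 0) / n_docs <= max_df_ratio})
    queries: Set[str] = set()
    for k in (3, 4):
        for combo in combinations(terms, k):   # C(|terms|, k) combinations
            queries.add(" ".join(combo))
    return queries
```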

Page 11

Setup for Experiments

(Figures: documents of TREC-CRT, ATNews, DentPat, and ChemAppPat ordered by increasing vocabulary size)

Page 12

Outline

Introduction to Retrievability Measure

Setup for Experiments

Findability Scoring Functions

Relationship between Findability and Query Characteristics

Relationship between Findability and Document Features

Relationship between Findability and Effectiveness Measures

Page 13

Findability Scoring Functions

Standard Findability Scoring Function
– Does not consider the difference in documents' vocabulary size
– Biased towards long documents
– With r(d), Doc2 has higher Findability than Doc5
– But, due to its small vocabulary size, Doc5 cannot generate as large a query subset

All 3-term combinations — Findability percentage:
Doc2 = 3600/6545 = 0.55
Doc5 = 90/120 = 0.75

Page 14

Findability Scoring Functions

Normalize Findability
– Normalize r(d) relative to the number of queries generated from d
– This accounts for the difference in document lengths
– r^(d) = r(d) divided by the number of queries generated from d
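A minimal sketch of the normalization, assuming `r` holds the raw r(d) scores and `queries_per_doc` the number of queries generated from each document (hypothetical names):

```python
from typing import Dict

def normalized_retrievability(r: Dict[str, int],
                              queries_per_doc: Dict[str, int]) -> Dict[str, float]:
    """r^(d): divide r(d) by the number of queries generated from d,
    compensating for differences in document length / vocabulary size."""
    return {d: r[d] / queries_per_doc[d] if queries_per_doc.get(d) else 0.0
            for d in r}
```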

Page 15

Findability Scoring Functions

Comparison between r(d) and r^(d)
– Retrieval Models ordered by Gini-Coefficients (Retrieval Bias)
– Findability Ranks of Documents

Page 16

Findability Scoring Functions

Correlation between r(d) and r^(d) in terms of Gini-Coefficients (Retrieval Models ordered by r(d) and by r^(d))

TREC-CRT (c=100)
Retrieval Model   r(d)   |   Retrieval Model   r^(d)
BM25              0.48   |   DirS              0.69
TwoStage          0.49   |   AbsDis            0.69
DirS              0.51   |   JM                0.69
AbsDis            0.56   |   BM25              0.71
NormTFIDF         0.57   |   TwoStage          0.72
JM                0.59   |   NormTFIDF         0.72
TFIDF             0.78   |   TFIDF             0.94
SMART             0.92   |   SMART             0.95

ChemAppPat (c=10)
Retrieval Model   r(d)   |   Retrieval Model   r^(d)
BM25              0.33   |   JM                0.37
AbsDis            0.34   |   BM25              0.38
DirS              0.36   |   AbsDis            0.38
TwoStage          0.37   |   DirS              0.39
JM                0.39   |   TwoStage          0.42
NormTFIDF         0.40   |   NormTFIDF         0.42
TFIDF             0.47   |   TFIDF             0.56
SMART             0.85   |   SMART             0.56

Page 17

Findability Scoring Functions

Correlation between r(d) and r^(d) in terms of documents' Findability Ranks

– TREC-CRT and ATNews
• The correlation between r(d) and r^(d) is low (high difference)
• Due to the large difference between document lengths
– ChemAppPat and DentPat
• The correlation between r(d) and r^(d) is high (low difference)
• Due to the small difference between document lengths

Correlation between r(d) and r^(d)

Back

Page 18

Findability Scoring Functions

Which Findability Function is better, r(d) or r^(d)?
– From the Gini-Coefficient alone it is difficult to decide

Documents are ordered by findability score and then partitioned into 30 buckets
(Bucket 1, Bucket 2, …, Bucket 30)
Low Findability Buckets <---------------------------------> High Findability Buckets

From each bucket: 40 random docs (known items)
One query per document, of length 4–6 terms

The goal is to retrieve each known item using its own query
Effectiveness over the known items is measured through Mean Reciprocal Rank (MRR)

Low MRR Effectiveness <-------------Expected Results-------> High MRR Effectiveness
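A sketch of this known-item evaluation; `make_query` (building a 4–6 term query from a document) and `run_query` (returning a ranked list of document ids) are assumed helpers, not part of the original setup.

```python
import random
from typing import Callable, Dict, List, Sequence

def bucket_mrr(docs_by_findability: Sequence[str], n_buckets: int,
               sample_size: int, make_query: Callable[[str], str],
               run_query: Callable[[str], List[str]], seed: int = 0) -> Dict[int, float]:
    """Partition findability-ordered docs into buckets, sample known items,
    and compute each bucket's Mean Reciprocal Rank for the items' own queries."""
    rng = random.Random(seed)
    bucket_size = len(docs_by_findability) // n_buckets
    mrr: Dict[int, float] = {}
    for b in range(n_buckets):
        bucket = list(docs_by_findability[b * bucket_size:(b + 1) * bucket_size])
        sample = rng.sample(bucket, min(sample_size, len(bucket)))
        reciprocal_ranks = []
        for d in sample:
            ranking = run_query(make_query(d))   # query built from d itself
            reciprocal_ranks.append(1.0 / (ranking.index(d) + 1) if d in ranking else 0.0)
        mrr[b] = sum(reciprocal_ranks) / len(reciprocal_ranks) if reciprocal_ranks else 0.0
    return mrr
```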

Page 19

Retrievability Scoring Functions

Which Findability Function is better, r(d) or r^(d)?
– Expected Results
• High findability buckets should have high effectiveness, since their documents are easier to find than those in low findability buckets
• Positive correlation with MRR

– The r^(d) buckets show a stronger positive correlation with MRR than the r(d) buckets

(Figures: correlation between Findability and MRR on TREC-CRT and ChemAppPat)

Page 20

Outline

Introduction to Findability Measure

Setup for Experiments

Findability Scoring Functions

Relationship between Findability and Query Characteristics

Relationship between Findability and Document Features

Relationship between Findability and Effectiveness Measures

Page 21

Query Characteristics and Findability

Current Findability Analysis Style: Q = Query Set → Findability Scores of Documents → Gini-Coefficients

Queries do not have similar quality
Some queries are more specific (target oriented) than others
What is the effect of query quality on Findability?
Need to analyze Findability with different query-quality subsets

Creating Query Quality Subsets
– Supervised quality labels: we do not have supervised labels
– Query Characteristics (QC):
• Query Result List Size
• Query Term Frequencies in the Documents
• Query Quality based on Query Performance Prediction Methods

For each QC, the large query set is partitioned into 50 subsets


Page 22

Query Characteristics and Findability

Query Subsets with Query Quality

Q = Query Set

Query Quality is predicted with the Simplified Clarity Score (SCS) [He & Ounis, SPIRE 2004]

Q ordered by SCS score

And Partitioned into 50 Subsets

Query Subset 1 = Findability Analysis

Query Subset 2 = Findability Analysis

Query Subset 50 = Findability Analysis

(Figure: TREC-CRT collection)

• X-Axis = Query Subsets ordered by Low SCS score to High SCS score

• Y-Axis = Gini-Coefficients

• Low SCS scores Subsets = High Gini-Coefficients

• High SCS scores Subsets = Low Gini-Coefficients
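For reference, a small sketch of the Simplified Clarity Score as it is commonly formulated (query-term probability against collection probability); this follows the He & Ounis definition as I understand it and is not taken from the slides.

```python
import math
from collections import Counter
from typing import Dict, List

def simplified_clarity_score(query_terms: List[str],
                             collection_tf: Dict[str, int],
                             collection_tokens: int) -> float:
    """SCS(q) = sum_t P(t|q) * log2(P(t|q) / P(t|C)), where P(t|q) is the term's
    relative frequency in the query and P(t|C) its relative frequency in the
    collection. Terms absent from the collection are skipped in this sketch."""
    qtf = Counter(query_terms)
    scs = 0.0
    for t, f in qtf.items():
        cf = collection_tf.get(t, 0)
        if cf == 0:
            continue
        p_q = f / len(query_terms)
        p_c = cf / collection_tokens
        scs += p_q * math.log2(p_q / p_c)
    return scs
```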


Page 23

Outline

Introduction to Findability Measure

Setup for Experiments

Retrievability Scoring Functions

Relationship between Findability and Query Characteristics

Relationship between Findability and Document Features

Relationship between Findability and Effectiveness Measures

Page 24

Document Features and Findability

Findability analysis requires query processing over an exhaustive query set: large processing time and large computation resources

Relationship between Document Features and Findability Scores
– Can we predict Findability without processing an exhaustive set of queries?
– Does not require heavy processing
– Only predicts Findability Ranks
– Cannot predict Gini-Coefficients

Page 25

Document Features and Findability

The following three classes of document features are considered

– Surface Level Features
• Based on term frequencies within documents and term document frequencies within the collection
– Features based on Term Weights
• Based on the term weighting strategy of the retrieval model
– Density around Nearest Neighbors of Documents
• Based on the density around the nearest neighbors of documents

Page 26

Document Features and Findability

Surface Level Features

#   Feature           Description
1   NATF              Average of the normalized term frequencies of a document
2   freq              Frequency of the highly frequent terms of a document (tf_t,d / |d| > 0.03)
3   NATF_freq         NATF computed only over the frequent terms of the (freq) feature
4   GC_terms          Term-frequency inequality between the terms of a document
5   freq_GC           Number of terms of a document that raise the GC_terms score above 0.25
6   ADF               Average document frequency of a document's terms
7   freq_low_df       Frequency of terms of a document whose document frequency is below 5% of the collection size
8   ADF_freq          ADF computed only over the freq_low_df terms
9   Document Length   Document length
10  Vocabulary Size   Total number of unique terms

Page 27

Document Features and Findability

(Figures: TREC-CRT, ChemAppPat)

Page 28

Document Features and Findability

Combining Multiple Features
– No single feature performs best for all collections and all retrieval models
– Worth analyzing to what extent combining multiple features increases the correlation
– Regression Tree, 50%/50% training/testing split (see the sketch below)

(Table: correlation when combining multiple features vs. correlation of the best single feature, and the % increase in correlation)
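A hedged sketch of the feature-combination step, using scikit-learn's DecisionTreeRegressor as a stand-in for the regression tree and Spearman rank correlation (the slides do not specify the correlation type) over a 50%/50% split:

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

def combined_feature_correlation(X: np.ndarray, findability: np.ndarray,
                                 seed: int = 0) -> float:
    """X: one row of document features per document; findability: the measured
    r^(d) scores. Train on half the documents, predict the other half, and
    return the rank correlation between predicted and measured findability."""
    X_tr, X_te, y_tr, y_te = train_test_split(X, findability,
                                              test_size=0.5, random_state=seed)
    model = DecisionTreeRegressor(random_state=seed).fit(X_tr, y_tr)
    corr, _ = spearmanr(model.predict(X_te), y_te)
    return corr
```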

Page 29

Outline

Introduction to Findability Measure

Setup for Experiments

Findability Scoring Functions

Relationship between Findability and Query Characteristics

Relationship between Findability and Document Features

Relationship between Findability and Effectiveness Measures

Page 30

Relationship between Findability and Effectiveness

Findability Measure
– Goal: maximizing Findability
– Does not need Relevance Judgments

Effectiveness Measures (Recall, Precision, MAP)
– Goal: maximizing Effectiveness
– Depends upon Relevance Judgments

Does any relationship exist between the two?
If a relationship exists: maximizing Findability -> maximizing Effectiveness
– Automatic ranking of Retrieval Models
– Tuning/increasing Retrieval Model effectiveness on the basis of the Findability Measure

Page 31

Relationship between Findability and Effectiveness

Retrieval Models
– Standard Retrieval Models and Language Models
– Low-level IR features (tf, idf, doc length, vocabulary size, collection frequency)
– Term-proximity-based Retrieval Models (features listed below)

#  Feature                Description
1  f1 (SumMinDist)        Sum of the minimum distances of all query term pairs
2  f2 (SumMaxDist)        Sum of the maximum distances of all query term pairs
3  f3 (AvgDist)           Average of the sum of all query term pair distances in the document
4  f4 (MinDistCount)      Frequency of query term pairs with a minimum distance of less than 4 terms
5  f5 (AvgPairDist)       Similar to f3; the average of the sum of distances between all query term pairs and all single query terms
6  f6 (CoOccurrence)      Frequency of co-occurrence of query term pairs within a window of less than 4 terms
7  f7 (PairCoOccurrence)  Frequency of co-occurrence of query term pairs with single query terms within a window of less than 10 terms
8  f8 (MinCover)          Shortest text segment in the document that covers all query terms at least once
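As an illustrative sketch (not the authors' implementation), two of these features, f1 (SumMinDist) and f8 (MinCover), could be computed over a tokenized document as follows:

```python
from itertools import combinations
from typing import Dict, List

def _positions(doc_tokens: List[str], query_terms: List[str]) -> Dict[str, List[int]]:
    """Token positions of each query term in the document."""
    pos: Dict[str, List[int]] = {t: [] for t in set(query_terms)}
    for i, tok in enumerate(doc_tokens):
        if tok in pos:
            pos[tok].append(i)
    return pos

def sum_min_dist(doc_tokens: List[str], query_terms: List[str]) -> int:
    """f1 (SumMinDist): sum over all query term pairs of their minimum distance."""
    pos = _positions(doc_tokens, query_terms)
    total = 0
    for a, b in combinations(sorted(pos), 2):
        if pos[a] and pos[b]:
            total += min(abs(i - j) for i in pos[a] for j in pos[b])
    return total

def min_cover(doc_tokens: List[str], query_terms: List[str]) -> int:
    """f8 (MinCover): length of the shortest window covering every query term
    that occurs in the document at least once (sliding-window scan)."""
    needed = set(query_terms) & set(doc_tokens)
    if not needed:
        return 0
    best, counts, left = len(doc_tokens) + 1, {}, 0
    for right, tok in enumerate(doc_tokens):
        if tok in needed:
            counts[tok] = counts.get(tok, 0) + 1
        while len(counts) == len(needed):
            best = min(best, right - left + 1)
            t = doc_tokens[left]
            if t in counts:
                counts[t] -= 1
                if counts[t] == 0:
                    del counts[t]
            left += 1
    return best
```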

Page 32

#    Feature             Gini-Coefficient (c=100)   |   #    Feature             Recall@100
1    PairCoOccurrence    0.39                       |   1    JM                  0.184
2    MinDistCount        0.45                       |   2    DirS                0.177
3    CoOccurrence        0.49                       |   3    TwoStage            0.174
4    BM25                0.52                       |   4    AbsDis              0.170
5    SumMinDist          0.55                       |   5    MinDistCount        0.156
6    TwoStage            0.56                       |   6    BM25                0.156
7    DirS                0.57                       |   7    CoOccurrence        0.147
8    MinCover            0.58                       |   8    PairCoOccurrence    0.139
9    AbsDis              0.60                       |   9    AvgPairDist         0.134
10   JM                  0.62                       |   10   MinCover            0.130
11   NormTFIDF           0.62                       |   11   SumMinDist          0.126
12   ntf(d,q)            0.63                       |   12   ntf(d,q)            0.107
13   AvgPairDist         0.66                       |   13   SumMaxDist          0.107
14   AvgDist             0.67                       |   14   AvgDist             0.106
15   SumMaxDist          0.68                       |   15   NormTFIDF           0.082
16   |d|                 0.74                       |   16   SMART               0.074
17   sdf(d,q)            0.85                       |   17   sdf(d,q)            0.042
18   scf(d,q)            0.85                       |   18   tf(d,q)             0.016
19   TFIDF               0.91                       |   19   TFIDF               0.008
20   tf(d,q)             0.92                       |   20   scf(d,q)            0.002
21   SMART               0.93                       |   21   |d|                 0.001
22   Td                  0.99                       |   22   Td                  0.001

Page 33

Relationship between Findability and Effectiveness

Correlation = 0.80, 0.75, 0.80, 0.73

A correlation exists: it is not perfect, but retrieval models with low retrieval bias consistently appear in at least the top half of the effectiveness ranking

Page 34

Relationship between Findability and Effectiveness

Tuning Parameter Values over Findability
– Retrieval Models contain parameters
– These control query term normalization or smooth the document relevance score for unseen query terms
– We tune the parameter values over findability
– And examine the effect on the Gini-Coefficient and on Recall/Precision/MAP (a sketch of such a sweep follows)
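A minimal sketch of such a tuning loop for BM25's length-normalization parameter b; `gini_for_b` is an assumed helper (not part of the original setup) that re-scores the collection with the given b, runs the query set, and returns the Gini coefficient of the findability scores.

```python
from typing import Callable, Iterable, Tuple

def tune_b_over_findability(b_values: Iterable[float],
                            gini_for_b: Callable[[float], float]) -> Tuple[float, float]:
    """Pick the b value that minimizes retrieval bias (the Gini coefficient of
    the findability scores); Recall/Precision/MAP are then inspected separately."""
    best_b, best_gini = None, float("inf")
    for b in b_values:
        g = gini_for_b(b)                 # e.g., retrievability() + gini() from Page 7
        if g < best_gini:
            best_b, best_gini = b, g
    return best_b, best_gini

# usage sketch: b swept between 0 and 1, as on the next slide
# best_b, best_gini = tune_b_over_findability([i / 10 for i in range(11)], gini_for_b)
```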

Page 35

Relationship between Findability and Effectiveness

Parameter b values are varied between 0 and 1

Page 36

Relationship between Findability and Effectiveness

For JM, the smoothing parameter values are varied between 0 and 1

Page 37

Relationship between Findability and Effectiveness

Evolving Retrieval Models using Genetic Programming and Findability
– Genetic Programming is a branch of soft computing
– Helps to solve problems with exhaustive search spaces

Genetic Programming loop:
Retrieval Features → randomly combine IR features (initial population) → select the best Retrieval Models (Findability Measure as fitness) → Recombination (Crossover, Mutation) → next generation → repeat until 100 generations are complete

Page 38

Relationship between Findability and Effectiveness

Evolving Retrieval Model using Genetic Programming and Findability

– Solutions (Retrieval Models) are represented as tree structures
– Tree nodes are either operators (+, /, *) or ranking features
– Ranking Features
• Low-level retrieval features
• Term-proximity-based retrieval features
• Constant values (0.1 to 1)
– 100 generations are evolved with 50 solutions per generation (a compact sketch follows)
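A compact, hypothetical sketch of this evolutionary loop (mutation omitted for brevity; `FEATURES` and `gini_of` are assumed placeholders, and the real system's tree handling is certainly richer):

```python
import random
from typing import Callable, List, Union

Tree = Union[str, list]  # leaf = feature/constant name, node = [op, left, right]
OPS = ["+", "*", "/"]
FEATURES = ["tf", "idf", "doc_len", "SumMinDist", "CoOccurrence", "0.5"]  # assumed

def random_tree(depth: int) -> Tree:
    """Grow a random ranking-formula tree of operators and ranking features."""
    if depth == 0 or random.random() < 0.3:
        return random.choice(FEATURES)
    return [random.choice(OPS), random_tree(depth - 1), random_tree(depth - 1)]

def crossover(a: Tree, b: Tree) -> Tree:
    """Tiny crossover: replace one child of a with a subtree of b."""
    if not isinstance(a, list):
        return b
    child = [a[0], a[1], a[2]]
    child[random.choice([1, 2])] = random.choice(b[1:]) if isinstance(b, list) else b
    return child

def evolve(gini_of: Callable[[Tree], float], generations: int = 100,
           pop_size: int = 50) -> Tree:
    """Keep the half of each generation with the lowest Gini (least retrieval
    bias) as survivors and refill the population by crossover."""
    pop = [random_tree(3) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=gini_of)                       # lower Gini = fitter
        survivors = pop[: pop_size // 2]
        children = [crossover(random.choice(survivors), random.choice(survivors))
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return min(pop, key=gini_of)
```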

Page 39

Relationship between Findability and Effectiveness

Evolving Retrieval Model using Genetic Programming and Findability

– Two correlation analyses are tested

– (1) Relationship between Findability and Effectiveness on the basis of fittest individual of each generation

– (2) Relationship between Findability and Effectiveness on the basis of average fitness of each generation

Page 40

Relationship between Findability and Effectiveness

Evolving Retrieval Models using Genetic Programming and Findability
– (First): Relationship between Findability and Effectiveness on the basis of the fittest individual of each generation

Page 41

Relationship between Findability and Effectiveness

Evolving Retrieval Models using Genetic Programming and Findability
– (Second): Relationship between Findability and Effectiveness on the basis of the average fitness of each generation
– Generations with a low average Gini-Coefficient also have high effectiveness in terms of Recall@100

Page 42

Conclusions

Findability considers all documents, not a small set of judged documents

We propose a normalized findability scoring function that produces better findability ranks of documents

Analysis of findability and query characteristics
– Different ranges of query characteristics have different retrieval bias

Analysis of findability and document features
– Suitable for predicting document findability ranks

Relationship between findability and effectiveness
– Findability can be used for automatic ranking of retrieval models
– And to fine-tune IR systems in an unsupervised manner

Page 43

Future Work

Query Popularity and Findability
– We are not differentiating between popular and unpopular queries

Visualizing Findability
– Documents that are highly findable with one model
– Documents that are highly findable with multiple models
– Documents that are not findable with any model

Effect of Retrieval Bias in k-Nearest Neighbor classification
– Highly findable samples also affect the classification voting in k-NN

Page 44

Thank You

Page 45

Page 46

Gini-Coefficient

Gini-Coefficient calculates retrievability inequality between documents.

Also represents retrieval bias and provides a bird's-eye view.
If G = 0 there is no bias; if G = 1 only one document is findable and all other documents have r(d) = 0.

Document           r(d) with RS1   r(d) with RS2
D1                 2               9
D2                 0               7
D3                 6               12
D4                 5               14
D5                 34              18
D6                 4               11
D7                 39              19
Gini-Coefficient   0.58            0.18
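As a quick check of the table, the same numbers can be reproduced with the standard Gini formula over the sorted scores (mirroring the sketch on Page 7):

```python
def gini(xs):
    """G = sum_i (2i - n - 1) * x_(i) / (n * sum(x)), with x sorted ascending."""
    xs = sorted(xs)
    n, total = len(xs), sum(xs)
    return sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1)) / (n * total)

print(round(gini([2, 0, 6, 5, 34, 4, 39]), 2))    # RS1 -> 0.58
print(round(gini([9, 7, 12, 14, 18, 11, 19]), 2))  # RS2 -> 0.18
```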

Back

Page 47

Findability Scoring Functions

(Figures: ChemAppPat, TREC-CRT)

Page 48

Document Features and Retrievability

Features based on Term Weights
– Document terms are weighted by the retrieval model
– The terms are then added into inverted lists
– Term weights in the inverted lists are sorted by decreasing score

#  Feature            Description
1  ATW                Average of the term weights of a document
2  ATRP               Average of the term rank positions in the inverted lists
3  VTRP               Variance of the term rank positions in the inverted lists
4  DiffMedianWeights  Average difference between a document's term weights and the median term weight
5  LowRankRatio       How many terms of a document appear in the top 200 rank positions of the sorted inverted lists

Page 49

Document Features and Retrievability

On highly skewed collections, these features correlate well with findability.
On less skewed collections they do not; this may be because, in less skewed collections, the term weights of documents are less extreme due to the nearly similar document lengths.

(Figures: TREC-CRT, ChemAppPat)

Back

Page 50

Document Features and Retrievability

Document Density based Features
– These features are based on the average density of the k-nearest neighbors of a document
– k is set to 50, 100, and 150
– Density is computed both with all terms of a document and with its top 40 (highest-frequency) terms

#  Feature                         Description
1  AvgDensity (k=50)               Average density with the 50 nearest neighbors
2  AvgDensity (k=100)              Average density with the 100 nearest neighbors
3  AvgDensity (k=150)              Average density with the 150 nearest neighbors
4  AvgDensity-Top40Terms (k=50)    Average density with the 50 nearest neighbors, using the top 40 terms
5  AvgDensity-Top40Terms (k=100)   Average density with the 100 nearest neighbors, using the top 40 terms
6  AvgDensity-Top40Terms (k=150)   Average density with the 150 nearest neighbors, using the top 40 terms

Page 51

Query Expansion and Retrievability

Query Expansion methods are investigated for improving retrievability.
Terms for expansion are identified via Pseudo-Relevance Feedback (PRF).
Baseline results are promising (PRF selection with the top-N docs).
We further propose two PRF selection approaches
– Based on Document Clustering
– Based on the similarity of the retrieved documents with the Query Patent

(Diagram) q = Query → Process Query → Ranked List (D9, D4, D3, D5, D11, D2, D1, D14)
Pseudo-Relevance Feedback: extract expansion terms (E) from {D9, D4, D3, D5}
q ∪ E → Query Expansion → Process Query again
PRF selection strategies: 1. Top-N, 2. Document Clustering, 3. Query Patent Similarity

Page 52

Query Expansion and Retrievability

Baseline Approaches
– Both approaches rely on the top documents of the query result list for PRF
– Query Expansion based on Language Modeling
• Expansion terms are ranked by the sum of divergences between the documents they occur in and the importance of the terms in the whole collection
– Query Expansion based on Kullback-Leibler Divergence
• Expansion terms are ranked by the relative rareness of terms in the PRF set as opposed to the whole collection (a scoring sketch follows)
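For the Kullback-Leibler baseline, a hedged sketch of the commonly used term-scoring formulation (the exact variant used in the thesis may differ): score(t) = P(t|PRF) · log(P(t|PRF) / P(t|C)), favouring terms frequent in the PRF set but rare in the collection.

```python
import math
from collections import Counter
from typing import Dict, List

def kld_expansion_terms(prf_docs: List[List[str]],
                        collection_tf: Dict[str, int],
                        collection_tokens: int, n_terms: int = 10) -> List[str]:
    """Rank candidate expansion terms by P(t|PRF) * log(P(t|PRF) / P(t|C))."""
    prf_tf = Counter(t for doc in prf_docs for t in doc)
    prf_tokens = sum(prf_tf.values())
    scores = {}
    for t, f in prf_tf.items():
        p_prf = f / prf_tokens
        p_coll = max(collection_tf.get(t, 0), 1) / collection_tokens  # floor avoids log(inf)
        scores[t] = p_prf * math.log(p_prf / p_coll)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_terms]
```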

Page 53

Query Expansion and Retrievability

(Figures: TREC-CRT, ChemAppPat)

Page 54

Query Expansion and Retrievability

(Diagram) PRF selection via Document Clustering:
q = Query → Process Query → Ranked List (D9, D4, D3, D5, D11, D2, D1, D14)
Clustering with the top-N docs — Doc (cluster members):
D9 ( ), D4 (D9, D3), D3 (D9, D4, D5), D5 (D9, D3), D11 (D3, D5), D2 (D9, D4, D3, D5), D1 (D9, D4, D3, D5), D14 (D4)
Sort docs on cluster size: D2, D1, D3, D4, D5, D11, D14, D9
Pseudo-Relevance Feedback: extract expansion terms (E) from {D2, D1, D3, D4}
q ∪ E → Query Expansion → Process Query again

Page 55

Query Expansion and Retrievability

(Diagram repeated from Page 54: ranked list clustered with the top-N docs and sorted by cluster size)

Constructing Clusters
– Offline cluster construction, to avoid large processing time
– Each document forms its own cluster with other documents using k-Nearest Neighbors

Page 56

Query Expansion and Retrievability

PRF Selection via Query Patent Similarity
– In prior-art search, patent examiners usually extract query terms from a given query patent
– Due to the complex structure of patent documents, finding relevant keywords is always a difficult problem
– Missing terms are another problem
– Query Expansion can help to overcome this problem
– Query Expansion depends on the PRF documents
– PRF documents are ranked on the basis of their similarity to the Query Patent

Page 57

Query Expansion and Retrievability

(Diagram) PRF selection via Query Patent Similarity:
q = Query → Process Query → Ranked List (D9, D4, D3, D5, D11, D2, D1, D14)
Rank the retrieved docs based on their similarity with the Query Patent: D1, D3, D5, D2, D11, D9, D14, D4
Pseudo-Relevance Feedback: extract expansion terms (E) from {D1, D3, D5, D2}
q ∪ E → Query Expansion → Process Query again

Page 58

Query Expansion and Retrievability

The full Query Patent can be used for ranking PRF documents.
– A full Query Patent may contain thousands of terms, which could be distributed over documents not relevant to the query.

How to identify the best terms from the query patent?
We try to separate the good terms from the bad terms using term classification.

(Diagram repeated from Page 57: retrieved docs ranked by similarity with the Query Patent)

Page 59

Query Expansion and Retrievability

Training Dataset for Term Classification

– 30 random prior-art (PA) topics from TREC-CRT
– From each of the 30 PA topics, short queries of length 4 (based on high TF) are used as search queries
– Baseline score: for each query, PRF documents are ranked according to the query relevance scores
• The resulting effectiveness scores are used as the baseline score

Page 60

Query Expansion and Retrievability

Identifying good, neutral, and bad terms (diagram):
– Baseline: for each training query q, Process Query → Pseudo-Relevance Feedback → Query Expansion (q ∪ E) → Process Query → Baseline Score
– For each unique term T of the Query Patent (T1, T2, T3, T4, …, Tn): process q ∪ T → Pseudo-Relevance Feedback → Query Expansion → Process Query → (q ∪ T) Score
– If (q ∪ T) Score > Baseline Score, then T = good term
– If (q ∪ T) Score = Baseline Score, then T = neutral term
– If (q ∪ T) Score < Baseline Score, then T = bad term

Page 61

Query Expansion and Retrievability

PRF Selection via Query Patent Similarity
– Terms are classified (predicted) using term features
– Features are extracted from the Query Patents, based on the proximity distribution of the expansion term (T) and the query terms
– J48 is used for classification
– The overall accuracy on positively classified samples is 83%

Page 62

Query Expansion and Retrievability

PRF Selection via Query Patent Similarity
– Results on the TREC-CRT collection
– CCGen = PRF selection using Clustering
– QP-TS = PRF selection using Query Patent Similarity

Page 63

Query Expansion and Retrievability

PRF Selection via Query Patent Similarity
– Results on the TREC-CRT collection
– CCGen = PRF selection using Clustering
– QP-TS = PRF selection using Query Patent Similarity

Page 64

Document Features and Retrievability

The positive correlation indicates that low-retrievable docs mostly lie in low-density areas.
Their nearest neighbors have mostly similar term weights, which makes them low retrievable.

On highly skewed collections, these features correlate well; on less skewed collections they do not. This may be because, in less skewed collections, the term weights of documents are less extreme due to the similar document lengths.

(Figures: TREC-CRT, ChemAppPat)

Back