Evaluating Query-Independent Object Features for Relevancy Prediction Andres R. Masegosa 1 , Hideo Joho 2 , Joemon Jose 2 1 Department of Computer Science and A.I., University of Granada, Spain 2 Department of Computing Science, University of Glasgow, UK. ECIR’07: Rome, 5th April 2007


Page 1: Evaluating query-independent object features for relevancy prediction

Evaluating Query-Independent Object

Features for Relevancy Prediction

Andres R. Masegosa1, Hideo Joho2, Joemon Jose2

1 Department of Computer Science and A.I., University of Granada, Spain 2 Department of Computing Science, University of Glasgow, UK.

ECIR’07: Rome, 5th April 2007

Page 2

Outline

1. Introduction

2. Methodology

2.1. Conceptual Categories of Object Features

2.2. Probabilistic Classification Approach

2.3. Feature Selection and Validation Scheme

3. Experiments

3.1. Effect of Contextual Features

3.2. Effect of Feature Selection and Combination

3.3. Effect of Highly Relevant Documents

4. Discussion

5. Conclusions and Future Work

Page 3

Background

IR in contexts

To gain further improvements in retrieval effectiveness and user experience

To make an IR system more adaptive to a search environment (i.e., context-aware)

Many potential contexts proposed

Work task, searcher, interaction, system, document, environment, temporality, etc. (See Ingwersen and Järvelin, 2005)

How can we effectively find significant contexts and use them in IR?

Page 4

Background (cont’d)

Contexts are abstract

We need tangible variables that can represent a context to offer context-aware retrieval and user support in interactive IR.

How can we determine which tangible variables are effective to represent a context?

Many potential contexts

Many potential tangible variables

Page 5

Our approach

Machine learning techniques as a diagnostic tool

Group click-through documents to represent contexts

Train classifiers with candidate variables to predict document relevancy

Prediction accuracy as a measure of variables’ effectiveness

Click-through documents and relevance judgements were collected from a user study

More details later

Page 6

Our assumptions

If a context is significant, the effect of the context can be represented in the relevance of retrieved documents.

If a variable is effective, it can increase the power of discriminating relevant documents from non-relevant ones in a context.

Page 7

Focus of this study

As a preliminary study of our approach, we focused on

Topics as an example of contexts

Query-independent document features as candidate variables

Investigation of other contexts and variables is under way

But not presented here

Page 8

Methodology I:

Introduction

Around 150 features were extracted as candidate features for relevancy prediction.

Based on informal experiments and a literature survey.

Features such as the number of digits, number of words, number of bold tags, number of links, PageRank value, etc.

Grouped into four independent functional groups based on their role in a document.

Page 9

Methodology II:

Conceptual Categories of Object Features

A. Document Textual Features (DOC): 14 features.

B. Visual/Layout Features:

B.1. Visual Appearance (V-VS): 28 features.

B.2. Visual HTML Tags (V-TG): 27 features.

B.3. Visual HTML Attributes (V-AT): 16 features.

C. Structural Features (STR): 18 features.

D. Other Selective Features:

D.1. Selective Words in anchor texts (O-AC): 11 features.

D.2. Selective Words in document (O-WD): 11 features.

D.3. Selective HTML tags (O-TG): 7 features.

D.4. Selective HTML attributes (O-AT): 16 features.

Page 10

Methodology III:

Probabilistic Classification Approach

P ( R | Word1, Word2, …, Wordn )

Topic: “IR related papers”

Word statistics estimated from the document corpus (documents as instances, words as features):

Word       Rel    non-Rel
“IR”       96%    4%
“office”   16%    84%
…

Example: P ( R | “IR”, no “office”, … ) = (0.99, 0.01), so the document is predicted Relevant.
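The computation on this slide can be sketched as a minimal Naive Bayes posterior in Python. The word likelihoods P(word | class) below are illustrative assumptions chosen to reproduce the slide’s (0.99, 0.01), not the study’s actual estimates:

```python
def nb_posterior(priors, likelihoods, observed):
    """Naive Bayes posterior over classes.
    priors: {class: P(class)}
    likelihoods: {class: {feature: P(feature present | class)}}
    observed: {feature: 1 if present, 0 if absent}
    Returns the normalised distribution P(class | observed)."""
    scores = {}
    for c, prior in priors.items():
        p = prior
        for f, present in observed.items():
            pf = likelihoods[c][f]
            # Features are assumed conditionally independent given the class.
            p *= pf if present else (1.0 - pf)
        scores[c] = p
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

priors = {"R": 0.5, "NR": 0.5}
likelihoods = {                          # hypothetical P(word | class)
    "R":  {"IR": 0.96, "office": 0.16},
    "NR": {"IR": 0.04, "office": 0.84},
}
# Document contains "IR" but not "office":
post = nb_posterior(priors, likelihoods, {"IR": 1, "office": 0})
# post["R"] ≈ 0.99, post["NR"] ≈ 0.01 — predicted Relevant.
```

The same scheme applies unchanged when the features are the query-independent object features of this study rather than words.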

Page 11

Methodology IV:

Classifiers Used

Four different classifiers were used.

They estimate the probability distribution in different ways (changing the assumptions).

Classifiers:

Naive Bayes: features are independent.

AODE: a single dependency among features.

HNB: assumes a hidden variable.

K2-MDL: learns a general probabilistic network.

Page 12

Methodology V:

Feature Selection Scheme

[Diagram: wrapper feature selection. Candidate subsets of the features F1–F7 are passed to the classifier and scored; example subset scores: 78%, 57%, 63%.]

Page 13

Methodology VI:

Feature Selection Scheme

[Diagram: after further iterations (intermediate subset scores 77%, 69%), the selected subset {F3, F4, F5, F6, F7} scores 86%; F1 and F2 remain unselected.]

Page 14

Methodology VII:

Feature Selection Scheme

This FS method depends on the data set.

Changing the data set changes the selected feature set.

FS is applied 100 times, each time taking 90% of the instances of the data set.

A feature may or may not be selected in each run.

Each feature thus has a selection percentage, which is used as a confidence threshold.

The feature selection scheme is applied over each Conceptual Category.

[Diagram: selected features after thresholding: F3, F4, F5, F6, F7.]
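The repeated selection scheme above can be sketched as follows; this is a sketch assuming greedy forward selection as the wrapper and a caller-supplied scoring function (the study’s exact wrapper search is not specified on the slide):

```python
import random

def forward_select(features, score):
    """Greedy forward wrapper selection: repeatedly add the feature that
    most improves the classifier's score; stop when nothing improves it."""
    selected, best = [], score([])
    while True:
        remaining = [f for f in features if f not in selected]
        if not remaining:
            break
        cand = max(remaining, key=lambda f: score(selected + [f]))
        s = score(selected + [cand])
        if s <= best:
            break
        selected.append(cand)
        best = s
    return selected

def selection_frequency(features, data, score_on, runs=100, frac=0.9, seed=0):
    """Run forward selection `runs` times, each on a random 90% subsample,
    and return each feature's selection percentage — the confidence value
    against which a threshold (e.g. 90%) is applied."""
    rng = random.Random(seed)
    counts = {f: 0 for f in features}
    for _ in range(runs):
        sample = rng.sample(data, int(frac * len(data)))
        for f in forward_select(features, lambda fs: score_on(fs, sample)):
            counts[f] += 1
    return {f: n / runs for f, n in counts.items()}
```

Per the scheme above, only features whose frequency exceeds the chosen threshold would be retained, and the whole procedure is repeated per Conceptual Category.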

Page 15

Methodology VIII:

Validation Scheme

A data set of instances (F1, F2, F3, F4, Relevance) is split into a training set and a test set. A classifier trained on the training set predicts the relevance of each test instance:

F1, F2, F3, F4, Relevance, Predicted
 1,  0,  1,  2, NR, R    (wrong)
12,  0,  1,  2, R,  NR   (wrong)
12,  0,  1,  2, NR, NR   (right)
12,  0,  1,  2, R,  R    (right)
12,  0,  1,  2, R,  R    (right)
12,  0,  1,  2, NR, R    (wrong)
12,  0,  1,  2, R,  NR   (wrong)
12,  0,  1,  2, NR, NR   (right)
12,  0,  1,  2, NR, NR   (right)
12,  0,  1,  2, R,  R    (right)

Performance (prediction accuracy) = 6/10 = 60%

Page 16

Methodology IX:

Validation Scheme

The training set contains 90% of the data set and the test set the remaining 10%.

Random sampling is carried out 100 times.

The final estimated accuracy is the mean of these estimations.

That is, 10-fold cross-validation was repeated 10 times.
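The validation scheme can be sketched as repeated 90/10 random splits; `train` and `predict` below are hypothetical stand-ins for fitting and applying any of the four classifiers:

```python
import random

def repeated_holdout(data, labels, train, predict, runs=100, frac=0.9, seed=0):
    """Estimate prediction accuracy as the mean over `runs` random
    90%/10% train/test splits, as in the validation scheme above."""
    rng = random.Random(seed)
    idx = list(range(len(data)))
    accuracies = []
    for _ in range(runs):
        rng.shuffle(idx)
        cut = int(frac * len(idx))
        train_idx, test_idx = idx[:cut], idx[cut:]
        model = train([data[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        # Fraction of test instances whose relevance is predicted correctly.
        correct = sum(predict(model, data[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

A stratified 10-fold scheme would partition the data instead of resampling it, but the resulting estimate is comparable.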

Page 17

Experiments I:

Data Set

1038 click-through documents were extracted from a user study of 24 participants.

Each participant was given four topics and asked to bookmark perceived relevant documents.

375 were unique relevant and 362 were unique non-relevant.

Baseline performance was 50.9%.

In the topic-division data sets, the baselines were:

Topic 1: 50.0% (corrected from 60.6% by re-sampling)

Topic 2: 52.1%

Topic 3: 55.2%

Topic 4: 51.7%

Page 18

Experiments II:

Effect of Contextual Features

The table shows the relative improvement with respect to the baseline performance.

Significant improvements are shown in bold.

Page 19

Experiments III:

Effect of Feature Selection

Features selected more than 90% of the time.

Larger improvement on individual topics.

Page 20

Experiments IV:

Effect of Feature Combination

Taking the selected features in each category and using them all together.

Found to be stable across topics (except Topic 4).

Page 21

Experiments V:

Effect of Highly Relevant Documents

Highly Relevant Document: judged as relevant by at least two participants on the same topic.

Not a perfect definition, but found to yield a better data set for mining significant features.

Effect with the “combined features”:
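The “highly relevant” definition amounts to a vote count per (topic, document) pair. A minimal sketch, assuming judgements arrive as (topic, doc_id, participant) tuples (a hypothetical layout, not the study’s data format):

```python
from collections import Counter

def highly_relevant(relevant_judgements, min_judges=2):
    """relevant_judgements: (topic, doc_id, participant) tuples, one per
    'relevant' judgement. Returns the (topic, doc_id) pairs judged
    relevant by at least `min_judges` distinct participants."""
    # set() removes duplicate votes from the same participant.
    votes = Counter((topic, doc) for topic, doc, _ in set(relevant_judgements))
    return {pair for pair, n in votes.items() if n >= min_judges}
```

With `min_judges=2` this implements the slide’s definition; raising the threshold would trade data-set size for agreement.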

Page 22

Discussion I:

Effectiveness of QI Features

In the topic-independent set, textual features and visual/layout HTML attributes were significant.

The effectiveness of QI features varies across topics.

Feature selection and combination can improve prediction accuracy and robustness.

Highly relevant documents are a promising data set from which to elicit significant features.

Page 23

Discussion II:

Example of significant features

Page 24

Conclusion & Future Work

Presented an approach to mining significant contextual features.

Investigated the effectiveness of query-independent features for relevancy prediction.

Machine learning techniques allowed us to examine a large number of candidate features.

Disadvantage: significant features are sometimes difficult to interpret.

A promising approach to finding significant contextual features.

Page 25

Conclusion & Future Work

Implications of our study

Categorisation of search topics can facilitate the mining of significant contextual features.

Exploiting highly relevant documents can help find more robust contextual features.

Future work

Investigate other data (e.g., transaction logs, user feedback, etc.)

Develop adaptive search models for re-ranking, grouping search results, recommendation, etc.