15
Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR 2008 Presented by Jae-won Lee

Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Embed Size (px)

Citation preview

Page 1: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Enhancing Web Search by Promoting Mul-tiple Search Engine Use

Ryen W. W., Matthew R. Mikhail B. (Microsoft Research)

Allison P. H (Rice University)

SIGIR 2008

Presented by Jae-won Lee

Page 2: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Introduction

Users are generally loyal to one search engine Even though it may not satisfy their needs

A given search engine performs well for some queries and poorly for others Excessive loyalty can hinder search effectiveness

Our Goal Support engine switching by recommending the most effective

search engine for a given query

IDS Lab. Seminar - 2Center for E-Business Technology

Page 3: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Related Work

Meta search engines such as Clusty and Dogpile Merge search results

Switching search engine is more attractive Strong brand loyalty may discourage users from migrating to meta

search engine

Meta search engine eliminates the benefits of interface feature of each engine

Hurts source engine brand awareness

We lets users keep their default engine Suggest an alternative engine if it performs better for the current

query

IDS Lab. Seminar - 3Center for E-Business Technology

Page 4: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Does Switching Help Users?

Potential Benefits of Switching We quantify the benefits of multiple engine use

– Normalized Discounted Cumulative Gain (NDCG)

Measure a topical relevance of results to a given query

– Click-through rate for search results

Reasonable estimation of search result utility

NDCG

– A measure to evaluate the Web search engine performance

Where Ni : a constant for normalization

r(i) : relevance score of the ith result, 0 (bad) ~ 5 (perfect)

IDS Lab. Seminar - 4Center for E-Business Technology

Page 5: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Does Switching Help Users?

Potential Benefits of Switching (cont’d) X, Y, and Z are anonymous notations of Google, Yahoo, and Live

Search

Number of queries for which engine performs best

Engine choice for particular query is important

IDS Lab. Seminar - 5Center for E-Business Technology

Search engine Relevance (NDCG) Result click-through rate

X 952 (19.3%) 2,777 (56.4%)

Y 1,136 (23.1%) 1,226 (24.9%)

Z 789 (16.1%) 892 (18.1%)

No difference 2,044 (41.5%) 26 (0.6%)

Page 6: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Switching as Classification - Query Pro-cessing

IDS Lab. Seminar - 6Center for E-Business Technology

Query

FeatureExtractor

Classifier(offline Training)

Recommend a Search Engine

Features

Search Engines

Result sets

Page 7: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Classifier Features

Classifier must recommend an engine in real-time Derive features from result pages, a query and query-result match-

ing

Features Features from result pages

Features from the query

Features from the query-result page match

IDS Lab. Seminar - 7Center for E-Business Technology

Page 8: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Classifier Features

IDS Lab. Seminar - 8Center for E-Business Technology

Page 9: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Classifier

Notation q : query

R : result page of original engine

R’ : result page of target engine

R* = {(d1,s1), …, (dk,sk)} : Human-judged result set - dk(result

page), sk (score)

Utility of each engine : U(R) = NDCGR*(R) , U(R’) = NDCGR*(R’)

Training Each training instance D = {(x, y)}

– x = f(q, R, R’) ; comprised of features derived from the query and result page

– y = 1 iff NDCGR*(R’) >= NDCGR*(R) + margin

Switching engine if utility is higher by at least some margin

IDS Lab. Seminar - 9Center for E-Business Technology

Page 10: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Experiments

Evaluate accuracy of switching support to determine its viability

Data Set From Google, Yahoo, Live Search logs

IDS Lab. Seminar - 10Center for E-Business Technology

Total number of queries 17,111

Total number of judged pages 4,254,730

Total number of judged pages labeled Fair or higher 1,378,011

Page 11: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Precision – Recall Results

The proposed method can achieve high accuracy

Therefore, the method can be used for providing useful search engine suggestions to users

IDS Lab. Seminar - 11Center for E-Business Technology

Page 12: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Avoiding Querying the Alternative En-gine

Evaluating the utility of target engine is undesirable to some users due to the network traffic

So, only use the features from the current engine’s result pages

IDS Lab. Seminar - 12Center for E-Business Technology

Page 13: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Contribution of Features

All sets of features contribute to accuracy

Features obtained from result pages seems to provide the most benefit

IDS Lab. Seminar - 13Center for E-Business Technology

Page 14: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Conclusion

Demonstrated potential benefit of switching

Described a method for automatically determining when to switch engines for a given query

Evaluated the method and illustrated good performance, especially at usable recall

IDS Lab. Seminar - 14Center for E-Business Technology

Page 15: Enhancing Web Search by Promoting Multiple Search Engine Use Ryen W. W., Matthew R. Mikhail B. (Microsoft Research) Allison P. H (Rice University) SIGIR

Copyright 2008 by CEBT

Pros. & Cons.

Pros. Propose a new research area of IR by switching support

Good explanation for user behavior

Cons. Poor explanation for equations

No analysis for the experiment results

IDS Lab. Seminar - 15Center for E-Business Technology