Personalizing Web Search
Jaime Teevan, MIT, with Susan T. Dumais and Eric Horvitz, MSR


Page 1:

Personalizing Web Search

Jaime Teevan, MIT, with Susan T. Dumais and Eric Horvitz, MSR

Page 2:

Page 3:

Demo

Page 4:

Personalizing Web Search

Motivation
Algorithms
Results
Future Work


Page 6:

Study of Personal Relevancy

15 SIS users × ~10 queries each, evaluating the top 50 results per query

Ratings: Highly relevant / Relevant / Irrelevant

Query selection: a previously issued query, or chosen from 10 pre-selected queries

Collected evaluations for 137 queries, 53 of them for pre-selected queries (2–9 evaluations per query)

Page 7:

Relevant Results Have Low Rank

[Chart: counts of Highly Relevant, Relevant, and Irrelevant results as a function of rank, 1–50.]

Page 8:

Same Query, Different Intent

Different meanings: “Information about the astronomical/astrological sign of cancer” vs. “information about cancer treatments”

Different intents: “is there any new tests for cancer?” vs. “information about cancer treatments”

Page 9:

Same Intent, Different Evaluation

Query: Microsoft, with the stated intents “information about microsoft, the company”, “Things related to the Microsoft corporation”, and “Information on Microsoft Corp”

31 of the 50 results were rated as not irrelevant by someone, but more than one rater agreed on only 6 of the 31; all three raters agreed only on www.microsoft.com

Page 10:

More to Understand

Do people cluster, even if they can’t state their intention?

How are the differences reflected? Can they be seen from the information on a person’s computer?

Can we do better than the ranking that would make everyone the most happy?
Best common ranking: +38%
Best personalized ranking: +55%

Page 11:

Personalizing Web Search

Motivation
Algorithms
Results
Future Work

Page 12:

Personalization Algorithms

Standard IR, related to relevance feedback

Query expansion v. result re-ranking

[Diagram: document, query, and user representations, split across server and client.]

Page 13:

Result Re-Ranking

Takes full advantage of SIS
Ensures privacy
Good evaluation framework
Look at lightweight user models: collected on the server side, sent as query expansion
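As a concrete picture of where re-ranking sits, here is a minimal client-side sketch in Python. The result fields and the toy `personal_score` are illustrative assumptions; the actual scoring developed on the following slides is BM25-based.

```python
def personal_score(result: dict, user_terms: set) -> float:
    """Toy score: how many words of the result appear in the user's index."""
    text = (result["title"] + " " + result["snippet"]).lower()
    return sum(1.0 for word in text.split() if word in user_terms)

def re_rank(results: list, user_terms: set) -> list:
    """Re-order the server's results by the personalized score."""
    # sorted() is stable, so ties keep the server's original order.
    return sorted(results, key=lambda r: personal_score(r, user_terms),
                  reverse=True)
```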

Page 14:

BM25 with Relevance Feedback

Without feedback, the BM25 term weight is:

w_i = log(N / n_i)

Score = Σ tf_i * w_i

(N = documents in the corpus; n_i = documents containing term i)
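A minimal sketch of this weighting in Python; the dictionary-based term statistics are a representation chosen here for illustration, not a detail from the talk.

```python
import math

def bm25_weight(N: int, n_i: int) -> float:
    """Basic BM25 term weight without feedback: w_i = log(N / n_i)."""
    return math.log(N / n_i)

def score(tf: dict, N: int, n: dict) -> float:
    """Score = sum over terms of tf_i * w_i."""
    return sum(tf_i * bm25_weight(N, n[term])
               for term, tf_i in tf.items() if term in n)
```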

Page 15:

BM25 with Relevance Feedback

With R relevant documents known, of which r_i contain term i:

w_i = log [ (r_i + 0.5)(N − n_i − R + r_i + 0.5) / ((n_i − r_i + 0.5)(R − r_i + 0.5)) ]

Score = Σ tf_i * w_i
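The same weight with the feedback statistics folded in, transcribed directly from the slide's formula; the 0.5 terms are the usual smoothing so that zero counts don't blow up the log.

```python
import math

def rf_weight(N: int, n_i: int, R: int, r_i: int) -> float:
    """BM25 term weight with relevance feedback.

    N   : documents in the corpus
    n_i : corpus documents containing term i
    R   : known relevant documents
    r_i : relevant documents containing term i
    """
    numerator = (r_i + 0.5) * (N - n_i - R + r_i + 0.5)
    denominator = (n_i - r_i + 0.5) * (R - r_i + 0.5)
    return math.log(numerator / denominator)
```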

Page 16:

User Model as Relevance Feedback

Treat the user's index as relevance feedback: the user's R documents (r_i containing term i) are folded into the corpus statistics, giving

N′ = N + R
n_i′ = n_i + r_i

w_i = log [ (r_i + 0.5)(N′ − n_i′ − R + r_i + 0.5) / ((n_i′ − r_i + 0.5)(R − r_i + 0.5)) ]

Score = Σ tf_i * w_i
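A sketch of the substitution: the user's counts are added to the corpus statistics before applying the feedback formula. The body mirrors the slide; the function name is mine.

```python
import math

def user_model_weight(N: int, n_i: int, R: int, r_i: int) -> float:
    """Feedback weight with the user's index standing in for explicit
    relevance judgments: N' = N + R, n_i' = n_i + r_i."""
    N_prime = N + R
    n_prime = n_i + r_i
    numerator = (r_i + 0.5) * (N_prime - n_prime - R + r_i + 0.5)
    denominator = (n_prime - r_i + 0.5) * (R - r_i + 0.5)
    return math.log(numerator / denominator)
```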

Page 17:

User Model as Relevance Feedback

[Diagram: the user's index (R documents, r_i containing term i) alongside the world (N documents, n_i containing term i).]

Score = Σ tf_i * w_i

Page 18:

User Model as Relevance Feedback

[Diagram: as before, but the world statistics N and n_i are now drawn from the part of the world related to the query.]

Score = Σ tf_i * w_i

Page 19:

User Model as Relevance Feedback

[Diagram: the user statistics R and r_i are likewise drawn from the part of the user's index related to the query.]

Query focused matching: all four statistics come from query-related documents.

Score = Σ tf_i * w_i

Page 20:

User Model as Relevance Feedback

[Diagram: query focused matching draws N, n_i, R, r_i from query-related documents; world focused matching draws them from the full world and the full user index.]

Score = Σ tf_i * w_i
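A sketch of how the two matching modes might gather the four counts; representing documents as token sets and filtering on query-term overlap are simplifying assumptions made here, not details from the talk.

```python
def doc_stats(docs: list, term: str) -> tuple:
    """Return (number of docs, number of docs containing the term)."""
    return len(docs), sum(1 for d in docs if term in d)

def matching_stats(term: str, user_docs: list, world_docs: list,
                   query_terms: set, query_focused: bool) -> tuple:
    """Gather (N, n_i, R, r_i) for one term under either matching mode.

    Documents are token sets. Query focused restricts both sides to
    query-related documents; world focused uses everything.
    """
    if query_focused:
        user_docs = [d for d in user_docs if query_terms & d]
        world_docs = [d for d in world_docs if query_terms & d]
    N, n_i = doc_stats(world_docs, term)
    R, r_i = doc_stats(user_docs, term)
    return N, n_i, R, r_i
```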

Page 21:

Parameters

Matching

User representation

World representation

Query expansion

Page 22:

Parameters

Matching: Query focused, World focused

User representation

World representation

Query expansion


Page 24:

User Representation

Stuff I’ve Seen (SIS) index
Recently indexed documents
Web documents in SIS index
Query history
Relevance judgments
None

Page 25:

Parameters

Matching: Query focused, World focused

User representation: All SIS, Recent SIS, Web SIS, Query history, Relevance feedback, None

World representation

Query expansion


Page 27:

World Representation

Document representation: Full text; Title and snippet

Corpus representation: Web; Result set – title and snippet; Result set – full text

Page 28:

Parameters

Matching: Query focused, World focused

User representation: All SIS, Recent SIS, Web SIS, Query history, Relevance feedback, None

World representation: Full text or Title and snippet, over the Web, Result set – full text, or Result set – title and snippet

Query expansion


Page 30:

Query Expansion

All words in document v. query focused (only words near the query terms)

Example snippet: “The American Cancer Society is dedicated to eliminating cancer as a major health problem by preventing cancer, saving lives, and diminishing suffering through ...” (the slide shows it twice, with different expansion terms highlighted)
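A sketch of the two expansion options. Reading “query focused” as keeping only words near an occurrence of a query term is an interpretation, and the 3-word window is an arbitrary choice for illustration.

```python
def all_words(text: str) -> set:
    """Expansion option 1: every word in the document/snippet."""
    return set(text.lower().split())

def query_focused_words(text: str, query_terms: set, window: int = 3) -> set:
    """Expansion option 2 (assumed reading): only words within
    `window` positions of an occurrence of a query term."""
    words = text.lower().split()
    keep = set()
    for i, w in enumerate(words):
        if w in query_terms:
            keep.update(words[max(0, i - window): i + window + 1])
    return keep
```

For the cancer snippet above, query_focused_words(snippet, {"cancer"}) would keep only the words around each occurrence of “cancer”, rather than the whole snippet.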

Page 31:

Parameters

Matching: Query focused, World focused

User representation: All SIS, Recent SIS, Web SIS, Query history, Relevance feedback, None

World representation: Full text or Title and snippet, over the Web, Result set – full text, or Result set – title and snippet

Query expansion: All words, Query focused


Page 34:

Personalizing Web Search

Motivation
Algorithms
Results
Future Work

Page 35:

Baselines

Best possible
Random
Text-based ranking
Web ranking
URL boost

[Slide illustration: the previously seen URL http://mail.yahoo.com/inbox/msg10, with +1 boost marks.]
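The slide doesn't spell out the boost rule; one plausible reading, sketched here purely as an assumption, is that a result earns +1 for each remembered URL prefix that its own URL extends.

```python
def url_boost(result_url: str, visited_prefixes: list) -> int:
    """Hypothetical URL boost: +1 per visited URL prefix that the
    result's URL starts with. The exact rule is not given on the
    slide; this is a guess for illustration."""
    return sum(1 for prefix in visited_prefixes
               if result_url.startswith(prefix))

# e.g. url_boost("http://mail.yahoo.com/inbox/msg10",
#                ["http://mail.yahoo.com/", "http://mail.yahoo.com/inbox/"])
# -> 2
```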

Page 36:

Best Parameter Settings

Richer user representations do better: SIS > Recent > Web > Query history > None; suggests a rich client is important

Efficiency hacks don’t hurt: snippets and query-focused matching hold up, and length normalization is not an issue

Query focus is good

Page 37:

Text Alone Not Enough

Better than some baselines: better than random, better than no user representation, better than relevance feedback

Worse than Web results, so blend in other features: Web ranking, URL boost

Page 38:

Good, but Lots of Room to Grow

Best combination: 9.1% improvement
Best possible: 51.5% improvement
Assumes the best Web combination is selected
Only improves results 2/3 of the time

Page 39:

Personalizing Web Search

Motivation
Algorithms
Results
Future Work

Page 40:

Finding the Best Parameter Setting

Almost always some parameter setting improves results

Use learning to select parameters: based on the individual, the query, or the results

Give the user control?

Page 41:

Further Exploration of Algorithms

Larger parameter space to explore: more complex user model subsets, different parsing (e.g., phrases), tuning the BM25 parameters

What is really helping, the generic user model or the personal model? Use different indices for the queries

Deploy the system

Page 42:

Practical Issues

Efficiency issues: can interfaces mitigate some of them?

Merging server and client: use query expansion to get more relevant results into the set to be re-ranked; design snippets for personalization

Page 43:

Thank you!