Retrieval and Evaluation Techniques for Personal Information Jin Young Kim 7/26 Ph.D Dissertation Seminar


Page 1: Retrieval and Evaluation Techniques for Personal Information Jin Young Kim 7/26 Ph.D Dissertation Seminar

Retrieval and Evaluation Techniques for Personal Information

Jin Young Kim

7/26 Ph.D Dissertation Seminar

Page 2:

Personal Information Retrieval (PIR)

The practice and study of supporting users in retrieving their personal information effectively

Page 3:

Personal Information Retrieval in the Wild

Everyone has unique information & practices
Different information and information needs
Different preferences and behaviors

Many existing software solutions
Platform-level: desktop search, folder structure
Application-level: email, calendar, office suites

Page 4:

Previous Work in PIR (Desktop Search)

Focus
User interface issues [Dumais03,06]
Desktop-specific features [Solus06] [Cohen08]

Limitations
Each based on a different environment and user group
None of them performed comparative evaluation
Research findings do not accumulate over the years

Page 5:

Our Approach

Develop general techniques for PIR
Start from essential characteristics of PIR
Applicable regardless of users and information types

Make contributions to related areas
Structured document retrieval
Simulated evaluation for known-item finding

Build a platform for sustainable progress
Develop repeatable evaluation techniques
Share the research findings and the data

Page 6:

Essential Characteristics of PIR

Many document types, with unique metadata for each type
People combine search and browsing [Teevan04]
Long-term interactions with a single user
People mostly find known-items [Elsweiler07]
Privacy concerns for the data set

These characteristics motivate the three components of this work: Field-based Search Models, the Associative Browsing Model, and Simulated Evaluation Methods.

Page 7:

Search and Browsing Retrieval Models

Challenge: Users may remember different things about the document. How can we present effective results for both cases?

[Diagram: the user's memory of a document (e.g. "Registration", "James") feeds two paths: lexical memory leads to a search query and a ranked result list; associative memory leads to browsing.]

Page 8:

Information Seeking Scenario in PIR

[Diagram: a session alternating between user input and system output, with example inputs "James", "Registration", "2011".]

A user initiates a session with a keyword query.
The user switches to browsing by clicking on an email document.
The user switches back to search with a different query.

Page 9:

Simulated Evaluation Techniques

Challenge: The user's query originates from what she remembers. How can we simulate the user's querying behavior realistically?

[Diagram: as before, the user's lexical and associative memory drive the simulated search queries and browsing clicks.]

Page 10:

Research Questions

Field-based Search Models
How can we improve retrieval effectiveness in PIR?
How can we improve type prediction quality?

Associative Browsing Model
How can we enable browsing support for PIR?
How can we improve the suggestions for browsing?

Simulated Evaluation Methods
How can we evaluate a complex PIR system by simulation?
How can we establish the validity of simulated evaluation?

Page 11:

Field-based Search Models

Page 12:

Searching for Personal Information

An example of desktop search

Page 13:

Field-based Search Framework for PIR

Type-specific Ranking
Rank documents in each document collection (type)

Type Prediction
Predict the document type relevant to the user's query

Final Results Generation
Merge into a single ranked list

Page 14:

Type-specific Ranking for PIR

Each collection has type-specific features
Thread-based features for emails
Path-based features for documents

Most of these documents have rich metadata
Email: <sender, receiver, date, subject, body>
Document: <title, author, abstract, content>
Calendar: <title, date, place, participants>

We focus on developing general retrieval techniques for structured documents.
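The fielded document types above can be sketched as a small data structure. This is illustrative only: the field names follow the slide's examples, and the whitespace tokenizer is a hypothetical stand-in.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FieldedDoc:
    """One personal document with typed, per-field metadata."""
    doc_type: str                          # e.g. "email", "document", "calendar"
    fields: Dict[str, str] = field(default_factory=dict)

    def terms(self, field_name: str) -> List[str]:
        # Whitespace tokenization, lowercased -- purely for illustration.
        return self.fields.get(field_name, "").lower().split()

email = FieldedDoc("email", {
    "sender": "james",
    "subject": "Registration deadline",
    "body": "Please register before Friday",
})
```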

Page 15:

Structured Document Retrieval

Field operator / advanced search interface
The user's search terms are found in multiple fields

Understanding Re-finding Behavior in Naturalistic Email Interaction Logs. Elsweiler, D., Harvey, M., Hacker, M. [SIGIR'11]

Page 16:

Structured Document Retrieval: Models

Document-based Retrieval Model
Score each document as a whole

Field-based Retrieval Model
Combine evidence from each field

[Diagram: document-based scoring matches query terms q1..qm against the document as a whole; field-based scoring matches each query term against fields f1..fn and combines the per-field scores with field weights w1..wn.]
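A minimal sketch of the two scoring schemes, using Dirichlet-smoothed unigram language models. The smoothing choice, the value of mu, and the fixed field weights are assumptions for illustration, not taken from the slides.

```python
import math

def lm_prob(term, text_terms, collection_terms, mu=10.0):
    """Dirichlet-smoothed unigram language-model probability P(term | text)."""
    tf = text_terms.count(term)
    cf = max(collection_terms.count(term), 1)     # avoid zero background prob
    p_c = cf / len(collection_terms)
    return (tf + mu * p_c) / (len(text_terms) + mu)

def document_score(query, doc_fields, collection, mu=10.0):
    """Document-based model: score the document as a whole (fields concatenated)."""
    whole = [t for f in doc_fields.values() for t in f]
    return sum(math.log(lm_prob(q, whole, collection, mu)) for q in query)

def field_score(query, doc_fields, weights, collection, mu=10.0):
    """Field-based model: combine per-field evidence with fixed weights w_j."""
    return sum(
        math.log(sum(w * lm_prob(q, doc_fields[f], collection, mu)
                     for f, w in weights.items()))
        for q in query)
```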

Page 17:

Field Relevance Model for Structured IR

Field Relevance: different fields are important for different query terms.
'james' is relevant when it occurs in <to>
'registration' is relevant when it occurs in <subject>

Page 18:

Estimating the Field Relevance: Overview

If the user provides feedback: the relevant document provides sufficient information.
If no feedback is available: combine field-level term statistics from multiple sources.

[Diagram: field-level term distributions (content, title, from/to) from the collection plus the top-k documents approximate those of the relevant documents.]

Page 19:

Estimating Field Relevance using Feedback

Assume a user who marked DR as relevant.
Estimate the field relevance from the field-level term distribution of DR.
Field Relevance: <to> is relevant for 'james'; content is relevant for 'registration'.

We can personalize the results accordingly:
Rank higher the documents with a similar field-level term distribution.
This weight is provably optimal under the LM retrieval framework.

Page 20:

Estimating Field Relevance without Feedback

Linear combination of multiple sources, with weights estimated using training queries.

Features
Field-level term distribution of the collection (unigram and bigram LM)
Field-level term distribution of the top-k docs (unigram and bigram LM; a form of pseudo-relevance feedback)
A priori importance of each field (wj), estimated using held-out training queries

The unigram collection feature is the same as PRM-S; the fixed field prior is similar to MFLM and BM25F.
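The linear combination of sources might look like the following sketch. The mixing weights `lambdas` stand in for the weights the slides learn from training queries; the dict-based representation of per-field term statistics is an assumption.

```python
def field_relevance(sources, lambdas, prior):
    """
    Combine field-level term statistics from several sources into a
    normalized field-relevance distribution P(F_j | q_i) for one query term.
    sources: list of dicts mapping field -> P(term | field) under that source
             (e.g. collection LM, top-k-docs LM)
    lambdas: one mixing weight per source (illustrative values here)
    prior:   a-priori field importance w_j
    """
    raw = {f: prior[f] * sum(l * s.get(f, 0.0) for l, s in zip(lambdas, sources))
           for f in prior}
    z = sum(raw.values()) or 1.0           # normalize to a distribution
    return {f: v / z for f, v in raw.items()}
```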

Page 21:

Retrieval Using the Field Relevance

Comparison with previous work: earlier models apply fixed field weights w1..wn shared across all query terms.

Ranking in the Field Relevance Model: each query term qi receives its own per-term field weight P(Fj|qi); the per-term field scores are multiplied by these weights, summed over fields, and then combined across query terms.

[Diagrams of both scoring schemes omitted.]
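The per-term weighting scheme can be sketched as follows, where the field-relevance dict plays the role of P(Fj|qi) and the per-field scores the role of P(qi|fj). The log-sum combination across terms is an assumption consistent with LM-style ranking.

```python
import math

def frm_score(field_relevance, field_scores):
    """
    Field Relevance Model ranking sketch: for each query term, weight each
    field's score P(q_i | f_j) by the per-term field relevance P(F_j | q_i),
    sum over fields, then sum log-scores over query terms.
    field_relevance: {term: {field: P(F_j | q_i)}}
    field_scores:    {term: {field: P(q_i | f_j)}} for one document
    """
    score = 0.0
    for term, rel in field_relevance.items():
        per_term = sum(rel[f] * field_scores[term].get(f, 0.0) for f in rel)
        score += math.log(max(per_term, 1e-12))   # floor avoids log(0)
    return score
```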

Page 22:

Evaluating the Field Relevance Model

Retrieval Effectiveness (Metric: Mean Reciprocal Rank)

           DQL     BM25F   MFLM    FRM-C   FRM-T   FRM-R
TREC       54.2%   59.7%   60.1%   62.4%   66.8%   79.4%
IMDB       40.8%   52.4%   61.2%   63.7%   65.7%   70.4%
Monster    42.9%   27.9%   46.0%   54.2%   55.8%   71.6%

DQL, BM25F, and MFLM use fixed field weights; the FRM variants use per-term field weights.

[Bar chart of the same numbers omitted.]

Page 23:

Type Prediction Methods

Field-based collection Query-Likelihood (FQL)
Calculate the QL score for each field of a collection
Combine the field-level scores into a collection score

Feature-based Method
Combine existing type-prediction methods
Grid search / SVM for finding the combination weights
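A sketch of FQL-style type prediction under assumed data structures: per-field collection language models as plain dicts, and illustrative field weights (the slides learn or tune these).

```python
def fql_score(query, collection_fields, field_weights):
    """
    Field-based collection Query-Likelihood (FQL) sketch: score each field's
    collection-level language model against the query, then combine the
    field-level scores into one collection score.
    collection_fields: {field: {term: P(term | field, collection)}}
    """
    score = 1.0
    for q in query:
        score *= sum(w * collection_fields[f].get(q, 1e-9)
                     for f, w in field_weights.items())
    return score

def predict_type(query, collections, field_weights_by_type):
    """Return the document type whose collection best explains the query."""
    return max(collections, key=lambda t: fql_score(
        query, collections[t], field_weights_by_type[t]))
```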

Page 24:

Type Prediction Performance (% of queries with correct prediction)

[Charts: accuracy on the Pseudo-desktop Collections and the CS Collection omitted.]

FQL improves performance over CQL.
Combining features improves performance further.

Page 25:

Summary So Far…

Field relevance model for structured document retrieval
Enables relevance feedback through field weighting
Improves performance using linear feature-based estimation

Type prediction methods for PIR
Field-based type prediction method (FQL)
Combining features improves performance further

We now move on to the associative browsing model: what happens when users can't recall good search terms?

Page 26:

Associative Browsing Model

Page 27:

Recap: Retrieval Framework for PIR

[Diagram: keyword search (e.g. for "Registration") and associative browsing (e.g. from "James") over the same personal collection.]

Page 28:

User Interaction for Associative Browsing

Users enter a concept or document page via search.
The system provides a list of suggestions for browsing.

[Figure: data model and user interface omitted.]

Page 29:

How can we build associations?

Manually? "Participants wouldn't create associations beyond simple tagging operations" (Sauermann et al. 2005)

Automatically? How would it match the user's preference?

Page 30:

Building the Associative Browsing Model

1. Document Collection
2. Concept Extraction
3. Link Extraction (term similarity, temporal similarity, co-occurrence)
4. Link Refinement (click-based training)

Page 31:

Link Extraction and Refinement

Link Scoring: combination of link-type scores
S(c1,c2) = Σi [ wi × Linki(c1,c2) ]

Link Presentation
Ranked list of suggested items; users click on them for browsing

Link Refinement (training wi): maximize click-based relevance
Grid Search: maximize retrieval effectiveness (MRR)
RankSVM: minimize the error in pairwise preferences

Link types
Concepts: term vector similarity, temporal similarity, tag similarity, string similarity, co-occurrence
Documents: term vector similarity, temporal similarity, tag similarity, path/type similarity, concept similarity

(Example concept: "Search Engine")
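The link-scoring formula S(c1,c2) = Σi wi × Linki(c1,c2) can be sketched directly. The link-type names and the fixed weights below are illustrative; the slides learn the weights from clicks via grid search or RankSVM.

```python
def link_score(features, weights):
    """Weighted combination of link-type scores: S = sum_i w_i * Link_i."""
    return sum(weights.get(k, 0.0) * v for k, v in features.items())

def suggestions(candidates, weights, top_k=5):
    """Rank candidate items for browsing by their combined link score."""
    scored = sorted(candidates.items(),
                    key=lambda kv: link_score(kv[1], weights), reverse=True)
    return [name for name, _ in scored[:top_k]]
```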

Page 32:

Evaluating the Associative Browsing Model

Data set: CS Collection
Collected public documents in the UMass CS department
CS dept. people competed in known-item finding tasks

Value of browsing for known-item finding
% of sessions in which browsing was used
% of sessions in which browsing was used and led to success

Quality of browsing suggestions
Mean Reciprocal Rank using clicks as judgments
10-fold cross validation over the collected click data

Page 33:

Value of Browsing for Known-item Finding

Evaluation Type   Total (#sessions)   Browsing used    Successful outcome
Simulation        63,260              9,410 (14.8%)    3,957 (42.0%)
User Study (1)    290                 42 (14.5%)       15 (35.7%)
User Study (2)    142                 43 (30.2%)       32 (74.4%)

(User Study (1): documents only; User Study (2): documents + concepts.)

Comparison with simulation results: the simulation roughly matches the user studies in overall usage and success ratio.

The value of associative browsing: browsing was used in 30% of all sessions, and led to success in about 75% of the sessions in which it was used.

Page 34:

Quality of Browsing Suggestions

[Charts: MRR of concept browsing (features: title, content, tag, time, string, cooc, occur) and document browsing (features: title, content, tag, time, topic, path, type, concept), each compared against Uniform, Grid, and SVM weight combinations, on CS/Top1 and CS/Top5; omitted.]

Page 35:

Simulated Evaluation Methods

Page 36:

Challenges in PIR Evaluation

Hard to create a 'test collection'
Each user has different documents and habits
People will not donate their documents and queries for research

Limitations of user studies
Experimenting with a working system is costly
Experimental control is hard with real users and tasks
Data is not reusable by third parties

Page 37:

Our Approach: Simulated Evaluation

Simulate the components of evaluation
Collection: the user's documents with metadata
Task: search topics and relevance judgments
Interaction: query and click data

Page 38:

Simulated Evaluation Overview

Simulated document collections
Pseudo-desktop Collections: subsets of the W3C mailing list + other document types
CS Collection: UMass CS mailing list / calendar items / crawl of homepages

Evaluation methods
                       Controlled User Study             Simulated Interaction
Field-based Search     DocTrack search game              Query generation methods
Associative Browsing   DocTrack search+browsing game     Probabilistic user modeling

Page 39:

Controlled User Study: DocTrack Game

Procedure
Collect public documents in the UMass CS dept. (CS Collection)
Build a web interface where participants can find documents
People in the CS department participated

DocTrack search game
20 participants / 66 games played
984 queries collected for 882 target documents

DocTrack search+browsing game
30 participants / 53 games played
290 + 142 search sessions collected

Page 40:

DocTrack Game

*Users can use both search and browsing in the DocTrack search+browsing game.

Page 41:

Query Generation for Evaluating PIR

Known-item finding for PIR
A target document represents an information need
Users would take terms from the target document

Query generation for PIR
Randomly select a target document
Algorithmically take terms from the document

Parameters of query generation
Choice of extent: document [Azzopardi07] vs. field
Choice of term: uniform vs. TF vs. IDF vs. TF-IDF [Azzopardi07]
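The term-selection parameters can be sketched as a small query generator. The scoring variants follow the uniform/TF/IDF/TF-IDF choices above, while the sampling details (with replacement, fixed query length, the floor on weights) are assumptions.

```python
import math
import random

def generate_query(target_terms, doc_freq, n_docs, k=3, strategy="tfidf", rng=None):
    """
    Simulated known-item query: draw k terms from the target document with
    probability proportional to a term-selection score (uniform / tf / idf /
    tfidf). target_terms: token list of the target document;
    doc_freq: {term: document frequency}; n_docs: collection size.
    """
    rng = rng or random.Random(0)
    vocab = sorted(set(target_terms))

    def weight(t):
        tf = target_terms.count(t)
        idf = math.log(n_docs / (1 + doc_freq.get(t, 0)))
        return {"uniform": 1.0, "tf": tf, "idf": idf, "tfidf": tf * idf}[strategy]

    weights = [max(weight(t), 1e-9) for t in vocab]   # keep weights positive
    return rng.choices(vocab, weights=weights, k=k)
```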

Page 42:

Validation of Generated Queries

Basic idea
Use the set of human-generated queries for validation
Compare at the level of query terms and of retrieval scores

Validation by comparing query terms
The generation probability of a manual query q under Pterm

Validation by comparing retrieval scores [Azzopardi07]
Two-sided Kolmogorov-Smirnov test
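The two-sample Kolmogorov-Smirnov statistic over retrieval-score samples can be computed as in this pure-Python sketch. It returns only the statistic (the maximum gap between the two empirical CDFs), not the p-value used by the full two-sided test.

```python
from bisect import bisect_right

def ks_statistic(a, b):
    """
    Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all observed
    score values x, where a and b are two samples of retrieval scores
    (e.g. from generated vs. manual queries).
    """
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))

    def ecdf(xs, x):
        # Fraction of the sample that is <= x.
        return bisect_right(xs, x) / len(xs)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)
```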

Page 43:

Validation Results for Generated Queries

Validation based on query terms
Validation based on retrieval score distribution
[Result figures omitted.]

Page 44:

Probabilistic User Model for PIR

Query generation model: term selection from a target document
State transition model: use browsing when a result looks marginally relevant
Link selection model: click on browsing suggestions based on perceived relevance

Page 45:

A User Model for Link Selection

The user's level of knowledge:
Random: clicks uniformly at random on the ranked list
Informed: more likely to click on a more relevant item
Oracle: always clicks on the most relevant item

Relevance is estimated using the position of the target item.

[Diagram: three ranked lists illustrating the three click behaviors.]
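The three knowledge levels can be sketched as one click function. The proportional-to-relevance rule for the informed user is an assumption consistent with "more likely to click on a more relevant item"; the seeded RNG default is for reproducibility only.

```python
import random

def click(ranked_relevance, knowledge="informed", rng=None):
    """
    Link-selection sketch: given relevance scores of the ranked suggestions,
    a 'random' user clicks uniformly, an 'informed' user clicks with
    probability proportional to relevance, and an 'oracle' user always
    clicks the most relevant item. Returns the clicked rank (0-based).
    """
    rng = rng or random.Random(0)
    n = len(ranked_relevance)
    if knowledge == "random":
        return rng.randrange(n)
    if knowledge == "oracle":
        return max(range(n), key=lambda i: ranked_relevance[i])
    weights = [max(r, 1e-9) for r in ranked_relevance]   # keep weights positive
    return rng.choices(range(n), weights=weights, k=1)[0]
```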

Page 46:

Success Ratio of Browsing

Varying the level of knowledge and the fan-out for simulation.
Exploration is valuable for users with a low level of knowledge.

[Chart: success ratio for random, informed, and oracle users at fan-out FO1-FO3 (more exploration toward FO3); omitted.]

Page 47:

Community Efforts using the Data Sets

Page 48:

Conclusions & Future Work

Page 49:

Major Contributions

Field-based Search Models
Field relevance model for structured document retrieval
Field-based and combination-based type prediction methods

Associative Browsing Model
An adaptive technique for generating browsing suggestions
Evaluation of associative browsing in known-item finding

Simulated Evaluation Methods for Known-item Finding
DocTrack game for controlled user studies
Probabilistic user model for generating simulated interaction

Page 50:

Field Relevance for Complex Structures

Current work assumes documents with a flat structure.

Field relevance for complex structures?
XML documents with hierarchical structure
Joined database relations with graph structure

Page 51:

Cognitive Model of Query Generation

Current query generation methods assume:
Queries are generated from the complete document
Query terms are chosen independently from one another

Relaxing these assumptions
Model the user's degradation in memory
Model the dependency in query term selection

Ongoing work
Graph-based representation of documents
Query terms can be chosen by random walk

Page 52:

Thank you for your attention!
Special thanks to my advisor, coauthors, and all of you here!

Are we closer to the superhuman now?

Page 53:

One More Slide: What I Learned…

Start from what's happening in the user's mind
Field relevance / query generation, …

Balance user input and algorithmic support
Generating suggestions for associative browsing

Learn from your peers & make contributions
Query generation method / DocTrack game
Simulated test collections & workshop