Developing and Evaluating a Query Recommendation Feature to Assist
Users with Online Information Seeking & Retrieval
With graduate students: Karl Gyllstrom, Earl Bailey
Diane Kelly, Assistant Professor
University of North Carolina at Chapel Hill
Background
- Query formulation is one of the most important and difficult aspects of information seeking
- Users often need to enter multiple queries to investigate different aspects of their information needs
- Some techniques have been developed to assist users with query formulation and reformulation: term suggestion and query recommendation
- However, there are problems associated with each of these techniques …
ALISE Conference | January 23, 2009 | Denver, CO
Problems: Term Suggestion
- Works via relevance feedback (often 'pseudo' relevance feedback is used, which makes assumptions about the goodness of the initial query)
- Users lack the additional cognitive resources to engage in explicit feedback ('form' is awkward)
- Users are too lazy to provide feedback: principle of least effort ('form' is cumbersome)
- Terms are not presented in context, so it may be hard for users to understand how they can help
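The pseudo-relevance feedback idea criticized above can be made concrete with a minimal sketch. This is not the studied system's implementation: the naive overlap-based retrieval step, the tf-idf-style term weighting, and the name `prf_term_suggestions` are all assumptions made for illustration. Note how the whole procedure hinges on the top-ranked documents actually being relevant, which is exactly the assumption the slide questions.

```python
from collections import Counter
import math

def prf_term_suggestions(query, corpus, k=2, n_terms=5):
    """Pseudo-relevance feedback sketch: treat the top-k retrieved
    documents as relevant, then suggest terms that are frequent in
    them but rare in the collection (tf-idf-style weighting)."""
    q_terms = set(query.lower().split())
    # Naive retrieval: rank documents by query-term overlap.
    ranked = sorted(corpus,
                    key=lambda d: len(q_terms & set(d.lower().split())),
                    reverse=True)
    pseudo_relevant = ranked[:k]
    # Document frequency of each term over the whole collection.
    df = Counter()
    for d in corpus:
        df.update(set(d.lower().split()))
    # Score candidate terms from the pseudo-relevant documents.
    tf = Counter(w for d in pseudo_relevant for w in d.lower().split())
    scores = {w: c * math.log(len(corpus) / df[w])
              for w, c in tf.items() if w not in q_terms}
    return [w for w, _ in sorted(scores.items(), key=lambda x: -x[1])[:n_terms]]
```

If the initial query retrieves off-topic documents, every suggested term inherits that error, which is the first problem listed above.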
Problems: Query Suggestion
- It is hard to determine the similarity of previous queries to one another (and to the current query)
- Sparsity problem: assumes a set of queries similar to the current query exists
Our Approach

User Query: dog law enforcement

SUGGESTED TERMS
- Canine
- Legal
- Charge
- Train
- Drug
- Traffic
- Police
- Search
- Officer

SUGGESTED QUERIES
- Dog law enforcement canine
- Canine legal drug traffic
- Dog law police enforcement drug
- Dog law police drug search
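The slide shows terms being assembled into full candidate queries but does not state the combination algorithm (that is Study I's subject). As a hedged sketch only, one could generate candidates by appending small combinations of suggested terms to the user's query; the function name and the cut-off strategy here are assumptions, and a real system would rank the candidates rather than keep the first few.

```python
from itertools import combinations

def build_suggested_queries(query, suggested_terms, max_extra=2, n_queries=4):
    """Hypothetical sketch: form recommended queries by extending the
    user's query with 1..max_extra suggested terms, preserving the
    original query words."""
    candidates = []
    for r in range(1, max_extra + 1):
        for combo in combinations(suggested_terms, r):
            candidates.append(query + " " + " ".join(combo))
    # A real system would score candidates (e.g., with a language
    # model) before choosing; here we simply keep the first few.
    return candidates[:n_queries]
```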
Studies
Study I (System/Algorithm Evaluation, no Users)
- Identify and evaluate techniques for identifying terms from the corpus given a query
- Identify and evaluate techniques for using these terms to create effective and semantically meaningful queries
Studies II-IV (Interactive Evaluation with Users)
- Evaluate automatic query suggestion techniques, including:
  - Comparison with term suggestions
  - Comparison with user-generated suggestions
  - Investigation of effects of topic difficulty and familiarity
- Compare 'remote' study mode with laboratory study mode
Study I: Some Questions
- How do we identify the best terms from the corpus given the user's query?
- How do we select the best terms from those generated?
- In what order do we combine terms?
- How do we incorporate the initial query?
- How long should the recommended queries be?
- How many queries do we suggest?

Our Solution
- Implemented Tan et al.'s (2007) clustering method for selecting terms (language modeling framework)
- TREC-style evaluation using a test collection
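Tan et al.'s actual clustering method is not reproduced here; as a rough illustration of the language-modeling framework it builds on, terms can be scored by their contribution to the divergence between a feedback-document language model and the collection language model. Everything in this sketch (the function name, the additive smoothing with `mu`, the whitespace tokenization) is an assumption for illustration.

```python
import math
from collections import Counter

def lm_term_scores(feedback_docs, collection_docs, mu=0.5):
    """Sketch of language-modeling term selection (not Tan et al.'s
    clustering algorithm): score each term by its contribution to the
    KL divergence between the feedback-document model and the
    smoothed collection model."""
    fb = Counter(w for d in feedback_docs for w in d.lower().split())
    coll = Counter(w for d in collection_docs for w in d.lower().split())
    fb_total, coll_total = sum(fb.values()), sum(coll.values())
    scores = {}
    for w, c in fb.items():
        p_fb = c / fb_total
        # Additive smoothing so terms unseen in the collection
        # don't cause a division by zero.
        p_coll = (coll.get(w, 0) + mu) / (coll_total + mu * len(coll))
        scores[w] = p_fb * math.log(p_fb / p_coll)
    return scores
```

Terms that are common in the feedback documents but rare in the collection receive the highest scores, which is the behavior a term-selection step in this framework aims for.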
Studies II-IV: Common Elements
- Two interfaces: Query Suggestion and Term Suggestion
- Each subject completed two search topics with each interface
- Task: Find and save documents relevant to the information described in the topic
- Up to 15 minutes to search per topic
- Twenty search topics in total, sorted into four difficulty levels: Easy, Medium, Moderate, Difficult
- Each subject completed one topic from each level
- Rotation and counter-balancing …
- Subjects searched a closed corpus of over 1 million newspaper articles (AP, NYT and XN)
Studies II-IV: Common Elements
Several outcome measures:
- Use of suggestions (System Log)
- Performance (Retrieval Results and Docs Saved)
  - Mean Average Precision (Baseline Relevance Assessments)
  - Interactive Precision and Recall (Integrate BRA with User RA)
  - Discounted Cumulated Gain (User RA)
- Perceived Effectiveness and Satisfaction (Exit Questionnaire)
- Preference (Exit Questionnaire)
- Qualitative Feedback (Exit Questionnaire)
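Two of the performance measures above are standard IR metrics and can be stated compactly. The sketch below assumes binary relevance for average precision and graded gains for DCG (discounting by log2 of the rank, as in Järvelin and Kekäläinen's formulation); MAP is then the mean of average precision over topics.

```python
import math

def average_precision(ranked_relevance):
    """Average precision for one topic: mean of precision@k over the
    ranks k at which a relevant document appears (binary relevance)."""
    hits, precisions = 0, []
    for k, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / k)
    return sum(precisions) / hits if hits else 0.0

def dcg(gains):
    """Discounted cumulated gain: graded gains discounted by log2 of
    the rank; ranks 1 and 2 are effectively undiscounted."""
    return sum(g / (math.log2(k) if k > 1 else 1.0)
               for k, g in enumerate(gains, start=1))
```

For example, a ranking with relevant documents at positions 1 and 3 has average precision (1/1 + 2/3) / 2 ≈ 0.83.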
Studies II-IV: Common Elements
And a few more independent variables:
- Topic Difficulty (Pre-determined Level)
- Subject's Topic Knowledge (Pre-topic Questionnaire)
- Subject's Experienced Difficulty (Exit Questionnaire)
Studies II-IV: Common Procedures
START → Consent → Demographic Questionnaire → Search Experience Questionnaire → [Repeat for 2 Systems: Pre-Topic Questionnaire → Subject Searches (Repeat for 2 Topics)] → Exit Questionnaire → END
Studies II-IV: Differences
- Study II (n=43): Subjects completed this study remotely
- Study III (n=25): Eye-tracking data collected from the first 12 subjects
- Study IV (n=22): Additional qualitative data collection via stimulated recall for two searches (one per system)
- Studies III and IV varied the source of suggestions: half of the subjects received system-generated suggestions (same as Study II) and half received user-generated suggestions (extracted from Study II subjects)
Preliminary Results: Use
Preliminary Results: Use and Source of Suggestions
Preliminary Results: Use & Topic
Preliminary Results: Perceived Effectiveness and Satisfaction
- For 7 of the 11 Exit Questionnaire items, query suggestion was rated higher than term suggestion. These items concerned:
  - 'Cognitive Assistance' (e.g., helped me think more about the topic and understand its different aspects)
  - Satisfaction
- Term suggestion was rated higher with respect to:
  - Modification
  - Ease of Use
- There were few differences in ratings of system-generated and user-generated suggestions
Preliminary Results: Preference
Next Steps
Continue data analysis …
- Impact of topic difficulty and knowledge
- Eye-tracking data
- 'Typing' of suggestions
- Temporal/Stage Analysis