Experimental Components for the Evaluation of Interactive Information Retrieval Systems

Pia Borlund

Dawn Filan, 3/30/04, 610:551

The Goal

• To evaluate IR systems in a way that is as close to the actual information-seeking process as possible, while still operating in a controlled environment.

Research Questions

• Can simulated information needs be substituted for real information needs?

• What makes a good simulated situation with reference to semantic openness and the types of topics of the simulated situations?

Hybrid Evaluation Model

• Increased demand:
  – Relevance revolution
  – Cognitive revolution
  – Interactive revolution

• Combines two main approaches:
  – System-driven approach (controlled)
  – Cognitive user-centered approach (realism)

The Experimental Setting

3 components:

• The involvement of potential users as test persons

• The application of dynamic and individual information needs

• Use of dynamic relevance judgements

Ideal IIR Setting

• Real users who state personal information needs to the system and judge the relevance of the retrieved documents under controlled circumstances.

• Use of “simulated work task”

• Must be under controlled circumstances so that results can be compared across systems and user groups.

Simulated Work Task

• Triggers and develops a simulated information need by allowing for user interpretations of the situation.

• Serves as the platform against which situational relevance is measured.

• Two variants applied (see the sketch below):
  – Complete need applied (sim 1)
  – Only the situation applied (sim 2)
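A minimal Python sketch of how the two variants might be represented as test material; the field names and the example wording are illustrative assumptions, not text from Borlund's study:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SimulatedWorkTask:
    """One simulated work task situation handed to a test person."""
    situation: str                     # the cover story intended to trigger an information need
    indicative_request: Optional[str]  # present in sim 1, omitted in sim 2

# sim 1: the complete need is applied (situation plus an indicative request)
sim1 = SimulatedWorkTask(
    situation=("You are writing a term paper and need background material "
               "on a topic you know little about."),
    indicative_request="Find documents that introduce the topic and its central issues.",
)

# sim 2: only the situation is applied; the test person interprets it
# and formulates the request entirely on their own
sim2 = SimulatedWorkTask(
    situation=("You are writing a term paper and need background material "
               "on a topic you know little about."),
    indicative_request=None,
)
```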

Situational Relevance

• A user-centered, realistic, and dynamic measure of relevance

• Judgements are not based on the request or query, but rather relate to the person’s requirements and mental state at the time of the retrieval

• Assessed continuously and interactively during the session
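A small sketch of what logging such continuous, dynamic judgements could look like; the schema (graded score, query iteration, timestamp) and the document id are assumptions made for illustration, not the instrument used in the study:

```python
import time
from dataclasses import dataclass, field

@dataclass
class RelevanceJudgement:
    doc_id: str
    score: int            # graded judgement, e.g. on a 0-3 scale
    query_iteration: int  # which reformulation of the query was active
    timestamp: float

@dataclass
class SessionLog:
    """Collects judgements as they are made, so relevance can shift during the session."""
    judgements: list = field(default_factory=list)

    def record(self, doc_id: str, score: int, query_iteration: int) -> None:
        self.judgements.append(
            RelevanceJudgement(doc_id, score, query_iteration, time.time())
        )

log = SessionLog()
log.record("DOC-001", score=2, query_iteration=1)  # judged against the need as it stands now
log.record("DOC-001", score=3, query_iteration=3)  # the same document may be re-judged later
```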

Relevance (Schamber, Eisenberg, and Nilan)

• A multidimensional cognitive concept whose meaning is dependent on users’ perceptions of information and their information needs

• A dynamic concept that depends on users’ judgements of the quality of the relationship between information and the information need

• A complex but systematic and measurable concept if approached conceptually and operationally from the user’s perspective

Meta-Evaluation

• Should simulated work tasks be recommended as a component of the experimental setting for evaluating IIR systems?

Meta-Evaluation Questions

• The possibility of substituting real information needs with simulated information needs through the application of simulated work task situations

• Whether the variants of the simulated task make any difference to the test persons’ treatment of the information need

• What characterizes a good simulated work task in terms of how tailored the task should be to the user

Test Setting

• Full-text online system applying TREC data and a probabilistic retrieval engine (see the sketch below)

• Search activity and relevance scores were logged

• 24 users from various academic backgrounds and education levels

• Asked to prepare a personal information need
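The slides do not name the retrieval engine, so as a stand-in here is a minimal BM25-style sketch of probabilistic ranking over pre-tokenised documents; the function, parameters, and toy data are all illustrative:

```python
import math
from collections import Counter

def bm25_score(query_terms, doc_terms, doc_freqs, num_docs, avg_doc_len, k1=1.2, b=0.75):
    """Score one pre-tokenised document against a query with the BM25 ranking function."""
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        if term not in tf:
            continue
        df = doc_freqs[term]  # how many documents in the collection contain the term
        idf = math.log(1 + (num_docs - df + 0.5) / (df + 0.5))
        # length-normalised term-frequency component
        norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(doc_terms) / avg_doc_len))
        score += idf * norm
    return score

# toy example: rank two documents for the query "information retrieval"
docs = {"d1": "interactive information retrieval evaluation".split(),
        "d2": "probabilistic models of retrieval".split()}
doc_freqs = Counter(term for terms in docs.values() for term in set(terms))
avg_len = sum(len(t) for t in docs.values()) / len(docs)
query = "information retrieval".split()
ranked = sorted(docs, key=lambda d: bm25_score(query, docs[d], doc_freqs, len(docs), avg_len),
                reverse=True)
print(ranked)  # d1 ranks first: it matches both query terms
```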

Testing Procedure

• Brief questionnaire

• Introduction

• Explanation of the test person’s role

• Demo of the system

• Execution of 6 search tasks (1 training, 1 real, and 4 simulated tasks; see the sketch below)

• Post-search interview
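A compact sketch of how one test person's session could be assembled from these steps; the step names come from the slide, while any ordering or counterbalancing of the simulated tasks is not specified there and is not modelled here:

```python
def build_session(personal_need: str, simulated_tasks: list) -> list:
    """Assemble the ordered steps of a single test person's session."""
    assert len(simulated_tasks) == 4, "each person works through four simulated tasks"
    search_tasks = ["training task", f"real task: {personal_need}"] + simulated_tasks
    return (
        ["brief questionnaire",
         "introduction",
         "explanation of the test person's role",
         "system demo"]
        + search_tasks
        + ["post-search interview"]
    )

for step in build_session("background for a term paper",
                          ["sim task A", "sim task B", "sim task C", "sim task D"]):
    print(step)
```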

Conclusions

• One can substitute real information needs with simulated information needs through the application of simulated work tasks

• One can mix simulated and real information needs

• Treatment of the information need did not differ between the group that received both the work task and the request and the group that received only the work task.