GaZIR: Gaze-based Zooming Interface for Image Retrieval

DESCRIPTION

GaZIR is a gaze-based interface for searching and browsing images. We first describe the system in detail: how users interact with it and how it uses eye tracking to predict the relevance of the images the user is viewing. We then test the system's relevance predictions and its actual image retrieval accuracy.

GaZIR: “Gaze-based Zooming Interface for Image Retrieval”
László Kozma, Arto Klami, Samuel Kaski
Helsinki Institute for Information Technology HIIT

Intelligent Multimodal Interaction 2014
Francesco Bonadiman, Craig Kershaw

What is GaZIR?

● Gaze-based interface for searching and browsing images
  ○ Collects information from what the user would do naturally, via eye tracking (implicit feedback)
● The user can zoom in and out
  ○ Focusing on the centre or the borders
  ○ Allowing image retrieval

Put a Ring on it!

● Consists of 3 rings of images
  ○ Each consecutive ring shows the next set of relevant images, based on information gathered from the previous ring
● Better for predicting image relevancy
  ○ Avoids users scanning images row by row, as with grid-based layouts
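
The ring arrangement can be illustrated with a small geometric sketch. Only the count of 3 rings comes from the slide; the radii, the number of images per ring, and the function name `ring_positions` are assumptions made for illustration, not details of the actual GaZIR layout.

```python
import math

def ring_positions(n_images, radius, centre=(0.0, 0.0)):
    """Return n_images (x, y) points evenly spaced on a circle of the given radius."""
    cx, cy = centre
    positions = []
    for i in range(n_images):
        angle = 2 * math.pi * i / n_images  # equal angular spacing
        positions.append((cx + radius * math.cos(angle),
                          cy + radius * math.sin(angle)))
    return positions

# Hypothetical layout: 3 concentric rings, with more images on the outer rings.
layout = {
    radius: ring_positions(n_images, radius)
    for radius, n_images in [(1.0, 8), (2.0, 12), (3.0, 16)]
}
```

Because every image on a ring is the same distance from the centre, there is no row-by-row reading order for the eye to follow, which is the property the slide contrasts with grid layouts.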

Rings

Eye Tracking

● Eye movements are tracked via the user’s pupils
  ○ Fixation of >120 ms → relevant image
● 3 main advantages of eye tracking
  ○ Effortlessness: the user only needs to look at the images
  ○ Suits “I-will-know-it-when-I-see-it” search problems
  ○ Hands are not needed → usable with motor disabilities
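
The fixation rule on this slide can be sketched as a simple threshold filter. This is a deliberate simplification: a later slide notes that GaZIR itself uses a more sophisticated relevance predictor. The input format (pairs of image id and fixation duration) and the function name are assumptions for illustration.

```python
FIXATION_THRESHOLD_MS = 120  # from the slide: fixation of >120 ms → relevant image

def relevant_images(fixations, threshold_ms=FIXATION_THRESHOLD_MS):
    """Return the ids of images that received at least one fixation longer
    than threshold_ms.

    fixations: iterable of (image_id, duration_ms) pairs (hypothetical format).
    """
    relevant = set()
    for image_id, duration_ms in fixations:
        if duration_ms > threshold_ms:  # strictly greater, as on the slide
            relevant.add(image_id)
    return relevant

# Example gaze log: image "c" is fixated for exactly 120 ms, so it is not counted.
gaze_log = [("a", 80), ("b", 150), ("a", 121), ("c", 120)]
flagged = relevant_images(gaze_log)
```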

Similar Work

● Only preliminary studies

● Oyekoya et al.
  ○ Simple retrieval → relevance from viewing time
● Klami et al.
  ○ More complex predictions
  ○ Only measured isolated predictions
  ○ Artificial setup

… and GaZIR?

● GaZIR → combines the two approaches
  ○ More sophisticated relevance predictor
  ○ Real retrieval search engine
  ○ Interface designed for gaze-based interaction

Aim of the research paper

● To provide a user interface that makes searching more fluid and natural
● To test whether such an interface is feasible to construct and whether it works in practice with pre-existing content-based image retrieval (CBIR) engines
  ○ Designed to work with any CBIR engine

Data collection

● Simplifications were made
  ○ The user was only expected to zoom inwards
  ○ Resetting the process was not allowed
  ○ Images were only retrieved when zooming in
  ○ The mouse wheel was used for zooming (no eye control)
● Training data was collected to create a model
  ○ Goal: show images closer to users’ expectations

Experiment 1

● 6 different users

● Each performed 6 search tasks
  ○ Look through the MirFlickr database
  ○ Search for images matching the category description
  ○ Indicate which ones were relevant
● On average around 120 images per task
  ○ Eye movements recorded for over 4300 user-task-image instances (6 users × 6 tasks × ~120 images)

Experiment 2

● 3 average users from the previous experiment
● 6 new search tasks:
  ○ 2 with the gaze-based relevance predictor
  ○ 2 with a dummy interface
  ○ 2 with the same interface + explicit feedback (mouse click)
● Performance measurement:
  ○ Counting the proportion of relevant images
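
The performance measure named above, the proportion of relevant images, can be sketched as follows. The function name and the input format (a list of shown image ids and a set of ids judged relevant) are assumptions for illustration; the slides do not say how the counts were stored.

```python
def proportion_relevant(shown, relevant):
    """Fraction of the shown images that were judged relevant.

    shown: iterable of image ids displayed during a task.
    relevant: iterable of image ids the user marked as relevant.
    Returns 0.0 when no images were shown.
    """
    shown = list(shown)
    if not shown:
        return 0.0
    relevant = set(relevant)
    hits = sum(1 for image_id in shown if image_id in relevant)
    return hits / len(shown)

# Example: 2 of the 4 images shown in a task were relevant.
score = proportion_relevant(["a", "b", "c", "d"], {"a", "c"})
```

Computing this score per task allows the three conditions (gaze-based predictor, dummy interface, explicit feedback) to be compared on the same scale.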

Results

● Prediction accuracy > random for all users
  ○ Confirms “relevance through eye movements”
● Huge differences between the users
  ○ Due to different tasks or use of the system
  ○ For some, prediction accuracy → excellent
  ○ For others → slightly better than random
  ○ Explicit feedback (mouse click) → the best
  ○ Predicted feedback → comparable for 50% of tasks
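
Prediction accuracy of this kind is computed by comparing per-image predictions with the relevance labels the user gave. A minimal sketch follows, assuming both are available as equal-length lists of booleans; the function name and data format are hypothetical, and the slides do not state how the random baseline was defined, so none is computed here.

```python
def evaluate(predicted, actual):
    """Compare predicted relevance with the user's labels.

    predicted, actual: equal-length sequences of booleans (True = relevant).
    Returns (counts, accuracy), where counts holds true/false positives
    and negatives.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for p, a in zip(predicted, actual):
        if p and a:
            counts["tp"] += 1   # predicted relevant, was relevant
        elif p and not a:
            counts["fp"] += 1   # predicted relevant, was not
        elif not p and a:
            counts["fn"] += 1   # missed a relevant image
        else:
            counts["tn"] += 1   # correctly predicted non-relevant
    total = len(actual)
    accuracy = (counts["tp"] + counts["tn"]) / total if total else 0.0
    return counts, accuracy

# Example with made-up labels for four images.
counts, accuracy = evaluate([True, True, False, False],
                            [True, False, True, False])
```

Running this separately for each user would expose the kind of per-user variation the slide reports.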

Contribution

● Distinction between false positives and false negatives
  ○ Former: look similar to the relevant images but miss details
  ○ Latter: images (too) easy to recognize as relevant
● Promising results → further experiments
● GaZIR is concluded to be
  ○ “first attempt of building a sophisticated image retrieval interface utilizing implicit gaze information”

Thank you
Any questions?