Log Analysis to Understand Medical Professionals' Image Searching Behaviour
Theodora Tsikrika
Henning Müller
Charles E. Kahn
Overview
• Medical image retrieval
• Motivation of our work
• Methods
• Log file analysis
• Search strategies
• Frequent information needs
• Use as topics for a retrieval benchmark
• Conclusions
Medical image retrieval
• Medical professionals frequently and increasingly search for visual information (images, videos)
• Particularly radiologists often search for images
• Internet search increasingly replaces search in
reference books and discussions with colleagues
• Images are important for differential diagnosis and for finding explanations for unclear visual patterns
• Different types of image search systems
• Text-based search for images
• Content-based search for images
• Visual characteristics are extracted from images
Motivation
• Knowing search tasks, goals and formulations of user groups for information retrieval is important
• To build new IR systems or benchmark existing ones
• Several surveys have been performed
• Log file analyses were done as well
• MedLine log files, but not specifically for image search
• HONmedia search, less focused: its users are not radiologists but rather the general public and health professionals
• Image search on the Internet for radiologists has increased strongly
• Goldminer, Yottalook, MedSearch, SpringerImages, …
Log file analysis
• Session level, query level, term level
• Search logs have received much attention as a way to learn about user behavior
• Cautionary example: the release of the AOL log raised privacy concerns
• Amount of logged information differs (e.g., IP addresses, time stamps)
• Session level is interesting as much can be learned about behavior, query modifications, even satisfaction
• Terms added, removed, changed?
• Query and term level often focus on frequency
• Most common terms and queries
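Query-level and term-level frequency counts of the kind described above can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: the function name `frequency_tables` and the sample queries are made up, and whitespace tokenization is an assumption.

```python
from collections import Counter

def frequency_tables(queries):
    """Return (query_counts, term_counts) for a list of normalized queries."""
    # Query level: how often each full query string occurs
    query_counts = Counter(queries)
    # Term level: how often each whitespace-separated term occurs
    term_counts = Counter(term for q in queries for term in q.split())
    return query_counts, term_counts

# Made-up sample data, for illustration only
sample = ["mri knee", "ct chest", "mri knee", "knee"]
query_counts, term_counts = frequency_tables(sample)
print(query_counts.most_common(1))
print(term_counts.most_common(1))
```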
Methods
• ARRS Goldminer made a log file available
• 25’000 consecutive searches by medical professionals
• Search system is very popular with radiologists
• Allows search terms, selection of gender, age and modality
• Search term normalization
• All lower case, removing special characters, quotes
• Manual work: “xray”, “x-ray”, “x ray” are all mapped to “xray”
• Removal of identical consecutive queries
• No time stamps available, no IP address
• Proximity & overlapping search terms to define
sessions
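The normalization and duplicate-removal steps listed above might look roughly like the following sketch. The function names and the `VARIANT_PATTERNS` table are hypothetical; the slides only give the “xray” example of the manual mapping, and the exact set of special characters removed is an assumption.

```python
import re

# Hypothetical variant table; only the "xray" mapping is given in the slides.
VARIANT_PATTERNS = [
    (re.compile(r"\bx[\s-]?ray\b"), "xray"),
]

def normalize(query: str) -> str:
    """Lower-case a query, map known variants, strip special characters."""
    q = query.lower()
    # Apply variant mappings before punctuation is stripped,
    # so that "x-ray" is still recognizable
    for pattern, canonical in VARIANT_PATTERNS:
        q = pattern.sub(canonical, q)
    # Replace quotes and other special characters with spaces (assumed rule)
    q = re.sub(r"[^\w\s]", " ", q)
    # Collapse runs of whitespace
    return " ".join(q.split())

def drop_consecutive_duplicates(queries):
    """Remove a query when it is identical to the one directly before it."""
    result = []
    for q in queries:
        if not result or result[-1] != q:
            result.append(q)
    return result

raw = ['"Chest X-Ray"', "chest x ray", "Knee MRI"]  # made-up sample
cleaned = drop_consecutive_duplicates([normalize(q) for q in raw])
print(cleaned)
```

Normalizing before removing duplicates means that spelling variants of the same query, submitted back to back, collapse into one entry.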
Results of the analysis
• 23’033 queries after preprocessing, 14’413 of these are unique queries (63%)
• Average query length of 2.24 words, 2.46 for unique queries
• Similar to web search, one term fewer than MedLine
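As a quick check of the figures above, the share of unique queries and the number of searches dropped in preprocessing follow directly from the reported counts. The variable names are illustrative, and the 25’000 figure from the Methods slide is taken at face value.

```python
logged = 25_000               # consecutive searches in the log
after_preprocessing = 23_033  # queries left after preprocessing
unique = 14_413               # distinct queries among those

# Searches dropped during preprocessing
removed = logged - after_preprocessing
# Share of distinct queries among the preprocessed queries
unique_share = unique / after_preprocessing

print(removed)                # 1967
print(f"{unique_share:.1%}")  # 62.6%, reported as 63% on the slide
```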
• Imaging modalities:
• MRI (586), CT (425), ultrasound (199), xray (139), PET (34), PET/CT (13), angiography (13), echo (11), radiography (10), tomography (6), fMRI (3), PET/MRI (1)
• This despite the possibility to filter for modalities
• Not logged
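Counting how many queries mention a modality term could be sketched as follows. The term list mirrors the single-word modalities named above, but how the original analysis handled combined terms such as PET/CT is not stated, so this sketch simply counts each single-word term once per query. All names here are hypothetical.

```python
from collections import Counter

# Single-word modality terms taken from the list above; the treatment of
# combined modalities (e.g. PET/CT) is an assumption left out of this sketch.
MODALITY_TERMS = {
    "mri", "ct", "ultrasound", "xray", "pet", "angiography",
    "echo", "radiography", "tomography", "fmri",
}

def modality_mentions(queries):
    """Count, per modality term, the number of queries that contain it."""
    counts = Counter()
    for q in queries:
        # Use a set so a term repeated within one query is counted once
        for term in set(q.split()) & MODALITY_TERMS:
            counts[term] += 1
    return counts

# Made-up sample data, for illustration only
sample = ["mri knee", "ct chest", "knee mri mri", "pneumonia"]
mentions = modality_mentions(sample)
print(mentions)
```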
Most frequent queries and terms
Query modification
• 5713 consecutive query pairs sharing at least one term, assumed to be single session
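Finding consecutive query pairs that share at least one term, and labelling how the second query changes the first, can be sketched as below. The set-based definitions of specialization (terms only added), generalization (terms only removed) and modification (anything else with a shared term) are assumptions made for illustration; the slides do not spell out the exact criteria.

```python
from collections import Counter

def classify_pair(prev: str, curr: str):
    """Label the change from prev to curr, or return None if no term is shared."""
    a, b = set(prev.split()), set(curr.split())
    if not a & b:
        return None  # no shared term: not assumed to be the same session
    if a < b:
        return "specialization"  # terms were only added
    if b < a:
        return "generalization"  # terms were only removed
    # Terms were replaced, or only reordered (equal term sets)
    return "modification"

def modification_counts(queries):
    """Count change types over all consecutive pairs sharing a term."""
    labels = (classify_pair(p, c) for p, c in zip(queries, queries[1:]))
    return Counter(label for label in labels if label is not None)

# Made-up sample data, for illustration only
sample = ["knee", "mri knee", "ct knee", "chest", "chest xray"]
print(modification_counts(sample))
```

Without time stamps or IP addresses, term overlap between neighbouring queries is the only session signal available in this sketch, so unrelated users who happen to search for the same term in succession would be merged.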
Use of terms for topics in ImageCLEF
• ImageCLEF, image retrieval benchmark
• Using images and text as queries, 17 groups
participated in 2012
• Taking most frequent searches, at least two terms
• Radiologist ranked these search terms by usefulness in radiology
• Most useful terms were checked to find whether documents in PubMedCentral fulfill the need
• 30 most useful, most frequent, available results were used as queries
• Images were taken from teaching files
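Only the first step of the topic selection above, picking the most frequent searches with at least two terms, lends itself to automation; the ranking by a radiologist and the check against PubMedCentral were manual. A minimal sketch of that first step follows, with `topic_candidates` and its parameters being hypothetical names and the pool size left to the caller.

```python
from collections import Counter

def topic_candidates(queries, pool_size, min_terms=2):
    """Return the most frequent queries that have at least min_terms terms."""
    counts = Counter(q for q in queries if len(q.split()) >= min_terms)
    return [q for q, _ in counts.most_common(pool_size)]

# Made-up sample data, for illustration only
sample = ["mri knee", "knee", "mri knee", "ct chest",
          "ct chest", "ct chest", "pneumonia"]
candidates = topic_candidates(sample, pool_size=2)
print(candidates)
```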
Conclusions
• Analysis of log files can help understand user behavior
• Helps build better systems based on user models and analyze current approaches, including their shortcomings
• Time stamps and user identification are important for query session analysis
• We used implicit knowledge for this
• People do not know all details of systems
• Search for modalities in text and through filters
• Depending on results, users change terms (specialization, generalization, modification)
Questions?
• More information can be found at
• http://www.khresmoi.eu/
• http://medgift.hevs.ch/
• http://publications.hevs.ch/
• Contact: