Query by Image and Video Content: The QBIC System
M. Flickner et al.
IEEE Computer Special Issue on Content-Based Retrieval
Vol. 28, No. 9, September 1995
Presenter: William Conner
Outline
• Overview
• Motivation
• Design
• Indexing
• Representative frames
• Related Work
• Critique
• Demo
QBIC
• System that supports content-based image and video retrieval
  – Flexible query interface
  – Results ranked based on similarity
• Introduced into commercial products
  – IBM’s Ultimedia Manager
  – IBM’s DB2 Image Extenders
Motivation
• Many previous image and video retrieval approaches were limited
  – Only supported queries over meta-data rather than content
    • File identifiers
    • Keywords that are input manually
    • Other text associated with image (e.g., caption)
    • Yahoo.com and Google.com image search support queries by keyword, size, coloration, file type, and domain
Query Methods
• Example images
• Sketches and drawings
• User-selected color and texture patterns
• Camera and object motion
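The "example images" query method, combined with ranked results, can be sketched as follows. This is a minimal illustration only: the coarse color histogram, the L1 distance, and the names `color_histogram`, `histogram_distance`, and `rank_by_example` are assumptions made for this sketch, not QBIC's actual features or API.

```python
from collections import Counter

def color_histogram(pixels, levels=4):
    """Quantize each RGB channel into `levels` buckets; return a normalized histogram."""
    step = 256 // levels
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {bucket: n / total for bucket, n in counts.items()}

def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 = identical, 2 = disjoint)."""
    buckets = set(h1) | set(h2)
    return sum(abs(h1.get(b, 0.0) - h2.get(b, 0.0)) for b in buckets)

def rank_by_example(example_pixels, database):
    """Return (name, distance) pairs sorted from most to least similar."""
    query = color_histogram(example_pixels)
    scored = [(name, histogram_distance(query, color_histogram(px)))
              for name, px in database.items()]
    return sorted(scored, key=lambda item: item[1])

# Toy database (hypothetical): each "image" is a short list of RGB pixels.
database = {
    "sunset": [(250, 120, 30)] * 3 + [(40, 20, 60)],
    "forest": [(20, 140, 40)] * 4,
    "beach":  [(250, 120, 30)] * 2 + [(30, 90, 200)] * 2,
}
example = [(255, 110, 20)] * 4  # a mostly orange example image
ranking = rank_by_example(example, database)
for name, dist in ranking:
    print(f"{name}: {dist:.2f}")
```

The result is an ordering with distances, not a yes/no match, which is the difference between similarity ranking and exact-match keyword queries.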
System Architecture
[Figure: system architecture diagram. Labeled components: User, Query Interface, Feature Extraction, Image Objects, Video Objects, Database, Filter/Index, Matching Engine, Ranked Results]
R-Trees
• Region tree is a multidimensional index
  – Like a B-tree for multiple dimensions
  – R*-tree is a variant that re-inserts some entries upon overflow before resorting to splitting nodes
• Can be used to index low-dimensional features such as average color and texture
• High-dimensional features can be reduced to a lower number of dimensions
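As one illustration of reducing a high-dimensional feature, the sketch below collapses a 64-bin RGB histogram into a 3-D average-color vector small enough for an R-tree. This reduction was chosen for simplicity and is an assumption; the slides do not say which method QBIC uses, and statistical techniques such as principal component analysis are a common alternative. The function names are hypothetical.

```python
def bucket_center(index, levels=4):
    """Midpoint intensity (0-255 scale) of a quantization bucket."""
    step = 256 / levels
    return (index + 0.5) * step

def reduce_to_average_color(histogram, levels=4):
    """Collapse a normalized levels^3-bin RGB histogram into a 3-D mean-color vector."""
    avg = [0.0, 0.0, 0.0]
    for (r, g, b), weight in histogram.items():
        for channel, idx in enumerate((r, g, b)):
            avg[channel] += weight * bucket_center(idx, levels)
    return tuple(avg)

# A 64-bin histogram (4 levels per channel) with only two non-empty buckets.
histogram = {(3, 1, 0): 0.75, (0, 0, 0): 0.25}
feature = reduce_to_average_color(histogram)
print(feature)
```

The reduced vector loses detail, so a system built this way would typically use it as a cheap first-pass filter and then compare full features on the survivors.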
R-Trees
• 2-D example with only two levels (next slide)
  – Want query to find points P1 and P2
• Tree root is a bounding rectangle
• Child nodes are also bounding rectangles
  – Overlap is allowed at same tree level
  – All regions overlapping with query region must be searched
• Possible to have several levels and several dimensions
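The search rule above (descend into every child whose bounding rectangle overlaps the query region) can be sketched for a two-level, 2-D tree. The labels ROOT, A, B, C, P1, and P2 follow the example slide; all coordinates, the extra points P3 and P4, and the `Node` class are hypothetical and exist only to make the sketch runnable. Insertion and node splitting are omitted.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    rect: tuple                                    # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)   # child Nodes (internal node)
    points: dict = field(default_factory=dict)     # name -> (x, y) (leaf node)

def overlaps(a, b):
    """True if rectangles a and b share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def contains(rect, point):
    x, y = point
    return rect[0] <= x <= rect[2] and rect[1] <= y <= rect[3]

def search(node, query, visited):
    """Collect names of points inside `query`, visiting every overlapping child."""
    visited.append(node.name)
    hits = [name for name, p in node.points.items() if contains(query, p)]
    for child in node.children:
        if overlaps(child.rect, query):
            hits += search(child, query, visited)
    return hits

# Hypothetical layout: A and B overlap each other; C is far from the query.
a = Node("A", (0, 0, 5, 5), points={"P1": (4, 4)})
b = Node("B", (3, 3, 9, 8), points={"P2": (6, 5), "P3": (8.5, 7)})
c = Node("C", (10, 0, 14, 4), points={"P4": (12, 2)})
root = Node("ROOT", (0, 0, 14, 8), children=[a, b, c])

visited = []
found = search(root, (3.5, 3.5, 7, 6), visited)
print(found, visited)
```

Because the query rectangle overlaps both A and B, both subtrees are searched, while C is pruned without looking at its contents.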
R-Trees
[Figure: two-level, 2-D R-tree example. Labels: ROOT, bounding rectangles A, B, and C, and points P1 and P2]
R-Frames
• Representative frames
  – Allow image retrieval techniques to help with video retrieval
  – Video broken up into clips called shots
  – R-frame is representative of shot
  – Also, basic unit of video query result
• Useful for browsing
• Choice
  – Particular frame from shot
    • First, last, or middle
  – Synthesized by creating mosaic of all frames in a shot
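The "particular frame" option can be sketched as below. Shot boundaries are assumed to be given already (shot detection is not shown), frames are stood in for by their frame numbers, and `pick_r_frame` is a hypothetical helper. The mosaic option is not covered.

```python
def pick_r_frame(shot, policy="middle"):
    """Return one frame of `shot` (a non-empty sequence) to represent it."""
    if not shot:
        raise ValueError("a shot must contain at least one frame")
    if policy == "first":
        return shot[0]
    if policy == "last":
        return shot[-1]
    if policy == "middle":
        return shot[len(shot) // 2]
    raise ValueError(f"unknown policy: {policy}")

# Three hypothetical shots, given as ranges of frame numbers.
shots = [range(0, 90), range(90, 91), range(91, 300)]
r_frames = [pick_r_frame(shot) for shot in shots]
print(r_frames)
```

Each selected frame can then be indexed and queried like any still image, with the shot it came from returned as the video result.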
Related Work
• MIT Photobook
  – Content-based image retrieval system
  – Library of matching algorithms
    • e.g., Euclidean distance, histograms, wavelet tree distances
  – Interactive learning agent to help determine user’s intent
• IBM’s Garlic Project
  – Managing large-scale multimedia systems
  – Fagin’s algorithms for merging ranked query results
    • i.e., Top-k query processing over several multimedia subsystems
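Top-k merging over several ranked lists can be sketched with the basic form of Fagin's Algorithm (FA). The sketch assumes every object appears in every subsystem's list, that the lists are sorted by grade in descending order, and that the aggregation function is monotone (`min` here). The subsystem names, grades, and the function name `fagin_top_k` are hypothetical.

```python
def fagin_top_k(ranked_lists, k, aggregate=min):
    """Fagin's Algorithm (FA) sketch for a monotone `aggregate`.

    ranked_lists: one list per subsystem of (object_id, grade) pairs,
    sorted by grade descending; every object appears in every list.
    """
    grades = [dict(lst) for lst in ranked_lists]   # stands in for random access
    seen = [set() for _ in ranked_lists]
    depth = 0
    # Phase 1: sorted access in parallel until k objects were seen in every list.
    while depth < len(ranked_lists[0]):
        for i, lst in enumerate(ranked_lists):
            seen[i].add(lst[depth][0])
        depth += 1
        if len(set.intersection(*seen)) >= k:
            break
    # Phase 2: random access to get all grades of every object seen so far.
    candidates = set.union(*seen)
    scored = [(obj, aggregate(g[obj] for g in grades)) for obj in candidates]
    # Phase 3: keep the k best by aggregated grade (ties broken by object id).
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]

# Two hypothetical subsystems grading the same four images.
color = [("img1", 0.9), ("img2", 0.8), ("img3", 0.5), ("img4", 0.1)]
texture = [("img3", 0.95), ("img1", 0.7), ("img4", 0.6), ("img2", 0.2)]
top = fagin_top_k([color, texture], k=2)
print(top)
```

With `k=1` the loop stops after reading only two entries from each list, which is the point of the technique: the best answers are found without scanning every subsystem's full ranking.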
Photobook
• Query: find images most similar to image in the upper left
Critique
• Pros
  – Flexible query interface for content-based retrieval
  – Reuses image retrieval techniques for video retrieval
  – Actually used in commercial products
• Cons
  – Not enough details
    • e.g., More elaboration on how query plans are developed considering fast filtering and indexing
  – No performance evaluation
    • Should include measurements of accuracy and delay
Demo
• Russian museum’s online digital collection uses QBIC engine
  – Supports color and layout search
  – The State Hermitage Museum