Multimedia
Content-based Multimedia Retrieval
Course Code 005636 (Fall 2017)
Prof. S. M. Riazul Islam, Dept. of Computer Engineering, Sejong University, Korea
E-mail: [email protected]
Contents
Overview of Content-based multimedia retrieval
Concepts of Content-based image retrieval
Audio Retrieval
Document Image Analysis and Retrieval
System Architecture
Content-based Image Retrieval (CBIR)
Searching for digital images in large databases
What kinds of databases?
What kinds of queries?
What constitutes a match?
How do we make such searches efficient?
Deep blue sky
Orange sunset
CBIR Applications
Art Collections
e.g. Fine Arts Museum of San Francisco
Medical Image Databases
CT, MRI, Ultrasound, The Visible Human
Scientific Databases
Earth Sciences
General Image Collections for Licensing
The World Wide Web
What is a Query?
An image you already have
A rough sketch you draw
A symbolic description of what you want
CBIR System
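[Diagram: in offline processing, image features are extracted from the images in the image database and placed in a feature space. At query time, an image feature is extracted from the user's query image, a distance measure compares it with the stored image features, and the retrieved images are returned to the user.]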
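The query path of the CBIR system can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the feature vectors, the file names, and the choice of Euclidean distance are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query_feature, database, k=3, distance=euclidean):
    """Return the names of the k database images closest to the query feature."""
    ranked = sorted(database.items(),
                    key=lambda item: distance(query_feature, item[1]))
    return [name for name, _ in ranked[:k]]

# Hypothetical feature vectors, assumed to have been computed offline.
database = {
    "sky.jpg": [0.1, 0.2, 0.9],
    "sunset.jpg": [0.9, 0.5, 0.1],
    "forest.jpg": [0.2, 0.8, 0.2],
}

# Feature vector of a hypothetical query image.
results = retrieve([0.85, 0.45, 0.15], database, k=2)
print(results)
```

A linear scan like this is fine for small collections; large databases usually need an index over the feature space to make the search efficient.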
Features
Color (histograms, gridded layout, wavelets)
Texture (Laws, Gabor filters, LBP, polarity)
A texture is an entity consisting of mutually related pixels and groups of pixels
Shape (What preprocessing must occur to get shape?)
Objects and their Relationships
This is the most powerful, but you have to be able to recognize the objects!
Artificial texture Natural texture
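As an illustration of the color feature, the following sketch builds a coarse RGB histogram and compares two histograms by histogram intersection. The bin count, the pixel lists, and the use of intersection as the similarity measure are assumptions chosen for the example; the slides name color histograms but do not prescribe these details.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Quantize (r, g, b) pixels with values 0..255 into a normalized joint histogram."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = len(pixels)
    return [count / total for count in hist] if total else hist

def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))

# Hypothetical tiny "images" given as flat lists of RGB pixels.
blue_sky = [(30, 60, 200)] * 4
sunset = [(240, 120, 30)] * 4
mixed = [(30, 60, 200)] * 2 + [(240, 120, 30)] * 2

sky_hist = color_histogram(blue_sky)
print(histogram_intersection(sky_hist, color_histogram(sunset)))
print(histogram_intersection(sky_hist, color_histogram(mixed)))
```

A global histogram ignores where colors occur in the image, which is one reason a gridded layout is listed alongside it.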
Research Objective
[Diagram: images from the Image Database pass through object-oriented Feature Extraction and are assigned to Categories; a Query Image (e.g. boat) yields Retrieved Images for the User.]
Categories
…
Animals
Buildings
• Office Buildings
• Houses
Transportation
• Boats
• Vehicles
…
A Taxonomy of Audio
Sound
• Music: Classical, Country, Disco, Hip Hop, Jazz, Rock
• Classical: Orchestra, String Quartet, Choir, Piano
• Speech: Sports Announcer, Female, Male
• Other?: ?
Speech Recognition Knowledge Sources
Acoustic Modeling
Describes the sounds that make up speech
Lexicon
Describes which sequences of speech sounds make up valid words
Language Model
Describes the likelihood of various sequences of words being spoken
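To make the role of the language model concrete, here is a minimal bigram sketch that assigns a likelihood to a word sequence. The toy corpus, the add-one smoothing, and the function names are assumptions for illustration; real recognizers use far larger corpora and more careful smoothing.

```python
from collections import Counter

def train_bigram_model(sentences):
    """Count context words and bigrams, with <s> and </s> marking sentence boundaries."""
    contexts = Counter()
    bigrams = Counter()
    vocabulary = set()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        contexts.update(words[:-1])
        bigrams.update(zip(words[:-1], words[1:]))
        vocabulary.update(words[1:])  # every token that can be predicted
    return contexts, bigrams, len(vocabulary)

def sequence_probability(sentence, contexts, bigrams, vocab_size):
    """Probability of a word sequence under the bigram model with add-one smoothing."""
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    probability = 1.0
    for prev, word in zip(words[:-1], words[1:]):
        probability *= (bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size)
    return probability

# Hypothetical toy corpus.
corpus = ["recognize speech", "recognize speech today", "wreck a nice beach"]
contexts, bigrams, vocab_size = train_bigram_model(corpus)

likely = sequence_probability("recognize speech", contexts, bigrams, vocab_size)
unlikely = sequence_probability("speech recognize", contexts, bigrams, vocab_size)
print(likely > unlikely)
```

The decoder can use such scores to prefer word sequences that are plausible in the language over acoustically similar but unlikely ones.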
Speech Recognition Process
[Diagram: Speech → Signal Processing → Phonetic Probability Estimator (Acoustic Model) → Decoder (Language Model) → Words. The Pronunciation Lexicon and the Grammar feed the Decoder.]
Document Image Analysis
Recognize text (OCR)
convert page images to Unicode
machine-printed, handwritten
Analyze page layout geometry
a 2-D problem (unlike speech, text)
good ‘language-free’ algorithms
Capture logical structure
output marked-up text (XML, etc.)
exploit non-textual clues
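The step from layout geometry to marked-up output can be sketched as follows. This is only an illustration under stated assumptions: the block dictionaries, the role names used as XML tags, and the top-to-bottom reading order are hypothetical and stand in for whatever a real layout analyzer would produce.

```python
import xml.etree.ElementTree as ET

def blocks_to_xml(blocks):
    """Serialize recognized text blocks as XML, ordered top-to-bottom, then left-to-right."""
    page = ET.Element("page")
    for block in sorted(blocks, key=lambda b: (b["y"], b["x"])):
        # The logical role (assumed to be known already) becomes the tag name.
        element = ET.SubElement(page, block["role"],
                                x=str(block["x"]), y=str(block["y"]))
        element.text = block["text"]
    return ET.tostring(page, encoding="unicode")

# Hypothetical OCR output: text plus the position of each block on the page.
blocks = [
    {"role": "paragraph", "x": 10, "y": 80, "text": "Body text."},
    {"role": "title", "x": 10, "y": 20, "text": "Overview"},
]

markup = blocks_to_xml(blocks)
print(markup)
```

Sorting by position is the simplest possible reading order; multi-column pages need a real layout analysis before this step.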
Video/Image OCR Block Diagram
[Diagram: Video or Image → Text Area Detection → Text Area Preprocessing → Commercial OCR → UTF8 Text]
Text Detection
System Architecture
• Combine video, audio and text retrieval scores
[Diagram: Query → Retrieval Agents (Text, Image, Audio) → Text Score, Image Score, Audio Score → Final Score]
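One simple way to combine the per-modality scores into a final score is a weighted sum, sketched below. The slides do not say how the scores are combined, so the weighted sum, the weights, and the item names here are assumptions for illustration only.

```python
def fuse_scores(agent_scores, weights):
    """Combine per-agent scores into one final score per item using a weighted sum.

    agent_scores maps an agent name to {item: score}; an item that an agent
    did not score contributes 0 for that agent.
    """
    final = {}
    for agent, scores in agent_scores.items():
        weight = weights.get(agent, 0.0)
        for item, score in scores.items():
            final[item] = final.get(item, 0.0) + weight * score
    # Highest final score first.
    return sorted(final.items(), key=lambda pair: pair[1], reverse=True)

# Hypothetical scores returned by the three retrieval agents for one query.
agent_scores = {
    "text": {"clip1": 0.9, "clip2": 0.2},
    "image": {"clip1": 0.4, "clip2": 0.8},
    "audio": {"clip1": 0.5, "clip3": 0.6},
}
weights = {"text": 0.5, "image": 0.3, "audio": 0.2}  # assumed weights

ranking = fuse_scores(agent_scores, weights)
print([item for item, _ in ranking])
```

This assumes the agents' scores are on comparable scales; in practice they usually need to be normalized before they are combined.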
Q&A