Multi-Media Retrieval
by Paul McGlade
Modified by Shinta P.
What is Multi-Media Retrieval?
The searching and retrieval of various kinds of multimedia (image, video, web).
Typically consists of a query search against a database, usually called either a digital library or a digital archive.
Generally, multimedia databases also contain textual data types.
Tasks
Multimedia systems must solve at least two different tasks:
First, relevant items have to be identified.
Second, they have to be presented in such a way that the user can relate them to each other and, what is often more complicated, to the query.
Problems
Comparing multimedia data is more difficult than comparing textual data.
Different types of querying raise different types of problems.
The relevance of each aspect of the multimedia data must be weighted.
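One way to read the weighting point above is as a weighted average of per-aspect similarity scores. The sketch below is only an illustration under assumptions: the aspect names (color, texture, shape), the scores, and the weights are all invented, and each aspect is assumed to already produce a similarity between 0 and 1.

```python
def weighted_similarity(scores, weights):
    """Combine per-aspect similarity scores (each in [0, 1]) into one score."""
    total_weight = sum(weights[aspect] for aspect in scores)
    if total_weight == 0:
        raise ValueError("at least one aspect must have a non-zero weight")
    return sum(scores[a] * weights[a] for a in scores) / total_weight


# Hypothetical per-aspect scores for one candidate image against a query.
scores = {"color": 0.9, "texture": 0.4, "shape": 0.2}
# Hypothetical weights: this query cares mostly about shape.
weights = {"color": 1.0, "texture": 1.0, "shape": 3.0}

combined = weighted_similarity(scores, weights)
print(round(combined, 2))
```

With equal weights the same candidate would score higher, because its strong color match would count as much as its weak shape match. Choosing the weights is the hard part and is not addressed here.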
Approaches / Solutions
Different approaches are explored for the comparison process:
  Text-based
  Region-based
  Object-based
Various solutions have been created:
  Query formulation
  MRML
  Image indexing
Text-based
Indexes images using keywords or descriptions.
Advantages:
  Easier to design and implement.
  Can use the surrounding text in a web page.
Disadvantages:
  Annotation is often too expensive.
  A picture can sometimes require many words to describe.
  The surrounding text may not describe the picture.
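Keyword indexing of this kind is commonly implemented as an inverted index. The following is a minimal sketch, assuming the keywords have already been assigned to each image; the file names, keywords, and the function names `build_index` and `search` are invented for illustration.

```python
from collections import defaultdict


def build_index(annotations):
    """Map each keyword to the set of image ids annotated with it."""
    index = defaultdict(set)
    for image_id, keywords in annotations.items():
        for keyword in keywords:
            index[keyword.lower()].add(image_id)
    return index


def search(index, query):
    """Return ids of images annotated with every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical annotations; file names and keywords are invented.
annotations = {
    "img1.jpg": ["dog", "park", "person"],
    "img2.jpg": ["bike", "street"],
    "img3.jpg": ["dog", "bike"],
}
index = build_index(annotations)
matches = search(index, "dog bike")
print(sorted(matches))
```

The lookup itself is cheap; the cost noted in the disadvantages lies in producing the annotations in the first place.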
Region-based
Queries images using regions of the image.
Advantages:
  Handles low-level queries.
  Many features can be extracted.
Disadvantages:
  Cannot handle high-level queries.
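To make "regions" and "low-level features" concrete, here is a rough sketch that stands in a fixed grid for a real segmentation and uses mean intensity as the only feature. Both choices are simplifying assumptions, as are the tiny images and the function names; the image is assumed to be at least as large as the grid.

```python
def region_means(image, rows, cols):
    """Split a grayscale image (list of equal-length rows) into a rows x cols
    grid and return the mean intensity of each cell, row by row."""
    height, width = len(image), len(image[0])
    means = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cell = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(cell) / len(cell))
    return means


def region_distance(a, b):
    """Mean absolute difference between two lists of region features."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Tiny invented 4x4 grayscale images (0 = black, 255 = white).
query = [[0, 0, 255, 255]] * 4
same_layout = [[10, 10, 245, 245]] * 4
flipped = [[255, 255, 0, 0]] * 4

q = region_means(query, 2, 2)
print(region_distance(q, region_means(same_layout, 2, 2)))
print(region_distance(q, region_means(flipped, 2, 2)))
```

The comparison knows only that the left side is dark and the right side is bright. It has no notion of what the image depicts, which is the high-level limitation listed above.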
Region-based (cont’d)
[Figure: example region-based results, labeled "Good" and "Bad"]
Object-based
Extracts objects from images first.
Advantages:
  Handles object-based queries.
  Reduces feature storage adaptively.
Disadvantages:
  Object segmentation is very difficult.
  The user interface is complicated and not easily implemented.
Object-based (cont’d)
Blobworld
Blobworld is a system for content-based image retrieval.
By automatically segmenting each image into regions that roughly correspond to objects or parts of objects, Blobworld allows users to query for photographs based on the objects they contain.
Blobworld Site
Query Formulation
Formulates a query for comparison against a database.
Query formula example:
  SIMILARITY: look similar
  OBJECT: contains a bike
  OBJECT RELATIONSHIP: contains a dog near a person
  MOOD: a happy picture
  TIME/PLACE: Yosemite sunset
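The query categories above could be held in a small structured record before being matched against a database. This is a sketch only: the class name `ImageQuery`, its field names, and the tuple layout for relationships are assumptions made for illustration, not part of any system named in these slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ImageQuery:
    similar_to: Optional[str] = None                          # SIMILARITY: id of an example image
    objects: List[str] = field(default_factory=list)          # OBJECT
    relationships: List[tuple] = field(default_factory=list)  # OBJECT RELATIONSHIP: (subject, relation, object)
    mood: Optional[str] = None                                # MOOD
    time_place: Optional[str] = None                          # TIME/PLACE

    def criteria(self):
        """Return only the criteria the user actually filled in."""
        return {name: value for name, value in vars(self).items() if value}


query = ImageQuery(
    objects=["bike"],
    relationships=[("dog", "near", "person")],
    time_place="Yosemite sunset",
)
print(query.criteria())
```

Keeping empty criteria out of the result lets a matcher evaluate only the aspects the user asked about.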
MRML
Multimedia Retrieval Markup Language
MRML’s goal is to unify access to multimedia retrieval.
XML-based communication protocol.
Specified to standardize access to multimedia retrieval software components.
MRML (cont’d)
Code example:

    <property id="p1" type="subset" caption="Weighting function"
              visibility="visible" sendtype="attribute"
              sendname="cui-weighting-function"
              minsubsetsize="1" maxsubsetsize="1">
      <property id="p2" type="setelement" caption="Best fully weighted"
                visibility="visible" sendtype="value"
                sendvalue="best-fully" defaultstate="selected" />
      <property id="p3" type="setelement" caption="Classical IDF"
                visibility="visible" sendtype="value"
                sendvalue="classical-idf" defaultstate="unselected" />
    </property>

    <mrml>
      <get-server-properties />
    </mrml>

    <mrml>
      <get-algorithms collection-id="collection-1" />
    </mrml>
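Because MRML is plain XML, the two request messages above can be generated with any XML library. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `mrml_message` is invented, and a real exchange with a server may require attributes beyond the ones shown on this slide.

```python
import xml.etree.ElementTree as ET


def mrml_message(child_tag, **attributes):
    """Wrap a single request element in an <mrml> envelope and return it as a string."""
    root = ET.Element("mrml")
    # Python keyword names use underscores; the attribute names above use hyphens.
    attrs = {name.replace("_", "-"): value for name, value in attributes.items()}
    ET.SubElement(root, child_tag, attrs)
    return ET.tostring(root, encoding="unicode")


server_props = mrml_message("get-server-properties")
algorithms = mrml_message("get-algorithms", collection_id="collection-1")
print(server_props)
print(algorithms)
```

Building the message through an XML library rather than string concatenation keeps attribute values correctly escaped.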
GIFT
The GNU Image-Finding Tool is a Content-Based Image Retrieval System (CBIRS).
Uses MRML.
Enables the user to query by example on images.
Relies purely on the content of the image.
GIFT Site
Image Indexing
A process that analyzes an image and selects aspects of the image to compare, in order to index the image with little user input.
Segments the image into various regions and attaches words to each region.
Image Indexing (cont’d)
Image 1, computer predictions: male, cloth, female, fashion, environment, people, industry, fire, face, man, man-made
Image 1, manual category annotation: super model, people, female, cloth
Image 2, computer predictions: grass, mare, tiger, horses, cat, buildings
Image 2, manual category annotation: cat, grass, tiger
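A simple way to compare computer predictions with a manual annotation is to measure their overlap as precision and recall. The sketch below applies that to the second prediction/annotation pair above; treating each word as one keyword is an assumption about how the lists were meant to be read, and the function name is invented.

```python
def precision_recall(predicted, manual):
    """Precision: share of predicted words that are in the manual annotation.
    Recall: share of manually annotated words that were predicted."""
    predicted, manual = set(predicted), set(manual)
    hits = predicted & manual
    precision = len(hits) / len(predicted) if predicted else 0.0
    recall = len(hits) / len(manual) if manual else 0.0
    return precision, recall


predicted = ["grass", "mare", "tiger", "horses", "cat", "buildings"]
manual = ["cat", "grass", "tiger"]

precision, recall = precision_recall(predicted, manual)
print(precision, recall)
```

For this pair every manual keyword was predicted (recall 1.0), but half of the predictions were not in the manual annotation (precision 0.5).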
A-Lip
The Automatic Linguistic Indexing of Pictures system selects among 600 trained concepts to annotate images automatically.
An on-line, real-time image annotation demonstration is expected to be developed and made available later this year.
When released, it will let users submit their own images for automatic annotation.
A-Lip Site
High-Level Tools
Some technical approaches to image comparison:
  Wavelet comparisons.
  Fast image segmentation.
  IRM (Integrated Region Matching).
  Fuzzy matching.
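As a rough illustration of the wavelet idea only, the sketch below applies an unnormalized Haar transform to a single row of pixel intensities and compares the coarse coefficients. The Haar wavelet, the one-dimensional input, the sample rows, and the function names are all simplifying assumptions, not necessarily what any system in these slides uses.

```python
def haar_step(signal):
    """One level of an unnormalized Haar transform:
    pairwise averages (coarse part) and half-differences (detail part)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_signature(signal, levels):
    """Apply haar_step repeatedly, keeping only the coarse averages."""
    coarse = list(signal)
    for _ in range(levels):
        coarse, _details = haar_step(coarse)
    return coarse


def signature_distance(a, b):
    """Sum of absolute differences between two signatures."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Invented rows of pixel intensities.
row_a = [10, 12, 200, 202, 11, 13, 199, 201]
row_b = [11, 11, 201, 201, 12, 12, 200, 200]   # same coarse structure as row_a
row_c = [200, 202, 10, 12, 199, 201, 11, 13]   # bright and dark areas swapped

sig_a = haar_signature(row_a, 1)
print(signature_distance(sig_a, haar_signature(row_b, 1)))
print(signature_distance(sig_a, haar_signature(row_c, 1)))
```

Comparing only the coarse coefficients ignores small pixel-level differences, so row_a and row_b match while row_c does not.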
SIMPLIcity
Semantics-sensitive Integrated Matching for Picture Libraries.
Combines low-level statistical semantic classification with image retrieval.
Wavelet-based feature extraction for fast segmentation.
Integrated Region Matching (IRM).
SIMPLIcity Site
Why Is Image Retrieval Difficult?
Text retrieval:
  A word is a unit and is easy to index.
  Words carry semantic meaning.
Image retrieval:
  The unit is a pixel, which is hard to index.
  A single pixel carries no meaning.
  Pixels form patterns that represent objects, which makes segmentation difficult.
  How an object appears in an image depends on many factors.
Why Is Image Retrieval Difficult? (cont’d)
Image retrieval:
  How an object appears in an image depends on many factors:
    Viewpoint
    Illumination
    Shadows
    Other complications (background, color variation, etc.)
Image Matching (Global Similarity)
Color histogram
Texture characteristics (region)
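Global color-histogram matching can be sketched with histogram intersection, one common similarity measure among several. The sketch assumes images are given as lists of (R, G, B) tuples with values from 0 to 255; the pixel data, the bin count, and the function names are invented for the example.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized RGB histogram; each channel is quantized into bins_per_channel bins."""
    if not pixels:
        raise ValueError("image has no pixels")
    counts = [0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin number 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        counts[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    return [count / len(pixels) for count in counts]


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; 1.0 means identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Invented four-pixel "images".
sunset = [(250, 120, 30)] * 3 + [(20, 20, 60)]
sunset_like = [(240, 110, 20)] * 2 + [(30, 30, 50)] * 2
forest = [(20, 120, 30)] * 4

h_sunset = color_histogram(sunset)
print(histogram_intersection(h_sunset, color_histogram(sunset_like)))
print(histogram_intersection(h_sunset, color_histogram(forest)))
```

The histogram discards where the colors are in the image, which is why it counts as a global measure: two images with the same colors in different arrangements score as identical.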
Image Matching (Local Similarity)
Query by example
Object segmentation
Matching
Caption text
Similarity (color, texture, shape)
Spatial arrangement (orientation, position)
Special techniques (e.g., face recognition)
Conclusion
Since one query will return many false results, I believe more emphasis should be placed on the weighting of certain aspects of each image.
Some ideas:
  Artistic tendencies could be taken into account when determining the relevance of an object in an image.
  A textual comparison of an image’s indexed words could help in determining how commonly certain objects are found together.
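The second idea could start from simple co-occurrence counts over the indexed words of each image. The following is a minimal sketch; the word lists are invented for illustration and do not come from any real collection, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(indexed_images):
    """Count how many images contain each unordered pair of index words."""
    counts = Counter()
    for words in indexed_images:
        # sorted(set(...)) makes each pair unique and order-independent.
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return counts


# Hypothetical index words for three images.
indexed_images = [
    ["cat", "grass", "tiger"],
    ["grass", "horses", "buildings"],
    ["cat", "grass"],
]

counts = cooccurrence_counts(indexed_images)
print(counts.most_common(1))
```

Pairs that appear together often could then raise the weight of a match, while pairs that never co-occur could flag a likely false result.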