Content Based Image Retrieval
Problems, issues, future directions

Problem?
EFFECTS & PROCESSING
CHEAP & DENSE STORAGE
MEDIA FLOODING
EXAMPLE: GENERAL PHOTOGRAPHY
SNAPSHOT PREVIEWS
EASY SHARING VIA INTERNET
REUSABLE MEMORY
PRINTER TECHNOLOGY
RESULT: DIGITAL MEDIA FLOOD. HOW DO WE COPE, TRACK, ORGANIZE IT ALL?
DEVICE FUNCTION CONVERGENCE
DATA RAPIDLY GENERATED BY MANY DEVICES
INTERNET ACTS AS GLOBAL TRANSPORT
DATA CONSUMED BY DEVICES ON DEMAND
MULTIMEDIA DATA NEEDS TO BE EFFICIENTLY STORED, ACCURATELY INDEXED, EASILY RETRIEVED
MOTIVATION
History of Image Retrieval
Traditional text-based image search engines:
Manual annotation of images
Use text-based retrieval methods
E.g. "Water lilies", "Flowers in a pond", <its biological name>
Text Based Image Retrieval
by Google, by Yahoo, etc.
AN IMPORTANT QUESTION ARISES:
"WHY NOT SIMPLY INDEX USING TEXT?"
(YAHOO! HAS HAD SOME SUCCESS WITH THIS)
INTUITIVE, YET USING TEXT IS:
SIMPLE BUT SIMPLISTIC
TIME CONSUMING – CAN'T AUTOMATE
HIGHLY SUBJECTIVE & USER-DEPENDENT
SUSCEPTIBLE TO TRANSLATION PROBLEMS
Content based image retrieval
What is content in images? Colour, texture, shape, etc.
CBIR
Definition: "The process of retrieving images from a collection on the basis of features (such as colour, texture and shape) automatically extracted from the images themselves"
FOR A GIVEN QUERY (EXAMPLE IMAGE, ROUGH SKETCH, EXPLICIT DESCRIPTION CRITERIA)...
...RETURN ALL 'SIMILAR' IMAGES
CBIR: SIMPLE EXAMPLE
QUERY IMAGE → RETRIEVAL SYSTEM → RETRIEVAL RESULTS BASED ON COLOR CONTENT
CBIR QUERY TYPES
SKETCH
EXAMPLE
COLOR
SHAPE
TEXTURE
MORE COMPLEX TYPES EXIST, YET THE ABOVE ARE THE MOST FUNDAMENTAL & MOST REGULARLY USED
ON WHAT BASIS ARE THEY SIMILAR? COLOR CONTENT? SHAPE CONTENT? HIGH-LEVEL IDEAS ('MASKS', 'GENDER')?
PERCEPTION IS ALWAYS AN ISSUE
CONSIDER THREE IMAGES
CBIR (DIS)SIMILARITY? SIMILARITY IS NOT SO SIMPLE
However...
Finding the right image is not always easy.
There are many millions of digital images available on the Web.
Many more images are waiting to be digitised.
Variable levels of metadata and associated content – if any exist at all...
Diverse groups of users.
Users who don't always know either what they want or how to express it in words.
Why Research Image Data?
Increasing use of digital images, particularly via the Internet, by professionals of all kinds.
Inadequacy of current technology in handling images.
Increasing interest in how people perceive, search for and use images.
Exciting new applications opening up (e-commerce?).
Why Is Image Retrieval Difficult?
It is important to distinguish between the physical properties of an image and how it is perceived.
Image retrieval should be based on the latter, not the former!
Two Classes of CBIR: Narrow vs. Broad Domain
Narrow:
Medical Imagery Retrieval
Fingerprint Retrieval
Satellite Imagery Retrieval
Broad:
Photo Collections
Internet
Block diagram of CBIR
[Block diagram] A client-side query interface (query by color, color sensation, shape, spatial relation, example images, or user drawing, with adjustable feature weights and a learning mechanism) connects via the Internet, an intranet, or an extranet to a server. The server performs feature extraction (color, color sensation, shape, spatial relation), similarity measurement over the same features, and indexing & filtering against the image database, returning results for each image query.
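The flow in the block diagram above can be sketched minimally in Python: extract a feature vector per image, measure feature distance, and rank. The helper names and the toy "images" (2-D lists of color indices) are illustrative stand-ins, not part of any real system:

```python
def extract_color_histogram(image, n_colors=4):
    """Feature extraction: normalized count of each color index."""
    counts = [0] * n_colors
    total = 0
    for row in image:
        for c in row:
            counts[c] += 1
            total += 1
    return [c / total for c in counts]

def l1_distance(f1, f2):
    """Similarity measure: smaller distance means more similar."""
    return sum(abs(a - b) for a, b in zip(f1, f2))

def retrieve(query, database, top_k=2):
    """Rank database images by feature distance to the query."""
    qf = extract_color_histogram(query)
    ranked = sorted(database.items(),
                    key=lambda kv: l1_distance(qf, extract_color_histogram(kv[1])))
    return [name for name, _ in ranked[:top_k]]

db = {
    "red_field": [[0, 0], [0, 0]],
    "blue_sky":  [[1, 1], [1, 1]],
    "mixed":     [[0, 1], [2, 3]],
}
print(retrieve([[0, 0], [0, 1]], db))  # ['red_field', 'mixed']
```

A mostly-red query image ranks the red-dominant database images first, which is exactly the "retrieval results based on color content" step of the diagram.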
Query specification
Interfaces
Browsing and navigation
Specifying the conditions the objects of interest must satisfy, by means of queries
Queries can be specified in two different ways:
Using a specific query language
Query by example, using actual data (an object example)
Conditions on multimedia data
Query predicates
Attribute predicates
Concern the attributes for which an exact value is supplied for each object
Exact-match retrieval
Structural predicates
Concern the structure of multimedia objects
Can be answered using metadata and information about the database schema
"Find all multimedia objects containing at least one image and a video clip"
Conditions on multimedia data
Semantic predicates
Concern the semantic content of the required data, depending on the features that have been extracted and stored for each multimedia object
"Find all the red houses"
Exact match cannot be applied
Uncertainty, proximity, and weights in query expressions
Specify the degree of relevance of the retrieved objects
Using imprecise terms and predicates
Represent a set of possible acceptable values with respect to which the attribute or the features have to be matched
Normal, unacceptable, typical
Particular proximity predicates
The relationship represented is based on the computation of a semantic distance between the query object and the stored ones
Nearest object search
Uncertainty, proximity, and weights in query expressions
Assign each condition or term a given weight
Specify the degree of precision by which a condition must be verified by an object
"Find all the objects containing an image representing a screen (HIGH) and a keyboard (LOW)"
The corresponding query is executed by assigning importance and preference values to each predicate and term
Issues related to feature extraction
What are features of images?
Colour: chromatic histograms, dominant colours, moments, ...
Shape
Texture
Is the colour system device independent?
Is the colour system perceptually uniform?
Is the colour system linear?
Is the colour system intuitive?
Is the colour system robust against varying imaging conditions?
Invariant to change in viewing direction
Invariant to change in object geometry
Invariant to change in direction of illumination
Invariant to change in intensity of illumination
Invariant to change in SPD of illumination
For the purpose of Colour Based Image Retrieval
For the purpose of color based image retrieval, color systems are judged according to the following criteria:
• Is the color system device independent?
• Perceptually uniform?
• Linear?
• Intuitive?
• Robust against varying imaging conditions:
invariant to a change in viewing direction
invariant to a change in object geometry
invariant to a change in the intensity of the illumination
Change in illuminant
Color Histogram
The histogram of image I is defined as:

    H_{C_i}(I) = the number of pixels of color C_i in image I

Or, normalized: for any pixel in image I, H_{C_i}(I) represents the probability that the pixel has color C_i.
Most commercial CBIR systems include the color histogram as one of the features (e.g., IBM's QBIC).
No spatial information.
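The "no spatial information" point can be made concrete with a small Python check: two images with very different spatial layouts can have identical color histograms. The images and helper below are illustrative:

```python
def histogram(image, n_colors=2):
    """Count pixels of each color index; no spatial information survives."""
    counts = [0] * n_colors
    for row in image:
        for c in row:
            counts[c] += 1
    return counts

# Same pixel counts, completely different layouts.
stripes = [[0, 0, 1, 1],
           [0, 0, 1, 1]]
checker = [[0, 1, 0, 1],
           [1, 0, 1, 0]]
print(histogram(stripes), histogram(checker))  # [4, 4] [4, 4]
```

A histogram-only retrieval system would consider these two images identical, which motivates the spatially aware refinements cited next.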
Improvement of color histogram
Several techniques have been proposed to integrate spatial information with color histograms:
W. Hsu, et al., An integrated color-spatial approach to content-based image retrieval. 3rd ACM Multimedia Conf., Nov 1995.
Smith and Chang, Tools and techniques for color image retrieval, SPIE Proc. 2670, 1996.
Stricker and Dimai, Color indexing with weak spatial constraints, SPIE Proc. 2670, 1996.
Gong, et al., Image indexing and retrieval based on human perceptual color clustering, Proc. 17th IEEE Conf. on Computer Vision and Pattern Recognition, 1998.
Pass and Zabih, Histogram refinement for content-based image retrieval. IEEE Workshop on Applications of Computer Vision, 1996.
Park, et al., Models and algorithms for efficient color image indexing. Proc. of IEEE Workshop on Content-Based Access of Image and Video Libraries, 1997.
Color auto-correlogram
Pick any pixel p1 of color C_i in the image I; at distance k away from p1, pick another pixel p2. What is the probability that p2 is also of color C_i?
[Figure: image I with two pixels p1 and p2 at distance k; is p2 also red?]
Color auto-correlogram
The auto-correlogram of image I for color C_i and distance k:

    α^{(k)}_{C_i}(I) = Pr_{p1 ∈ I_{C_i}, p2 ∈ I} [ p2 ∈ I_{C_i} | |p1 − p2| = k ]

It integrates both color information and spatial information.
Color auto-correlogram: Implementations
Pixel distance measure: use the D8 distance (also called chessboard distance):

    D8(p, q) = max(|p_x − q_x|, |p_y − q_y|)

Choose distances k = 1, 3, 5, 7.
Computational complexity:
Histogram: O(n²)
Correlogram: O(n²) with a substantially larger constant factor (the four distances and their D8 neighbourhoods)
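The D8 distance and the auto-correlogram definition above can be sketched as follows. This is a deliberately naive implementation for clarity (it compares every pixel pair rather than walking D8 rings); the function and variable names are illustrative:

```python
def d8(p, q):
    """Chessboard (D8) distance between pixel coordinates (y, x)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def auto_correlogram(image, n_colors, distances=(1, 3, 5, 7)):
    """For each color c and distance k: probability that a pixel at
    D8-distance exactly k from a pixel of color c also has color c."""
    h, w = len(image), len(image[0])
    pixels = [(y, x) for y in range(h) for x in range(w)]
    result = {}
    for c in range(n_colors):
        of_color = [p for p in pixels if image[p[0]][p[1]] == c]
        for k in distances:
            hits = total = 0
            for p1 in of_color:
                for p2 in pixels:
                    if d8(p1, p2) == k:
                        total += 1
                        if image[p2[0]][p2[1]] == c:
                            hits += 1
            result[(c, k)] = hits / total if total else 0.0
    return result

img = [[0, 0, 1],
       [0, 1, 1],
       [1, 1, 1]]
corr = auto_correlogram(img, n_colors=2, distances=(1, 2))
print(corr[(1, 1)])  # ≈ 0.74: a D8-neighbour of a '1' pixel is usually '1'
```

Because the '1' pixels form a connected blob, the distance-1 auto-correlogram value for color 1 is high; a scattered layout with the same histogram would score much lower.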
Implementations
Feature distance measures: if D( f(I1) − f(I2) ) is small, then I1 and I2 are similar.
Example: f(a) = 1000, f(a') = 1050; f(b) = 100, f(b') = 150. Both pairs differ by 50 in absolute terms, but the second pair differs far more in relative terms, so a normalized (relative) distance is used.
For the histogram:

    |I − I'|_h = Σ_{i ∈ [m]} |h_{C_i}(I) − h_{C_i}(I')| / (1 + h_{C_i}(I) + h_{C_i}(I'))

For the correlogram:

    |I − I'|_α = Σ_{i ∈ [m], k ∈ [d]} |α^{(k)}_{C_i}(I) − α^{(k)}_{C_i}(I')| / (1 + α^{(k)}_{C_i}(I) + α^{(k)}_{C_i}(I'))
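The relative distance above can be sketched in a few lines; each term is damped by (1 + value + value'), so large bins contribute less than proportionally. The function name is illustrative:

```python
def d1(f, g):
    """Relative L1 distance: per-bin difference normalized by bin mass."""
    return sum(abs(a - b) / (1 + a + b) for a, b in zip(f, g))

# The slide's example: both pairs differ by 50 in absolute terms,
# but the smaller pair (100 vs 150) differs far more relatively.
print(d1([1000], [1050]))  # ≈ 0.024
print(d1([100], [150]))    # ≈ 0.199
```

This is why the relative form is preferred over a plain L1 distance for histogram and correlogram features: it keeps a few heavily populated bins from dominating the comparison.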
Human beings can perceive specific wavelengths as colors.
CBIR...
is about: "blue", "seashell"
but is not about: "Botticelli", "Venus"
MPEG-7
Motivation: to efficiently search and retrieve relevant information that people want to use.
Goal: to make it easy to search/retrieve/filter/exchange content, to maintain archives, and to edit multimedia content, etc.
MPEG-1, 2, 4: representation of the content itself
MPEG-7: representation of information about the content
Types of multimedia data: audio, speech, video, still pictures, graphics and 3D models; composition information
Components of MPEG-7:
1) MPEG-7 Systems
2) MPEG-7 Description Definition Language
3) MPEG-7 Visual
4) MPEG-7 Audio
5) MPEG-7 Multimedia DSs
6) MPEG-7 Reference Software
7) MPEG-7 Conformance
Color Descriptors
Color Spaces
Constrained color spaces:
The Scalable Color Descriptor uses HSV
The Color Structure Descriptor uses HMMD
MPEG-7 color spaces: Monochrome, RGB, HSV, YCrCb, HMMD
Scalable Color Descriptor
A color histogram in HSV color space
Encoded by a Haar transform
Dominant Color Descriptor
Clusters colors into a small number of representative colors.
It can be defined for each object, region, or the whole image.

    F = { {c_i, p_i, v_i}, s }

c_i : representative colors
p_i : their percentages in the region
v_i : color variances
s : spatial coherency
Color Layout Descriptor
Partition the image into 64 (8x8) blocks
Derive the average color of each block (or use the DCD)
Apply the DCT and encode the coefficients
Efficient for:
Sketch-based image retrieval
Content filtering using image indexing
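The first steps of the Color Layout Descriptor can be sketched as below, under simplifying assumptions: a single grayscale channel stands in for a color component, and the final quantization/encoding of the DCT coefficients is omitted. All names are illustrative:

```python
import math

def block_averages(image, grid=8):
    """Partition the image into grid x grid blocks; average each block."""
    h, w = len(image), len(image[0])
    bh, bw = h // grid, w // grid
    avg = [[0.0] * grid for _ in range(grid)]
    for by in range(grid):
        for bx in range(grid):
            block = [image[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            avg[by][bx] = sum(block) / len(block)
    return avg

def dct2(m):
    """Orthonormal 2-D DCT-II of an n x n matrix (naive O(n^4))."""
    n = len(m)
    def c(k):
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(m[y][x]
                    * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * x + 1) * v * math.pi / (2 * n))
                    for y in range(n) for x in range(n))
            out[u][v] = c(u) * c(v) * s
    return out

# A 16x16 gradient image: the low-frequency coefficients capture its layout.
image = [[(y * 16 + x) % 256 for x in range(16)] for y in range(16)]
coeffs = dct2(block_averages(image))
# coeffs[0][0] is the DC term, encoding the overall mean intensity.
```

Keeping only the low-frequency coefficients gives a compact layout signature, which is why the descriptor works well for sketch-based queries.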
Color Structure Descriptor
Scan the image with an 8x8 pixel block
Count the number of blocks containing each color
Generate a color histogram (HMMD)
Main usages:
Still image retrieval
Natural image retrieval
GoF/GoP Color Descriptor
Extends the Scalable Color Descriptor
Generates the color histogram for a video segment or a group of pictures
Calculation methods: average, median, intersection
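The "average" aggregation named above can be sketched as follows: per-frame color histograms of a segment are combined bin-by-bin into one descriptor (median and intersection are the other two aggregation choices). The data is illustrative:

```python
def average_histogram(frame_histograms):
    """Bin-wise mean of per-frame histograms for a video segment."""
    n_frames = len(frame_histograms)
    n_bins = len(frame_histograms[0])
    return [sum(h[b] for h in frame_histograms) / n_frames
            for b in range(n_bins)]

# Three frames' histograms over three color bins.
frames = [[4, 0, 4], [2, 2, 4], [0, 4, 4]]
print(average_histogram(frames))  # [2.0, 2.0, 4.0]
```

Averaging smooths per-frame variation, while the intersection variant would instead keep only the color mass present in every frame.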
Color Opponency
Color After-images
Color Opponency
Color Blindness
Color Constancy
Discounting the illuminant: adaptation.
The color of the ambient lighting quickly fatigues photoreceptors to that color; there is no eye position that allows the photoreceptors to recover.
Once fatigued to the ambient color, that color is subtracted, or discounted, from the visual scene, and colors appear close to the way they would in white light.
Color Constancy
There is a cognitive component as well.
Color Vision
1. Memory & Imagery: Achromatopsia
2. Form & Motion: interactions of the color system with other visual components
Case of Achromatopsia
Damage to V4 can cause the complete loss of color vision (as opposed to red-green color blindness): V4 is more sensitive to oxygen deprivation.
In addition, color imagery and color memory are also lost.
What are the implications for perception, imagery and memory?
Color, Form & Motion
Although V4 interacts with other areas (V3 & V5 are monochromatic), its interactions are limited.
Equiluminant color conditions make form and motion perception difficult -- but not impossible.
Equiluminant Colors
Some Applications AreasSome Applications Areas Planning and government: there is a lot of satellite
imagery of the earth, which can be used to inform important political debates. For example, how far does urban sprawl extend? what acreage is under crops? how large will the maize crop be? how much rainforest is left?, etc.
Military intelligence: satellite imagery can contain important military information. Typical queries involve finding militarily interesting changes — for example, is there a concentration of force? how much damage was caused by the last bombing raid? what happened today? etc. — occurring at particular places on the earth
Stock photo and stock footage: commercial libraries — which often have extremely large and very diverse collections — survive by selling the rights to use particular images. Effective tools may unlock value in these collections by making it possible for relatively unsophisticated users to obtain images that are useful to them at acceptable expense in time and money.
Access to museums: museums are increasingly creating web views of their collections, typically at restricted resolutions, to entice viewers into visiting the museum. Ideally, one would want viewers to get a sense of what is at the museum, why it is worth visiting and the particular virtues of the museum’s gift store.
Trademark and copyright enforcement: as electronic commerce grows, so does the opportunity for automatic searches to find violations of trademark or of copyright. For example, at the time of writing, the owner of rights to a picture could register it with an organisation called BayTSP, which would then search for stolen copies of the picture on the web; recent changes in copyright law make it relatively easy to recover fines from violators (see http://www.baytsp.com/index.asp).
Managing the web: indexing web pages appears to be a profitable activity; the images present on a web page should give cues to the content of the page. Users may also wish to have tools that allow them to avoid offensive images or advertising. A number of tools have been built to support searches for images on the web using CBIR techniques. There are tools that check images for potentially offensive content, both in the academic and commercial domains.
Medical information systems: recovering medical images “similar” to a given query example might give more information on which to base a diagnosis or to conduct epidemiological studies. Furthermore, one might be able to cluster medical images in ways that suggest interesting and novel hypotheses to experts.
5-min Recap
Why is Image IR important?
"A picture is worth a 1000 words"
An alternative form of communication
Not everything can be described in text; not everything can be described in images
A popular medium of information on the Internet
Search and Retrieval Process
It's all over
Earlier Works on CBIR
The main features of earlier works on CBIR:
Focused on effective FEATURE representation, such as color, texture, shape.
Indexed image contents based on features.
Disadvantages of the previous work:
A semantic gap between high-level concepts and low-level image feature representation; hence it is hard to select appropriate features.
The user's subjective preference may vary from user to user.
To solve this, the Relevance Feedback technique is used.
[Relevance feedback loop] 1st iteration: display → user feedback; 2nd iteration: display → user feedback. Each round performs estimation & display selection, with the feedback returned to the system.
Problem Statement
Assumption: images of the same semantic meaning/category form a cluster in feature vector space.
Given a set of positive examples, learn the user's preference and find better results in the next iteration.
Former Approaches
Multimedia Analysis and Retrieval System (MARS), IEEE Trans. CSVT 1998: weight updating, modification of the distance function.
PicHunter, IEEE Trans. IP 2000: probability based, updated by Bayes' rule; maximum entropy display.
Comparisons
Modeling of user's target:
  MARS: weighted Euclidean distance
  PicHunter: probability associated with each image
  Our approach: user's target data points follow a Gaussian distribution
Learning method:
  MARS: weight updating, modification of the distance function
  PicHunter: Bayes' rule
  Our approach: parameter estimation
Display selection:
  MARS: K-NN neighborhood search
  PicHunter: maximum entropy principle
  Our approach: simulated maximum entropy principle
Estimation of Target Distribution
Assume the user's target follows a Gaussian distribution.
Construct a distribution that best fits the relevant data points into some "specific" region.
[Figure: data points selected as relevant, shown at successive stages of the fit]
Expectation Function
Best fit the relevant data points to the medium-likelihood region.
The estimated distribution represents the user's target.
Updating Parameters
After each feedback loop, the parameters are updated:
New estimated mean = mean of the relevant data points
New estimated variance is found by differentiation
An iterative approach
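The update described above can be sketched in 1-D as follows. This is a simplification, not the authors' exact method: the variance here is the plain maximum-likelihood estimate rather than the differentiation-based fit from the slides, and all names are illustrative:

```python
import math

def update_gaussian(relevant_points):
    """Re-estimate the user's target Gaussian from relevant feedback points:
    new mean = mean of the points; variance from their spread (MLE)."""
    n = len(relevant_points)
    mean = sum(relevant_points) / n
    var = sum((x - mean) ** 2 for x in relevant_points) / n
    return mean, var

def likelihood(x, mean, var):
    """Gaussian density; used to rank candidates for the next display."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Iteration 1: the user marks these feature values as relevant.
mean, var = update_gaussian([2.0, 2.5, 3.0])
# Rank unseen candidates by likelihood under the estimated target.
candidates = [0.5, 2.4, 5.0]
ranked = sorted(candidates, key=lambda x: -likelihood(x, mean, var))
print(ranked[0])  # 2.4, the candidate closest to the estimated target
```

Each feedback round repeats the same two steps, so the estimate tightens around the cluster of relevant points, matching the iterative approach named above.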
Indexing and searching
Searching similar patterns
Distance function: given two objects, O1 and O2, the distance (= dissimilarity) of the two objects is denoted by D(O1, O2).
Similarity queries: whole match, sub-pattern match, nearest neighbors, all pairs.
Spatial access methods
Map objects into points in f-D space, and use multiattribute access methods (also referred to as spatial access methods, or SAMs) to cluster them and to search for them.
Methods:
R*-trees and the rest of the R-tree family
Linear quadtrees
Grid files
Linear quadtrees and grid files explode exponentially with the dimensionality.
R-tree
Represent a spatial object by its minimum bounding rectangle (MBR).
Data rectangles are grouped to form parent nodes (recursively grouped).
The MBR of a parent node completely contains the MBRs of its children.
MBRs are allowed to overlap.
Nodes of the tree correspond to disk pages.
R-tree: Range query
Specify a region of interest, requiring all the data regions that intersect it.
Retrieval:
Compute the MBR of the query region.
Recursively descend the R-tree, excluding the branches whose MBRs do not intersect the query MBR.
The retrieved data regions are further examined for intersection with the query region.
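The range-query descent can be sketched as below: prune any branch whose MBR does not intersect the query MBR, and collect the surviving leaves. The tiny Node class is illustrative only (no insertion or node-splitting logic, which a real R-tree needs):

```python
def intersects(a, b):
    """Axis-aligned MBR intersection test; MBRs are (xmin, ymin, xmax, ymax)."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

class Node:
    def __init__(self, mbr, children=None, data=None):
        self.mbr = mbr
        self.children = children or []  # internal node: child subtrees
        self.data = data                # leaf: payload object

def range_query(node, query_mbr, out):
    """Recursive descent, pruning subtrees whose MBR misses the query."""
    if not intersects(node.mbr, query_mbr):
        return  # the whole subtree is pruned
    if node.data is not None:
        out.append(node.data)
    for child in node.children:
        range_query(child, query_mbr, out)

leaf_a = Node((0, 0, 2, 2), data="A")
leaf_b = Node((5, 5, 7, 7), data="B")
root = Node((0, 0, 7, 7), children=[
    Node((0, 0, 3, 3), children=[leaf_a]),
    Node((4, 4, 7, 7), children=[leaf_b]),
])
hits = []
range_query(root, (1, 1, 3, 3), hits)
print(hits)  # ['A']: the (4, 4, 7, 7) branch is pruned without visiting leaf_b
```

The pruning step is exactly why grouping rectangles under containing parent MBRs pays off: one failed intersection test skips an entire disk-page subtree.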
Generic multimedia indexing approach
The "whole match" problem:
A collection of N objects: O1, O2, ..., ON
The distance/dissimilarity between two objects (Oi, Oj) is given by the function D(Oi, Oj)
The user specifies a query object Q and a tolerance ε
Goal: find the objects in the collection that are within distance ε of the query object.
GEMINI
- Generic Multimedia object INdexIng
- Ideas
  - A 'quick-and-dirty' test, to discard quickly the vast majority of non-qualifying objects (possibly allowing some false alarms)
  - The use of spatial access methods, to achieve faster-than-sequential searching
GEMINI
- Example
  - Database: yearly stock price movements, with one price per day
  - Distance function: Euclidean distance

    D(S, Q) = ( Σ_i (S[i] − Q[i])² )^(1/2)

  - The idea behind the quick-and-dirty test is to characterize a sequence with a single number (feature), which helps us discard many non-qualifying sequences
  - Candidate features: average stock price over the year, standard deviation, some of the discrete Fourier transform (DFT) coefficients
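The yearly-average feature mentioned above really is a safe quick-and-dirty test: by Cauchy-Schwarz (equivalently, because the average is the 0-th coefficient of an orthonormal DFT, up to a √n factor), √n times the difference of averages never exceeds the Euclidean distance. A small demonstration, with made-up random price series:

```python
# Demonstrate that the average-price feature lower-bounds the Euclidean
# distance between two price sequences, so it causes no false dismissals.
import math
import random

def euclidean(s, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, q)))

def feature_dist(s, q):
    """sqrt(n) * |avg(s) - avg(q)| -- always <= euclidean(s, q)."""
    n = len(s)
    return math.sqrt(n) * abs(sum(s) / n - sum(q) / n)

random.seed(0)
n = 365
s = [random.uniform(10, 20) for _ in range(n)]   # synthetic yearly prices
q = [random.uniform(10, 20) for _ in range(n)]
assert feature_dist(s, q) <= euclidean(s, q)     # the lower bound holds
```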
GEMINI
- Mapping function
  - Let F() be the mapping of objects to f-dimensional points; that is, F(O) will be the f-D point that corresponds to object O
  - Organize the f-D points into a spatial access method: cluster them in a hierarchical structure, like the R*-tree
  - Upon a query, we can exploit the R*-tree to prune out large portions of the database that are not promising
GEMINI
- Search algorithm (for whole match query)
  - Map the query object Q into a point F(Q) in feature space
  - Using a spatial access method, retrieve all points within the desired tolerance ε from F(Q)
  - Retrieve the corresponding objects, compute their actual distance from Q, and discard the false alarms
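The three steps above can be sketched end-to-end. This is a minimal illustration, not the paper's implementation: the feature F() is the sequence average, the feature-space range query is done by a linear scan standing in for an R*-tree, and the dataset is made up:

```python
# Minimal GEMINI whole-match sketch: cheap filter in feature space,
# then discard false alarms with the true distance D().
import math

def D(a, b):
    """True (expensive) distance: Euclidean."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def F(obj):
    """1-D feature: the average (lower-bounding, so no false dismissals)."""
    return sum(obj) / len(obj)

def whole_match(collection, Q, eps):
    n = len(Q)
    fq = F(Q)
    # Step 1: quick-and-dirty test (would be an R*-tree range query).
    candidates = [o for o in collection
                  if math.sqrt(n) * abs(F(o) - fq) <= eps]
    # Step 2: compute actual distances, discard false alarms.
    return [o for o in candidates if D(o, Q) <= eps]

data = [[1, 2, 3], [1, 2, 4], [10, 10, 10]]
print(whole_match(data, [1, 2, 3], 1.5))  # [[1, 2, 3], [1, 2, 4]]
```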
GEMINI
- Lower Bounding lemma
  - To guarantee no false dismissals for whole-match queries, the feature extraction function F() should satisfy:

    D_feature(F(O1), F(O2)) ≤ D(O1, O2)

  - D_feature(): distance of two feature vectors
  - (The mapping F() from objects to points should make things look closer)
GEMINI
- GEMINI algorithm
  1. Determine the distance function D() between two objects
  2. Find one or more numerical feature-extraction functions, to provide a 'quick-and-dirty' test
  3. Prove that the distance in feature space lower-bounds the actual distance D(), to guarantee correctness
  4. Use a SAM (e.g., an R-tree), to store and retrieve the f-D feature vectors
GEMINI
- 'Feature-extracting' question
  - If we are allowed to use only one numerical feature to describe each data object, what should this feature be?
- Successful answers to the question should meet two goals:
  - They should facilitate step 3 (the distance lower-bounding)
  - They should capture most of the characteristics of the objects
R-type databases, e.g. Access and OLE
- Object Linking and Embedding (OLE) was Microsoft's first architecture for integrating files of different types
- Each file type in Windows is associated with an application; it is possible to place a file of one type inside another:
  - either by wholly embedding the data, in which case it is rendered by a plug-in associated with the program
  - or by placing a link to the data, in which case it is rendered by calling the original program
- Access works with this system by providing a domain type for OLE
- There's not much you can do with OLE fields, since the data is in a format that Access does not understand
- You can plug the foreign data into a report or a form, and little else
R databases, e.g. BFILEs in Oracle
- The BFILE datatype provides access to BLOB files of up to 4 gigabytes that are stored in file systems outside an Oracle database
- The BFILE datatype allows read-only support of large binary files; you cannot modify a file through Oracle
- Oracle provides APIs to access file data
Large Object Types in Oracle and SQL3
Oracle and SQL3* support three large object types:
- BLOB - stores unstructured binary data in the database; BLOBs can store up to four gigabytes of binary data
- CLOB - stores up to four gigabytes of single-byte character set data
- NCLOB - stores up to four gigabytes of fixed-width and varying-width multi-byte national character set data

* SQL3 is a significant extension to standard SQL which turns it into a full object-based language
Cont ...
These types support:
- Concatenation - making up one LOB by putting two of them together
- Substring - extracting a section of a LOB
- Overlay - replacing a substring of one LOB with another
- Trim - removing particular characters (e.g. whitespace) from the beginning or end
- Length - returns the length of the LOB
- Position - returns the position of a substring in a LOB
- Upper and Lower - turns a CLOB or NCLOB into upper or lower case
LOBs can only appear in a where clause using "=", "<>" or "like", and not in group by or order by at all
Large Object Types in MySQL
MySQL has four BLOB and four CLOB (called TEXT in MySQL) domain types:
- TINYBLOB and TINYTEXT - store up to 255 bytes
- BLOB and TEXT - store up to 64K bytes
- MEDIUMBLOB and MEDIUMTEXT - store up to 16M bytes
- LONGBLOB and LONGTEXT - store up to 4G bytes
Oracle interMedia Audio, Image, and Video
Oracle interMedia supports multimedia storage, retrieval, and management of:
- BLOBs stored locally in Oracle8i onwards and containing audio, image, or video data
- BFILEs, stored locally in operating-system-specific file systems and containing audio, image, or video data
- URLs containing audio, image, or video data stored on any HTTP server such as Oracle Application Server, Netscape Application Server, Microsoft Internet Information Server, Apache HTTPD server, and Spyglass servers
- Streaming audio or video data stored on specialized media servers such as the Oracle Video Server
The Object Relational Multimedia Domain Types in interMedia
interMedia provides the ORDAudio, ORDImage, and ORDVideo object types and methods for:
- updateTime ORDSource attribute manipulation
- manipulating multimedia data source attribute information
- extracting attributes from multimedia data
- getting and managing multimedia data from Oracle interMedia, Web servers, and other servers
- performing a minimal set of manipulation operations on multimedia data (images only)
Cont ...
The properties available are:
- ORDImage - the height, width, data size of the on-disk image, file type, image type, compression type, and MIME type
- ORDAudio - the format, encoding, number of channels, sampling rate, sample size, compression type, and audio duration
- ORDVideo - the format, frame size, frame resolution, frame rate, video duration, number of frames, compression type, number of colours, and bit rate
Cont ...
Oracle also stores metadata including:
- source type, location, and source name
- MIME type and formatting information
- characteristics such as height and width of an image, number of audio channels, video frame rate, play time, etc.
Open issues
- Gap between low-level features and high-level concepts
- Human in the loop - interactive systems
- Retrieval speed - most research prototypes can handle only a few thousand images
- A reliable test-bed and measurement criterion, please!
Query Refinement in Multimedia Similarity Retrieval
- Refine the query to represent the information that the user is looking for
[Figure: the initial query representation is iteratively refined toward the optimal query representation; points relevant according to the current query vs. according to the optimal query; similarity contours Sim = 0.7, 0.8, 0.9]
Query Refinement Models
- Inter-feature Refinement (Feature Re-weighting)
- Intra-feature Refinement (Query Modification & Re-weighting)
[Figure: query refinement model - user feedback refines a multi-feature query, which is decomposed into individual feature queries, each served by its own index]
Intra-feature Refinement
- Query Point Movement: move the single query point toward the relevant examples
- Query Expansion: replace the single query with multiple query points
[Figure: query point movement shifts the initial query representation to a new one (the weighted centroid C*); query expansion yields several new query points Q1...Q4; similarity contours Sim = 0.7, 0.8, 0.9]

Query point movement (C = set of relevant points):

  Dist(Q*_new, P) = ( Σ_{j=1..m} (1/σ_j) · (Q*_j − P_j)² )^(1/2)

where Q* is the weighted centroid of C and σ_j is the standard deviation of C in dimension j.

Query expansion (multi-point query):

  Dist(Q, P) = Σ_i w_i · Dist(Q_i, P)

where w_i is the weight of query point Q_i, with weights based on relevance level.
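Query point movement can be sketched as a weighted centroid of the user-marked relevant points. The weighting scheme below (plain relevance-level weights) is an assumption for illustration, not necessarily the slides' exact formula:

```python
# Query point movement: move the query to the weighted centroid of the
# points the user marked relevant, weighted by relevance level.

def move_query(relevant_points, weights):
    """Return the weighted centroid of the relevant points."""
    total = sum(weights)
    dims = len(relevant_points[0])
    return [sum(w * p[d] for p, w in zip(relevant_points, weights)) / total
            for d in range(dims)]

relevant = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
weights = [1.0, 1.0, 2.0]             # relevance levels from user feedback
print(move_query(relevant, weights))  # [1.0, 1.5]
```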
Selecting Relevant Points in Query Expansion
- A clustering algorithm is used to cluster the relevant points
- Cluster centroids are chosen as the new query points
[Figure: in feature space, cluster centroids of the relevant points are added to the query representation]
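The centroid-selection step above can be sketched with a tiny k-means; the clustering algorithm is an assumption for illustration (the slides do not name one):

```python
# Query expansion sketch: cluster the relevant points with a tiny k-means
# and use the cluster centroids as the new query points.
import random

def kmeans(points, k, iters=20, seed=0):
    random.seed(seed)
    cents = random.sample(points, k)          # initial centroids
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:                      # assign to nearest centroid
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, cents[c])))
            groups[i].append(p)
        cents = [[sum(col) / len(g) for col in zip(*g)] if g else cents[i]
                 for i, g in enumerate(groups)]
    return cents

relevant = [[0, 0], [0, 1], [10, 10], [10, 11]]
print(sorted(kmeans(relevant, 2)))  # [[0.0, 0.5], [10.0, 10.5]]
```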
Query Expansion: multi-point approach
The distance of an R-tree node R from a multi-point query {Q1, Q2, Q3} with weights w1, w2, w3 is defined as:

  MinDist(Q, R) = w1 · MinDist(Q1, R) + w2 · MinDist(Q2, R) + w3 · MinDist(Q3, R)

For every point P in R: Dist(Q, P) ≥ MinDist(Q, R), so nodes whose MinDist exceeds the current threshold can be pruned.
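The combined node distance above can be sketched directly; MINDIST from a point to a rectangle is the standard point-to-MBR distance:

```python
# Multi-point query distance to an R-tree node: weighted sum of the
# per-query-point MINDISTs. MBRs are (xmin, ymin, xmax, ymax).

def mindist(q, mbr):
    """Distance from point q to the nearest point of rectangle mbr."""
    dx = max(mbr[0] - q[0], 0, q[0] - mbr[2])
    dy = max(mbr[1] - q[1], 0, q[1] - mbr[3])
    return (dx * dx + dy * dy) ** 0.5

def multi_mindist(query_points, weights, mbr):
    return sum(w * mindist(q, mbr) for q, w in zip(query_points, weights))

Q = [(0, 0), (4, 0), (2, 3)]
w = [0.5, 0.3, 0.2]
# Node R = (6, 0, 8, 2): a node whose multi_mindist exceeds the current
# k-th best distance can be pruned without being visited.
print(multi_mindist(Q, w, (6, 0, 8, 2)))
```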
Query Processing for Refined Query
- Naively, execute the refined query just like executing the initial query
- Observation: the query representation does not change dramatically across feedback iterations
- Exploit the work done in the previous iteration by reusing the priority queue used in the previous kNN search for the next iteration
[Figure: the previous priority queue, built for query P = wp1·P1 + wp2·P2 + wp3·P3, is carried over into a new priority queue for the refined query Q = wq1·Q1 + wq2·Q2 + wq3·Q3 + wq4·Q4]
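The reuse idea can be illustrated in miniature: instead of restarting the kNN search, re-key the entries already sitting in the previous iteration's queue under the refined query. This toy version re-keys only candidate points; a full implementation would also re-key the R-tree nodes still in the queue:

```python
# Reuse work across feedback iterations: re-score the previous priority
# queue's entries under the refined query instead of searching from scratch.
import math

def dist(q, p):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, p)))

def rekey(old_queue, new_query):
    """Rebuild the queue ordering using distances to the refined query."""
    return sorted((dist(new_query, p), p) for _, p in old_queue)

points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
old_q = (0.0, 0.0)
queue = sorted((dist(old_q, p), p) for p in points)  # previous iteration
new_q = (0.9, 1.1)                                   # refined query moved a little
print(rekey(queue, new_q)[0][1])  # (1.0, 1.0) is now the nearest candidate
```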
CBIR SUMMARY
- BORN FROM MULTIMEDIA FLOOD
- TEXT TOO SIMPLE AND LABORIOUS
- SYSTEMS WORK DECENTLY IN VITRO
- QUERY BY SHAPE, COLOR, TEXTURE, EXAMPLE
- SHORTCOMINGS
  - NEED RELEVANCE FEEDBACK & PERCEPTUAL CONSIDERATIONS
  - HYBRID QUERIES DIFFICULT TO CREATE
  - SEMANTIC GAP NEEDS TO BE BRIDGED
- MPEG-7: IMPORTANT DEVELOPMENT
ONGOING RESEARCH-1
RELEVANCE FEEDBACK
- ITERATIVE QUERY REFINEMENT
  - PLACE USER IN LOOP TO ITERATIVELY IMPROVE RETRIEVAL RATES
  - HIGH-DIMENSIONAL SPACE NEEDS PRUNING
  - EMPHASIZED FEATURE(S) MUST BE FOUND
- TYPICAL APPROACHES
  - STATISTICAL METHODS
  - FEATURE WEIGHTING
ONGOING RESEARCH-2
RELEVANCE FEEDBACK
- FEATURE SELECTIVE INTERFACE
  - WHY CHOOSE IMAGES ON WHOLE? REQUIRES PROCESSING/STATS TO FIND GOOD FEATURES
  - USER CAN EXPLICITLY INDICATE ELEMENTS OF IMAGE WHICH ARE GOOD: NO GUESSWORK
[Figure: the user marks a relevant color and a relevant shape in the image; these explicit features are fed to the relevance feedback engine]
ONGOING RESEARCH-3
SIMILARITY AGGREGATION/HYBRID QUERIES
- TYPICALLY USED APPROACHES
  - BOOLEAN (AND, OR & NOT OPERATORS)
  - EUCLIDEAN (MINKOWSKI W/ r=2)
  - WEIGHTED AVERAGE (WA), i.e. SUPERVECTORS
- DISADVANTAGES
  - EUCLIDEAN: FUNCTION OF DESCRIPTORS - CHANGE DESCRIPTOR, DRASTICALLY ALTER MEASURE
  - WA: INFLEXIBLE FOR HIGH-LEVEL QUERIES; SUPERVECTORS IMPOSE CERTAIN STRUCTURE
  - BOOLEAN: HARD LIMITED TO LOGIC FUNCTIONS
  - ALL LACK PERCEPTUAL CONSIDERATIONS
ONGOING RESEARCH-4
SIMILARITY AGGREGATION/HYBRID QUERIES
- FUZZY AGGREGATION OF DECISIONS
  - USE MEMBERSHIP FUNCTION TO 'FUZZIFY' DISTANCES & GENERATE A 'FUZZY DECISION'
  - EXPONENTIAL MODELS HUMAN PERCEPTION
[Figure: the similarity distance d passes through a fuzzy membership function to give a fuzzy distance, which feeds the decision]
ONGOING RESEARCH-5
DISTRIBUTED MULTIMEDIA INDEXING
- INDEXES USUALLY CENTRALIZED
  - ENTIRE SYSTEM FAILS IF COMPONENT FAILS
  - NO GRACEFUL PERFORMANCE DEGRADATION
  - HIGH DATA VOLUME = HIGH SYSTEM REQ'S
- DISTRIBUTED INDEXES
  - SPREAD WORKLOAD OVER MANY SUBSYSTEMS
  - INCREASE REDUNDANCY
  - P2P SYSTEMS LACK CENTRALIZED ELEMENTS
  - P2P SYSTEMS RESEMBLE SOCIAL NETWORKS
ONGOING RESEARCH-6
DISTRIBUTED MULTIMEDIA INDEXING
- SMALL WORLD INDEXING MODEL [1]
- SOCIOLOGICAL PEER DESCRIPTIONS
  - WE ARE NOT BLIND TO WHO OUR PEERS ARE
  - PEOPLE KEEP MEMORY OF THEIR PEERS
  - WE ARE NOT BLIND TO HOW OUR PEERS ARE
  - WE REFER OTHERS TO OUR PEERS
- EXAMPLE

[1] P. Androutsos, D. Androutsos and A. N. Venetsanopoulos, "A distributed fault-tolerant MPEG-7 retrieval scheme based on small world theory", Distributed Media Technologies and Applications Special Issue of IEEE Transactions on Multimedia, under review.
RESEARCH AVENUES-1
- HYBRID QUERIES & AGGREGATION
  - WHAT DO WEIGHTS MEAN? HOW TO CHOOSE?
  - ALTERNATIVE AGGREGATION METHODS
  - ADAPTIVE SCHEMES USING REL. FEEDBACK
- USER INTERFACE
  - BRIDGE SEMANTIC GAP BETWEEN USER'S IDEA AND ABILITY TO EXPRESS IT AS A QUERY
  - ALTERNATIVE INTERFACES - ICONIC, SEMANTIC
RESEARCH AVENUES-2
- PERCEPTUAL ISSUES
  - EMPHASIS OF DOMINATING FEATURES
  - FEATURE MASKING
  - EMOTIONAL INDEXING
  - ALL USERS DIFFERENT - CUSTOMIZED PROFILE
- ARCHIVE DEPENDENCE
  - SYSTEMS USUALLY SPECIALIZED
  - ADAPTIVE INDEXING - MOST APPROPRIATE SYSTEM USED BASED ON PRELIMINARY SURVEY OF CANDIDATE DATABASE
RESEARCH AVENUES-3
- DISTRIBUTED INDEXING
  - DISTRIBUTED INDEXES & RETRIEVAL
  - INDEX SYNCHRONIZATION
  - RESULTS ORGANIZATION & RANKING
  - SWIM OVERHEAD ESTIMATION
  - EXTENSION OF SWIM TO OTHER DATA TYPES
- INCORPORATE TEXT METHODS
  - TEXT-INDEXING USING LIMITED VOCABULARY
  - DON'T REJECT BUT USE INTELLIGENTLY
- EXTEND TO MPEG-21 & METADATA
SUMMARY-1
- MULTIMEDIA PROCESSING
  - RESULTS FROM MULTIMEDIA EXPLOSION
  - USERS DEMANDING MORE FROM DEVICES
  - DEVICES ARE CONVERGING
- CONTENT BASED IMAGE RETRIEVAL
  - NECESSARY TO TRACK VISUAL SEA OF DATA
  - GOOD CAPABILITIES, BUT W/ SHORTCOMINGS
  - PERCEPTUAL/SUBJECTIVE ISSUES
  - RELEVANCE FEEDBACK
  - DISTRIBUTED CONCEPTS BECOMING CRITICAL
Introduction
- As imaging systems evolved in complexity and openness, the need for device-independent image measures became clear
- It was quickly recognized that device-dependent color coordinates (such as monitor RGB and printer CMYK) could not be used to specify and reproduce color images with accuracy and precision
Colorimetry
- Device-independent color measurements are based on the internationally standardized CIE system of colorimetry, first developed in 1931
- CIE colorimetry specifies a color stimulus with numbers proportional to the stimulation of the human visual system, independent of how the color stimulus was produced
Outline
- Introduction: Colorimetry / Color & Image Difference / Color & Image Appearance Models
- The iCAM framework
  - Input Images
  - First Stage: Chromatic Adaptation (Color Appearance)
  - Second Stage: Appearance Attributes
  - Third Stage: Spatial Filtering (Image Difference)
- Rendering HDR images
Color Difference Equations
- A color difference equation allows for the mapping of physically measured stimuli into perceived differences
- CIELAB and CIELUV (ΔE*ab)
- CIE ΔE94 and CIEDE2000
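The basic CIELAB difference ΔE*ab is simply the Euclidean distance between two colors in L*a*b* coordinates (the later ΔE94 and CIEDE2000 formulas add perceptual corrections on top of this):

```python
# CIE76 color difference: Euclidean distance in CIELAB space.
import math

def delta_e_ab(lab1, lab2):
    """Delta E*ab between two (L*, a*, b*) triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lab1, lab2)))

# A Delta E*ab around 1 is commonly taken as roughly a just-noticeable
# difference; the pair below differs only in chroma.
print(delta_e_ab((50.0, 2.0, 2.0), (50.0, 0.0, 0.0)))  # sqrt(8) ~ 2.83
```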
Image Difference
- The CIE color difference formulae were developed using simple color patches in controlled viewing conditions; there is no reason to believe that they are adequate for predicting color differences for spatially complex image stimuli
- S-CIELAB: contrast sensitivity functions (CSF)
Image Difference
- The CSF serves to remove information that is imperceptible to the visual system. For instance, when viewing dots at a certain distance, the dots tend to blur and integrate into a single color
Outline
- Introduction: Colorimetry / Color & Image Difference / Color & Image Appearance Models
- The iCAM framework
  - Input Images
  - First Stage: Chromatic Adaptation (Color Appearance)
  - Second Stage: Appearance Attributes
  - Third Stage: Spatial Filtering (Image Difference)
- Rendering HDR images
Color Appearance Model
- CIE colorimetry is only strictly applicable to situations in which the original and reproduction are viewed in identical conditions
- Color appearance models were developed to predict color in different viewing conditions
Image Appearance Model
- Color appearance models account for many changes in viewing condition, but they do not incorporate any of the spatial or temporal properties of human vision and the perception of images
- One model for still images, referred to as iCAM, has recently been published by Fairchild and Johnson
2. Color Appearance

Images courtesy of John McCann

Color Appearance
- More than a single color
  - Adjacent colors (background)
  - Viewing environment (surround)
- Appearance effects
  - Adaptation
  - Simultaneous contrast
  - Spatial effects
- Color Appearance Models (Mark Fairchild)
[Figure: stimulus, background, and surround regions of the viewing field]
Light/Dark Adaptation
- Adjust to overall brightness
  - 7 decades of dynamic range
  - 100:1 at any particular time
- Absolute illumination effects
  - Hunt effect: higher brightness increases colorfulness
  - Stevens effect: higher brightness increases contrast
Chromatic Adaptation
- Change in illumination (e.g. daylight vs. tungsten)
- Cones "white balance"
  - Scale cone sensitivities (von Kries)
  - Also cognitive effects
- Creates unique white

From Color Appearance Models, fig 8-1
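The von Kries scaling mentioned above can be sketched in a few lines: each cone response is divided by the response to the source white and multiplied by the response to the destination white, channel by channel. The LMS numbers below are made up for illustration:

```python
# von Kries chromatic adaptation: scale each cone (LMS) channel by the
# ratio of the destination white to the source white.

def von_kries(lms, src_white, dst_white):
    return [c * (dw / sw) for c, sw, dw in zip(lms, src_white, dst_white)]

tungsten_white = [1.2, 1.0, 0.6]   # hypothetical warm illuminant: weak S response
daylight_white = [1.0, 1.0, 1.0]

stimulus = [0.6, 0.5, 0.3]         # cone responses measured under tungsten
# A gray patch under tungsten maps back to neutral under daylight:
print(von_kries(stimulus, tungsten_white, daylight_white))  # ~[0.5, 0.5, 0.5]
```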
Simultaneous Contrast
- "After image" of background adds to the color
- Reality is more complex
- Affects lightness scale
Effect of Spatial Frequency
- Smaller = less saturated
- The paint chip problem
- Color image perception
- S-CIELAB

Redrawn from Foundations of Vision, fig 6, © Brian Wandell, Stanford University
"Keep it as simple as possible, but not simpler."
- Albert Einstein
R-trees
- B-trees in multiple dimensions
- Spatial object represented by its MBR (Minimum Bounding Rectangle)
R-trees
- Nonleaf nodes: <ptr, R>
  - ptr - pointer to a child node
  - R - MBR covering all rectangles in the child node
- Leaf nodes: <obj-id, R>
  - obj-id - pointer to object
  - R - MBR of the object
R-trees
- Algorithms
  - Insert
    - Find the most suitable leaf node
    - Possibly extend MBRs in parent nodes to enclose the new object
    - Leaf node overflow → split
  - Split
    - Heuristics based
    - (Possible propagation upwards)
R-trees
- Range queries
  - Traverse the tree
  - Compare the query MBR with the current node's MBR
- Nearest neighbor
  - Branch and bound:
    - Traverse the most promising sub-tree and find neighbors
    - Estimate best- and worst-case distances
    - Traverse the other sub-trees, pruning according to the obtained thresholds
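The branch-and-bound idea can be sketched as a best-first search ordered by MINDIST (a common refinement of the scheme above). The node layout (dicts with an `'mbr'` and either `'children'` or `'objects'`) is a hypothetical in-memory stand-in for disk pages:

```python
# Best-first nearest-neighbour search over a toy R-tree: visit sub-trees in
# order of MINDIST and stop once no remaining node can beat the best found.
import heapq

def mindist(q, mbr):
    """Distance from point q to the nearest point of rectangle mbr."""
    dx = max(mbr[0] - q[0], 0, q[0] - mbr[2])
    dy = max(mbr[1] - q[1], 0, q[1] - mbr[3])
    return (dx * dx + dy * dy) ** 0.5

def nearest(root, q):
    heap = [(mindist(q, root['mbr']), 0, root)]
    counter = 1                          # tie-breaker: dicts aren't comparable
    best = (float('inf'), None)
    while heap:
        d, _, node = heapq.heappop(heap)
        if d >= best[0]:                 # pruning: nothing left can improve
            break
        if 'objects' in node:            # leaf: check candidate objects
            for obj_id, mbr in node['objects']:
                od = mindist(q, mbr)
                if od < best[0]:
                    best = (od, obj_id)
        else:                            # internal: enqueue children by MINDIST
            for child in node['children']:
                heapq.heappush(heap, (mindist(q, child['mbr']), counter, child))
                counter += 1
    return best

leaf1 = {'mbr': (0, 0, 4, 4), 'objects': [('A', (0, 0, 1, 1)), ('B', (3, 3, 4, 4))]}
leaf2 = {'mbr': (5, 5, 9, 9), 'objects': [('C', (6, 6, 7, 7))]}
root = {'mbr': (0, 0, 9, 9), 'children': [leaf1, leaf2]}
print(nearest(root, (3.5, 3.5)))  # (0.0, 'B') -- the point lies inside B's MBR
```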
R-trees
- Spatial joins: "find intersecting objects"
- Naïve method:
  - Build a list of pairs of intersecting MBRs
  - Examine each pair, down to leaf level
- (Faster methods exist)
Variants
- R+-tree (Sellis et al 1987)
  - Avoids overlapping rectangles in internal nodes
- R*-tree (Beckmann et al 1990)
Applications
- Spatial databases
- Text retrieval
- Multimedia retrieval
- Color correction
Does Memorization = Learning?
Test #1: Thomas learns his mother's face
- Memorizes: [example images]
- But will he recognize: [new images]?
Thus he can generalize beyond what he's seen!
Does Memorization = Learning? (cont'd)
Test #2: Nicholas learns about trucks & combines
- Memorizes: [example images]
- But will he recognize others?
So learning involves the ability to generalize from labeled examples (in contrast, memorization is trivial, especially for a computer)
Some examples
That is all, folks...
Thank you for your patience!
Questions
Good Luck!