Special Topic on Image Retrieval 2014-03


  • Slide 1
  • Special Topic on Image Retrieval 2014-03
  • Slide 2
  • Popular Visual Features. Global feature: Color correlation histogram, Shape context, GIST, Color name. Local feature: Detector: DoG, MSER, Hessian Affine, KAZE, FAST; Descriptor: SIFT, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD.
  • Slide 3
  • 2-D color images and color histograms. Each color image is a 2-D array of pixels, and each pixel has 3 color components (R, G, B), so there are h possible colors, each denoting a point in 3-D color space (up to 2^24 colors). For each image, compute the h-element color histogram, where each component is the percentage of pixels that are most similar to that color. The histogram of image I is defined as H_{c_i}(I) = |{ p ∈ I : color(p) = c_i }|: for a color c_i, H_{c_i}(I) is the number of pixels of color c_i in image I, or, after normalization, the probability that a pixel of image I has color c_i.
  • Slide 4
  • 2-D color images and color histograms. Usually, similar colors are clustered together and one representative color is chosen for each color bin. Most commercial CBIR systems include the color histogram as one of their features. The histogram carries no spatial information.
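A minimal sketch of this binned color histogram, assuming the image is an (H, W, 3) uint8 RGB array and an illustrative 8 levels per channel (so 512 bins); the bin count and the uniform quantization are choices made here, not values prescribed by the slides.

```python
import numpy as np

def color_histogram(image, bins_per_channel=8):
    """Quantize RGB colors into bins and return the fraction of pixels per bin."""
    # Map each 0..255 channel value to one of `bins_per_channel` levels.
    levels = (image.astype(np.int32) * bins_per_channel) // 256
    # Combine the three quantized channels into a single bin index in [0, bins^3).
    bin_index = (levels[..., 0] * bins_per_channel + levels[..., 1]) * bins_per_channel + levels[..., 2]
    hist = np.bincount(bin_index.ravel(), minlength=bins_per_channel ** 3)
    return hist / hist.sum()  # normalized, so each entry reads as a probability
```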
  • Slide 5
  • Color histograms: distance. One method to measure the distance between two histograms x and y is the quadratic-form distance d(x, y) = ((x − y)^T A (x − y))^{1/2}, where the color-to-color similarity matrix A has entries a_ij that describe the similarity between color i and color j.
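A small sketch of that quadratic-form distance; how the similarity matrix A is built (e.g., from distances in a perceptual color space) is left open here and simply passed in.

```python
import numpy as np

def quadratic_form_distance(x, y, A):
    """d(x, y) = sqrt((x - y)^T A (x - y)) for histograms x, y and similarity matrix A."""
    d = x - y
    return float(np.sqrt(d @ A @ d))
```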
  • Slide 6
  • Color Correlation Histogram. The correlogram entry γ^(k)_{c_i,c_j}(I) gives the probability that, given any pixel of color c_i in the image, a pixel at distance k away from it is of color c_j.
  • Slide 7
  • Color Auto-correlogram. The auto-correlogram of image I for color c_i with distance k is α^(k)_{c_i}(I) = γ^(k)_{c_i,c_i}(I), i.e., the probability that a pixel at distance k from a pixel of color c_i also has color c_i. It integrates both color information and spatial information. (Figure: pixels P1 and P2 of image I at distance k; is P2 also red?)
  • Slide 8
  • Color auto-correlogram
  • Slide 9
  • Implementation: pixel distance measures. Use the D8 distance (also called chessboard distance): |p1 − p2| = max(|x1 − x2|, |y1 − y2|). Co-occurrence count: Γ^(k)_{c_i,c_j}(I) = |{ (p1, p2) : p1 ∈ I_{c_i}, p2 ∈ I_{c_j}, |p1 − p2| = k }|. Then γ^(k)_{c_i,c_j}(I) = Γ^(k)_{c_i,c_j}(I) / (8k · H_{c_i}(I)); the denominator is the total number of pixels at distance k from any pixel of color c_i (each pixel has 8k neighbors at chessboard distance exactly k, ignoring boundaries). Computational complexity: counting every such pair directly costs O(n^2 · d^2) for an n × n image and distances up to d.
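A brute-force sketch of the auto-correlogram under the D8 metric, assuming the image has already been quantized to color-bin indices; it counts co-occurrences directly and divides by the number of ring pixels actually examined, so image boundaries are handled by counting rather than by the 8k approximation.

```python
import numpy as np

def auto_correlogram(quantized, num_colors, k):
    """Naive auto-correlogram for one distance k under the chessboard metric."""
    H, W = quantized.shape
    counts = np.zeros(num_colors)
    totals = np.zeros(num_colors)
    # Offsets whose chessboard (D8) distance is exactly k: the ring max(|dy|, |dx|) == k.
    ring = [(dy, dx) for dy in range(-k, k + 1) for dx in range(-k, k + 1)
            if max(abs(dy), abs(dx)) == k]
    for y in range(H):
        for x in range(W):
            c = quantized[y, x]
            for dy, dx in ring:
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W:
                    totals[c] += 1
                    if quantized[ny, nx] == c:
                        counts[c] += 1
    return counts / np.maximum(totals, 1)  # one probability per color bin
```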
  • Slide 10
  • Efficient implementation with dynamic programming: count the number of pixels of a given color within a given distance from a fixed pixel in the positive horizontal/vertical directions. Define λ^h_{c}((x, y); k) as the number of pixels of color c on the horizontal segment {(x + i, y) : 0 ≤ i ≤ k}, with initial condition λ^h_{c}(p; 0) = 1 if p has color c and 0 otherwise. Then λ^h_{c}((x, y); k) = λ^h_{c}((x, y); k − 1) + λ^h_{c}((x + k, y); 0). Since we do O(n^2) work for each k, the total time taken is O(n^2 · d). The vertical count λ^v can also be computed in a similar way. Finally, the boundary of the distance-k square around each pixel of color c_i is covered by two horizontal and two vertical segments, so Γ^(k)_{c_i,c_j}(I) is obtained by summing a constant number of λ values per pixel.
  • Slide 11
  • Distance metric. For features f(I), the distance D(f(I1), f(I2)) should be small when I1 and I2 are similar. Absolute differences can mislead: with f_a(I1) = 1000, f_a(I2) = 1050 and f_b(I1) = 100, f_b(I2) = 150, both gaps are 50, yet the second is relatively much larger, so a relative (d1) measure is used. For the histogram: |I1 − I2|_{h,d1} = Σ_i |H_{c_i}(I1) − H_{c_i}(I2)| / (1 + H_{c_i}(I1) + H_{c_i}(I2)). For the correlogram: |I1 − I2|_{γ,d1} = Σ_{i,k} |α^(k)_{c_i}(I1) − α^(k)_{c_i}(I2)| / (1 + α^(k)_{c_i}(I1) + α^(k)_{c_i}(I2)).
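A tiny worked check of the relative measure on the slide's numbers; the helper name is just for illustration.

```python
def relative_l1(f1, f2):
    """Per-component relative distance |f1 - f2| / (1 + f1 + f2)."""
    return abs(f1 - f2) / (1 + f1 + f2)

# Same absolute gap of 50, very different relative gap:
print(relative_l1(1000, 1050))  # ~0.024
print(relative_l1(100, 150))    # ~0.199
```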
  • Slide 12
  • Color Histogram vs. Correlogram: no difference. If there is no difference between the query and the target images, both methods perform well. (Figure: query image (512 colors) with the top five results, 1st to 5th, returned by the correlogram method and by the histogram method.)
  • Slide 13
  • Color Histogram vs. Correlogram. The correlogram method is more stable to color change than the histogram method. (Figure: query and target; the correlogram method ranks the target 1st, the histogram method ranks it 48th.)
  • Slide 14
  • Color Histogram vs. Correlogram. The correlogram method is more stable to large appearance change than the histogram method. (Figure: query and target; the correlogram method ranks the target 1st, the histogram method ranks it 31st.)
  • Slide 15
  • Color Histogram vs. Correlogram. The correlogram method is more stable to contrast and brightness change than the histogram method. (Figure: four queries against one target; ranks of the target for the correlogram (C) and histogram (H) methods, in order: C 178th / H 230th, C 1st / H 1st, C 1st / H 3rd, C 5th / H 18th.)
  • Slide 16
  • Color Histogram vs. Correlogram. The color correlogram describes the global distribution of local spatial correlations of colors. It is easy to compute, and it is more stable than the color histogram method.
  • Slide 17
  • Popular Visual Features. Global feature: Color correlation histogram, Shape context, GIST, Color name. Local feature: Detector: DoG, MSER, Hessian Affine, KAZE, FAST; Descriptor: SIFT, GLOH, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD.
  • Slide 18
  • Shape Context. What points on these two sampled contours are most similar? How do you know?
  • Slide 19
  • Shape context descriptor [Belongie et al. 02]. For each point, count the number of points inside each bin of a log-polar grid centered on it (e.g., Count = 4 in one bin, Count = 10 in another), giving a compact representation of the distribution of points relative to that point. (Shape context slides from Belongie et al.)
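A rough sketch of that per-point bin counting, assuming the contour is given as an (N, 2) array of sampled points; the 5 radial by 12 angular layout and the log-spaced radii are illustrative defaults, not values fixed by the slides.

```python
import numpy as np

def shape_context(points, n_radial=5, n_angular=12):
    """Log-polar histogram of the other points' positions, relative to each point."""
    N = len(points)
    diff = points[None, :, :] - points[:, None, :]   # diff[i, j] = points[j] - points[i]
    dist = np.linalg.norm(diff, axis=2)
    angle = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)
    # Log-spaced radial bin edges, scaled by the mean pairwise distance for scale invariance.
    mean_d = dist[dist > 0].mean()
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_radial + 1) * mean_d
    descriptors = np.zeros((N, n_radial, n_angular))
    for i in range(N):
        for j in range(N):
            if i == j or dist[i, j] >= r_edges[-1]:
                continue
            r_bin = np.searchsorted(r_edges, dist[i, j]) - 1
            if r_bin < 0:
                continue  # closer than the innermost radial edge
            a_bin = int(angle[i, j] / (2 * np.pi) * n_angular)
            descriptors[i, r_bin, a_bin] += 1
    return descriptors.reshape(N, -1)
```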
  • Slide 20
  • Shape context descriptor
  • Slide 21
  • Comparing shape contexts. Compute matching costs using the chi-squared distance: C_ij = (1/2) Σ_k [h_i(k) − h_j(k)]^2 / [h_i(k) + h_j(k)], where h_i and h_j are the normalized shape context histograms of the two points. Recover correspondences by solving for the least-cost assignment using the costs C_ij. (Then use a deformable template match, given the correspondences.)
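A sketch of that matching step, assuming the descriptors for the two shapes come as (N, K) arrays; scipy's Hungarian solver (`linear_sum_assignment`) stands in for the least-cost assignment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_shape_contexts(desc_a, desc_b, eps=1e-10):
    """Chi-squared cost between all descriptor pairs, then least-cost assignment."""
    # Normalize each histogram so the chi-squared statistic compares shapes, not sizes.
    a = desc_a / (desc_a.sum(axis=1, keepdims=True) + eps)
    b = desc_b / (desc_b.sum(axis=1, keepdims=True) + eps)
    cost = 0.5 * np.sum((a[:, None, :] - b[None, :, :]) ** 2 /
                        (a[:, None, :] + b[None, :, :] + eps), axis=2)
    rows, cols = linear_sum_assignment(cost)   # one-to-one correspondences
    return rows, cols, cost[rows, cols].sum()  # total matching cost
```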
  • Slide 22
  • Invariance / robustness: translation, scaling, rotation. Modeling transformations: thin plate splines (TPS), a generalization of cubic splines to 2-D. Matching cost = f(shape context distances, bending energy of the thin plate splines). Appearance information can be added too. What about outliers?
  • Slide 23
  • An example of shape context-based matching
  • Slide 24
  • Some retrieval results
  • Slide 25
  • Popular Visual Features. Global feature: Color correlation histogram, Shape context, GIST, Color name. Local feature: Detector: DoG, MSER, Hessian Affine, KAZE, FAST; Descriptor: SIFT, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD.
  • Slide 26
  • GIST Feature. Definition and background: the essence or holistic characteristics of an image; context information obtained within an eye saccade (approx. 150 ms); evidence of place-recognizing cells in the Parahippocampal Place Area (PPA); biologically plausible models of gist are yet to be proposed. Nature of tasks done with gist: scene categorization / context recognition, region priming / layout recognition, resolution / scale selection.
  • Slide 27
  • Human Vision Architecture. Visual cortex: low-level filters, center-surround, and normalization. Saliency model: attend to pertinent regions. Gist model: compute the image's general characteristics. High-level vision: object recognition, layout recognition, scene understanding.
  • Slide 28
  • Gist Model. Utilizes the same visual-cortex raw features as the saliency model [Itti 2001]; gist is theoretically non-redundant with saliency. Gist vs. saliency: instead of looking at the most conspicuous locations in the image, gist looks at the scene as a whole; detection of regularities, not irregularities; cooperation (accumulation) vs. competition (winner-take-all, WTA) among locations; more spatial emphasis in saliency; local vs. global/regional interaction.
  • Slide 29
  • Gist Model Implementation: raw image feature maps. Orientation channel: Gabor filters at 4 angles (0, 45, 90, 135 degrees) on 4 scales = 16 sub-channels. Color: red-green and blue-yellow center-surround, each with 6 scale combinations = 12 sub-channels. Intensity: dark-bright center-surround with 6 scale combinations = 6 sub-channels. Total: 34 sub-channels.
  • Slide 30
  • Gist Model Implementation: gist feature extraction. Take the average value of each sub-channel over a predetermined grid of image regions.
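A heavily reduced sketch of that grid averaging, assuming a grayscale image and only one scale of Gabor orientation responses (the slides use the full 34-sub-channel set, with color and intensity channels as well); the kernel parameters and the 4 x 4 grid are illustrative.

```python
import cv2
import numpy as np

def gist_like_descriptor(gray, n_orientations=4, grid=4):
    """Filter with a small Gabor bank, then average each response over a grid of cells."""
    h, w = gray.shape
    features = []
    for i in range(n_orientations):
        theta = i * np.pi / n_orientations
        kernel = cv2.getGaborKernel((21, 21), 4.0, theta, 10.0, 0.5)
        response = np.abs(cv2.filter2D(gray.astype(np.float32), cv2.CV_32F, kernel))
        # Average the response inside each cell of the predetermined grid.
        for gy in range(grid):
            for gx in range(grid):
                cell = response[gy * h // grid:(gy + 1) * h // grid,
                                gx * w // grid:(gx + 1) * w // grid]
                features.append(cell.mean())
    # 16 values per sub-channel, matching the 34 x 16 = 544 layout when all channels are used.
    return np.array(features)
```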
  • Slide 31
  • Gist Model Implementation: dimension reduction. Original: 34 sub-channels x 16 features = 544 features. PCA/ICA reduction: 80 features, keeping >95% of the variance.
  • Slide 32
  • System Example Run
  • Slide 33
  • Popular Visual Features. Global feature: Color correlation histogram, Shape context, GIST, Color name. Local feature: Detector: DoG, MSER, Hessian Affine, KAZE, FAST; Descriptor: SIFT, GLOH, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD.
  • Slide 34
  • Color Name: Chip-Based vs. Real-World
  • Slide 35
  • Basic Color Terms. The English language has 11 basic color terms. These are defined by the linguists Berlin and Kay as those color names which are applied to diverse classes of objects, whose meaning is not subsumable under one of the other basic color terms, and which are used consistently and with consensus by most speakers of the language.
  • Slide 36
  • Learning Color Names. Color names are learned with an adapted Probabilistic Latent Semantic Analysis (PLSA-bg). Google set: 1100 images retrieved with Google Image search, 100 images per color name.
  • Slide 37
  • Popular Visual Features. Global feature: Color correlation histogram, Shape context, GIST, Color name. Local feature: Detector: DoG, MSER, Hessian Affine, FAST; Descriptor: SIFT, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD.
  • Slide 38
  • Blob Detector: MSER Maximally Stable Extremal Region
  • Slide 39
  • Blob Detector: MSER
  • Slide 40
  • Extremal / Maximal Regions. Definition: the set of all connected components of pixels below the threshold, taken over all thresholds. (Figure: thresholded images at g = 0.2, 0.4, 0.9.)
  • Slide 41
  • Extremal / Minimal Regions. Definition: the set of all connected components of pixels above the threshold, taken over all thresholds. (Figure: thresholded images at g = 0.2, 0.4, 0.9.)
  • Slide 42
  • Maximally stable extremal regions (MSER). (Figure: examples of thresholded images, from a high threshold to a low threshold.)
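For reference, OpenCV ships an MSER detector, so the per-threshold connected-component analysis sketched above does not have to be re-implemented by hand; the file name below is just a placeholder.

```python
import cv2

gray = cv2.imread("example.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder path
mser = cv2.MSER_create()
regions, bboxes = mser.detectRegions(gray)  # regions: pixel lists, bboxes: bounding boxes
print(len(regions), "maximally stable extremal regions")
```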
  • Slide 43
  • MSER
  • Slide 44
  • Popular Visual Features. Global feature: Color correlation histogram, Shape context, GIST, Color name. Local feature: Detector: DoG, MSER, Hessian Affine, FAST; Descriptor: SIFT, GLOH, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD.
  • Slide 45
  • GLOH: Gradient Location-Orientation Histogram (Mikolajczyk and Schmid 2005). The raw 272-D histogram is reduced to 128-D by PCA. (Figure: comparison of the SIFT and GLOH spatial layouts.)
  • Slide 46
  • Popular Visual Features. Global feature: Color correlation histogram, Shape context, GIST, Color name. Local feature: Detector: DoG, MSER, Hessian Affine, FAST; Descriptor: SIFT, GLOH, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD.
  • Slide 47
  • SURF: Speeded Up Robust Features (ECCV 2006, CVIU 2008). Uses integral images for a major speed-up. The integral image (summed-area table) is an intermediate representation of the image containing, at each position, the sum of the grayscale pixel values above and to the left of it. With it, second-order derivative and Haar-wavelet responses cost only four addition operations each.
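A minimal sketch of the summed-area table and of a box sum over it; the zero padding row/column is an implementation convenience so the same formula works at the image border.

```python
import numpy as np

def integral_image(gray):
    """Summed-area table with a zero border: ii[y, x] = sum of gray[:y, :x]."""
    return np.pad(gray.astype(np.int64), ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, y0, x0, y1, x1):
    """Sum of gray[y0:y1, x0:x1] from four table lookups, independent of the box size."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
```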
  • Slide 48
  • Detection: Hessian-based interest point localization. L_xx(x, y, σ) is the convolution of the Gaussian second-order derivative with the image, i.e., a Laplacian-of-Gaussian-style response. Lindeberg showed that the Gaussian function is optimal for scale-space analysis, but the Gaussian is somewhat overrated, since the property that no new structures can appear when going to lower resolution has not been proven in the 2-D case.
  • Slide 49
  • Detection: the second-order Gaussian derivatives are approximated with box filters (mean/average filters).
  • Slide 50
  • Detection: scale analysis with constant image size. Filter sizes grow as 9 x 9, 15 x 15, 21 x 21, 27 x 27 (1st octave), then 39 x 39, 51 x 51 (2nd octave), and so on.
  • Slide 51
  • Detection: non-maximum suppression and interpolation. The result is a blob-like feature detector.
  • Slide 52
  • Description: orientation assignment. Haar-wavelet responses are computed in a circular neighborhood of radius 6s around the interest point (s = the scale at which the point was detected), using wavelets of side length 4s; each response costs only six operations to compute. (Figure: Haar wavelets for the x response and the y response.)
  • Slide 53
  • Description: dominant orientation. The Haar-wavelet responses are represented as vectors; all responses within a sliding orientation window covering an angle of 60 degrees are summed. The two summed responses yield a new vector, and the longest such vector gives the dominant orientation (the second longest is ignored).
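A sketch of that sliding-window vote, assuming the x and y wavelet responses of the samples around the interest point are already available as arrays; the window step of pi/32 is an illustrative choice.

```python
import numpy as np

def dominant_orientation(resp_x, resp_y, window=np.pi / 3):
    """Sum responses inside a sliding 60-degree window; the longest summed vector wins."""
    angles = np.arctan2(resp_y, resp_x)
    best_len, best_angle = -1.0, 0.0
    for start in np.arange(0, 2 * np.pi, np.pi / 32):
        # Angular distance from the window centre, wrapped to [-pi, pi).
        rel = (angles - start + np.pi) % (2 * np.pi) - np.pi
        inside = np.abs(rel) < window / 2
        sx, sy = resp_x[inside].sum(), resp_y[inside].sum()
        length = np.hypot(sx, sy)
        if length > best_len:
            best_len, best_angle = length, np.arctan2(sy, sx)
    return best_angle
```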
  • Slide 54
  • Description. Split the interest region into 4 x 4 square sub-regions with 5 x 5 regularly spaced sample points inside each. Calculate the Haar-wavelet responses d_x and d_y, weight the responses with a Gaussian kernel centered at the interest point, and sum the responses over each sub-region for d_x and d_y separately, giving a feature vector of length 32. To bring in information about the polarity of the intensity changes, also extract the sums of the absolute values of the responses, giving a feature vector of length 64. Finally, normalize the vector to unit length.
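A small sketch of how one sub-region contributes its four sums and how the 64-D vector is assembled; the Gaussian weighting is assumed to have been applied to dx and dy already.

```python
import numpy as np

def surf_subregion_vector(dx, dy):
    """Four sums per sub-region: sum(dx), sum(dy), sum(|dx|), sum(|dy|)."""
    return np.array([dx.sum(), dy.sum(), np.abs(dx).sum(), np.abs(dy).sum()])

def assemble_descriptor(subregion_vectors):
    """Concatenate the 16 sub-region vectors (4 x 16 = 64-D) and normalize to unit length."""
    v = np.concatenate(subregion_vectors)
    return v / (np.linalg.norm(v) + 1e-12)
```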
  • Slide 55
  • Description
  • Slide 56
  • Description: SURF-128. The sums of d_x and |d_x| are computed separately for d_y < 0 and d_y >= 0, and similarly the sums of d_y and |d_y| are split according to the sign of d_x. This doubles the length of the feature vector.
  • Slide 57
  • Matching: fast indexing through the sign of the Laplacian of the underlying interest point, i.e., the sign of the trace of the Hessian matrix, trace = L_xx + L_yy. The indexing bit is either 0 or 1 (a hard threshold, so it may have a boundary effect). In the matching stage, features are compared only if they have the same type of contrast (the same sign).
  • Slide 58
  • Experimental Results
  • Slide 59
  • Viewpoint change of 30 degrees
  • Slide 60
  • Popular Visual Features. Global feature: Color correlation histogram, Shape context, GIST, Color name. Local feature: Detector: DoG, MSER, Hessian Affine, FAST; Descriptor: SIFT, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD.
  • Slide 61
  • LIOP: Local Intensity Order Pattern for Feature Description (2011). Motivation: orientation estimation error in SIFT. (Figure: orientation assignment errors. (a) Between corresponding points, only 63.77% of the errors are in the range [-20, 20] degrees. (b) Between corresponding points that are also matched by SIFT.)
  • Slide 62
  • LIOP: Local Intensity Order Pattern for Feature Description
  • Slide 63
  • Slide 64
  • Popular Visual Features. Global feature: Color correlation histogram, Shape context, GIST, Color name. Local feature: Detector: DoG, MSER, Hessian Affine, FAST; Descriptor: SIFT, SURF, LIOP, BRIEF, ORB, FREAK, BRISK, CARD.
  • Slide 65
  • BRIEF: Binary Robust Independent Elementary Features (2010). Binary test: tau(p; x, y) = 1 if p(x) < p(y) and 0 otherwise, where p(x) is the smoothed patch intensity at location x. The BRIEF descriptor is the bit string formed by the results of n such tests. For each S x S patch: 1. smooth it; 2. compare pixel pairs using the pre-defined binary tests.
  • Slide 66
  • Smoothing kernels: Gaussian kernels are used to de-noise the patch before the tests.
  • Slide 67
  • Spatial arrangement of the binary tests: 1. (X, Y) ~ i.i.d. uniform over the patch; 2. (X, Y) ~ i.i.d. Gaussian; 3. X ~ i.i.d. Gaussian centered on the patch, Y ~ i.i.d. Gaussian centered on the corresponding x_i; 4. (X, Y) randomly sampled from discrete locations of a coarse polar grid, introducing a spatial quantization; 5. x_i fixed at the patch center while y_i takes all possible values on a coarse polar grid.
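A compact sketch of the whole BRIEF pipeline, assuming the keypoint patch is a grayscale S x S array and that the test pairs were pre-generated once (here with an i.i.d. Gaussian arrangement, as in option 2 above); the names and parameter values are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def brief_descriptor(patch, pairs, sigma=2.0):
    """Smooth the patch, then run the binary tests; returns one bit per test."""
    smoothed = gaussian_filter(patch.astype(np.float32), sigma)
    # Each row of `pairs` is (x1, y1, x2, y2); the bit is 1 iff p(x1, y1) < p(x2, y2).
    bits = smoothed[pairs[:, 1], pairs[:, 0]] < smoothed[pairs[:, 3], pairs[:, 2]]
    return bits.astype(np.uint8)

# Pre-generate 256 Gaussian-distributed test pairs for a 32 x 32 patch (done once).
S, n = 32, 256
rng = np.random.default_rng(0)
pairs = np.clip(rng.normal(S / 2, S / 5, size=(n, 4)), 0, S - 1).astype(int)
```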
  • Slide 68
  • Slide 69
  • Slide 70
  • Distance Distributions
  • Slide 71
  • Experiments