Localization in indoor environments by querying omnidirectional visual maps using perspective images. Miguel Lourenço, V. Pedro and João P. Barreto. ICRA 2012

  • Slide 1
  • Localization in indoor environments by querying omnidirectional visual maps using perspective images. Miguel Lourenço, V. Pedro and João P. Barreto. ICRA 2012
  • Slide 2
  • Standard Image-based Indoor Localization (1/3): How can a robot equipped with a standard camera perform indoor localization? By establishing correspondences between the query image and a database of geo-referenced images [Cummins08, Chen11, Hansen01]. (Figure: query image.) Miguel Lourenço, Institute for Systems and Robotics, Faculty of Science and Technology, University of Coimbra, 16 May 2012.
  • Slide 3
  • Standard Image-based Indoor Localization (2/3): Querying pipeline: SIFT descriptors; quantization + tf-idf weighting; image database + inverted file. Building a detailed database of an environment is troublesome in terms of number of images, storage and time.
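The querying pipeline on this slide (quantized descriptors, tf-idf weighting, inverted file) can be illustrated with a minimal sketch. It assumes the SIFT descriptors have already been quantized into integer visual-word ids, and it ranks images by cosine similarity of tf-idf vectors, which is one common choice rather than necessarily the authors' scoring. The names `build_index` and `query` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def build_index(db_words):
    """Build an inverted file from {image_id: [visual word ids]}.

    Assumes every image has at least one visual word.
    Returns (inverted file, idf per word, tf-idf vector norm per image).
    """
    n_images = len(db_words)
    inverted = defaultdict(dict)  # word -> {image_id: term frequency}
    for image_id, words in db_words.items():
        for word, count in Counter(words).items():
            inverted[word][image_id] = count / len(words)
    # Rare words get a high weight, words seen in every image get zero.
    idf = {w: math.log(n_images / len(post)) for w, post in inverted.items()}
    norms = defaultdict(float)
    for w, post in inverted.items():
        for image_id, tf in post.items():
            norms[image_id] += (tf * idf[w]) ** 2
    return inverted, idf, {i: math.sqrt(v) for i, v in norms.items()}


def query(words, inverted, idf, norms):
    """Rank database images by cosine similarity of tf-idf vectors."""
    q = {w: c / len(words) * idf[w]
         for w, c in Counter(words).items() if w in idf}
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    scores = defaultdict(float)
    # Only images sharing at least one word with the query are touched.
    for w, q_weight in q.items():
        for image_id, tf in inverted[w].items():
            scores[image_id] += q_weight * tf * idf[w]
    ranked = [(i, s / (q_norm * norms[i]))
              for i, s in scores.items() if q_norm > 0 and norms[i] > 0]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Toy database: three images described by visual-word ids.
database = {"a": [1, 1, 2], "b": [2, 3], "c": [3, 4, 4]}
index = build_index(database)
ranking = query([1, 2], *index)
```

The inverted file is what keeps querying cheap: only the posting lists of words present in the query are visited, so images sharing no word with the query are never scored.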
  • Slide 4
  • Problem Statement: Complete coverage of the environment can be achieved with a para-catadioptric camera. Distortion increases the appearance difference between the images. Our contribution: a new model-based SIFT method for matching between hybrid imaging systems. (Figure: query image.)
  • Slide 5
  • Presentation Outline: (1) Matching in Hybrid Imaging Systems: comparison and drawbacks of the standard approaches; improvements to the SIFT detector and descriptor. (2) Image-based indoor localization (IL) using Hybrid Imaging Systems: comparison of several database description schemes; comparison of two searching approaches, BoV vs GVP.
  • Slide 6
  • Matching in Hybrid Imaging Systems: Using SIFT [Lowe04] on both para-catadioptric and perspective images provides poor matching results [Puig08] (10% inliers). A straightforward solution is to apply SIFT in a virtual camera perspective (VCP) [Schönbein11]. Standard approaches render either a polar [Puig08] or a cylindrical panorama [Krishnan08]. (Figure panels: Cylinder, Polar.)
  • Slide 7
  • Implicit cylindrical rectification (cylSIFT): Based on our previous work [Lourenco12], we propose to perform the cylindrical rectification implicitly inside the SIFT framework (cylSIFT). How does SIFT work? Salient image points are detected in a scale-space framework, and the SIFT descriptor is computed from the local image gradients. Rendering synthetic views requires reconstructing the image signal, and interpolation artifacts severely affect SIFT performance [Lourenco12].
  • Slide 8
  • Implicit cylindrical rectification, cylSIFT detector: Rendering the cylinder before applying SIFT adds extra computational time (rectification > 2 sec in Matlab) and interpolation artifacts. We avoid the reconstruction artifacts by using an adaptive Gaussian filter [Lourenco12].
  • Slide 9
  • Standard vs Adaptive Gaussian smoothing: Inherent properties of the standard Gaussian filter: space-invariant filtering; the convolution can be decoupled in the X and Y directions. Simplification of the adaptive filter. Advantages of the simplified adaptive filter: isotropic filter that can be decoupled for each image radius; a filter bank can be computed offline and loaded into memory.
  • Slide 10
  • Implicit cylindrical rectification, cylSIFT description: Non-linear distortion modifies the local structures in the image and, as a consequence, the gradients are affected. Changes in the local image gradients deteriorate SIFT descriptor performance. Proposed solution: compute the gradients in the omnidirectional image and implicitly correct them using the Jacobian matrix of the cylindrical mapping function.
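The gradient correction rests on the chain rule: if a rectified image is defined as I_c(u) = I(f(u)), where f maps rectified coordinates to omnidirectional image coordinates, then the rectified gradient is J_f(u)^T times the gradient measured in the omnidirectional image, so no rectified image needs to be rendered. The sketch below illustrates this with a plain polar unwrapping x = cx + r cos(theta), y = cy + r sin(theta) as a stand-in mapping. This is an assumption made for illustration: the slides do not give the actual cylindrical mapping of the para-catadioptric model, and the function names are hypothetical.

```python
import math


def polar_map(theta, r, cx=0.0, cy=0.0):
    """Stand-in mapping f: unwrapped coords (theta, r) -> image coords (x, y)."""
    return cx + r * math.cos(theta), cy + r * math.sin(theta)


def polar_jacobian(theta, r):
    """Jacobian of polar_map: rows are (x, y), columns are (theta, r)."""
    return [[-r * math.sin(theta), math.cos(theta)],
            [r * math.cos(theta), math.sin(theta)]]


def corrected_gradient(grad_xy, theta, r):
    """Gradient in unwrapped coords, computed as J^T times the image gradient."""
    jac = polar_jacobian(theta, r)
    gx, gy = grad_xy
    d_theta = jac[0][0] * gx + jac[1][0] * gy
    d_r = jac[0][1] * gx + jac[1][1] * gy
    return d_theta, d_r


# Synthetic "image" with a known gradient, used only to check the correction.
def intensity(x, y):
    return x * x + 3.0 * y


theta0, r0 = 0.7, 2.0
x0, y0 = polar_map(theta0, r0)
g_theta, g_r = corrected_gradient((2.0 * x0, 3.0), theta0, r0)
```

The corrected gradient can be compared against finite differences of the intensity sampled along the unwrapped coordinates, which is what an explicit rectification would approximate through interpolation.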
  • Slide 11
  • Detection and Matching evaluation: cylSIFT completely avoids interpolation of the image signal, giving better repeatability and matching with less computational burden. cylSIFT has similar performance to the VCP approach. The VCP requires a priori knowledge of the view to render in order to minimize viewpoint changes between the query and the VCP image. (Figure panels: Set A, Set B, Set C, Set D.)
  • Slide 12
  • Standard Image-based Indoor Localization (3/3): How can a robot use a standard camera to perform indoor localization? Comparison of two image searching schemes: standard Bags of Visual Words (BoV) and Geometry-preserving Visual Phrases (GVP) [Zhang11]. (Figure: query image.)
  • Slide 13
  • BoV: Standard Bags of Visual Words: Images are represented as a histogram of visual words whose length is the dictionary size. Drawback: the spatial relation between words is discarded. Spatial layout can be relevant for disambiguating situations of perceptual aliasing.
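A small sketch of the drawback, assuming features are given as hypothetical `(word_id, x, y)` tuples: the histogram keeps only the word counts, so two images with the same words in different layouts become indistinguishable.

```python
from collections import Counter


def bov_histogram(features, dictionary_size):
    """Normalized visual-word histogram; feature positions are ignored."""
    counts = Counter(word for word, _x, _y in features)
    total = len(features)
    if total == 0:
        return [0.0] * dictionary_size
    return [counts[w] / total for w in range(dictionary_size)]


# Same two words, mirrored layout: word 0 left of word 1, and the reverse.
image_1 = [(0, 10.0, 5.0), (1, 40.0, 5.0)]
image_2 = [(0, 40.0, 5.0), (1, 10.0, 5.0)]

hist_1 = bov_histogram(image_1, dictionary_size=4)
hist_2 = bov_histogram(image_2, dictionary_size=4)
same_histogram = hist_1 == hist_2
```

Both images map to the same vector, which is why a plain BoV score cannot separate visually similar places that differ only in spatial arrangement.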
  • Slide 14
  • GVP: Encoding weak geometric constraints [Zhang11]: A group of visual words in a certain layout forms a visual phrase. GVP incorporates geometric constraints at the searching step. (Figure: visual words A and B in two images and their votes in the offset space.)
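The offset-space idea can be sketched as follows: every pair of matching words votes for the quantized translation between its two positions, and words that keep their relative layout land in the same offset cell. A cell with n votes is counted here as C(n, k) co-occurring phrases of length k. This is a simplified illustration that omits the weighting and normalization of the method in [Zhang11]; the cell size, the tuple layout `(word_id, x, y)` and the function name are assumptions.

```python
import math
from collections import Counter


def gvp_score(query_feats, db_feats, cell=1.0, phrase_len=2):
    """Count co-occurring visual phrases via voting in the offset space."""
    votes = Counter()
    for wq, xq, yq in query_feats:
        for wd, xd, yd in db_feats:
            if wq == wd:
                # Quantized translation from the query to the database image.
                offset = (math.floor((xd - xq) / cell),
                          math.floor((yd - yq) / cell))
                votes[offset] += 1
    # n words sharing one offset form C(n, phrase_len) phrases.
    return sum(math.comb(n, phrase_len) for n in votes.values())


query_image = [("A", 0.0, 0.0), ("B", 2.0, 0.0)]
same_layout = [("A", 3.0, 1.0), ("B", 5.0, 1.0)]   # A still left of B
swapped_layout = [("A", 5.0, 1.0), ("B", 3.0, 1.0)]  # A and B exchanged

score_same = gvp_score(query_image, same_layout)
score_swapped = gvp_score(query_image, swapped_layout)
```

Both database images contain the same words, so a BoV comparison would score them equally, while the offset voting rewards only the image that preserves the layout.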
  • Slide 15
  • Indoor Location Recognition, Experimental Setup: Our database covers 2 teaching buildings of our campus with 118 para-catadioptric images. 451 perspective images are used to query the database. The environment suffers from high perceptual aliasing. (Figure: query image.)
  • Slide 16
  • Retrieval Results: cylSIFT takes full advantage of its matching capabilities: avoiding interpolation ensures more distinctive descriptors, giving a 10% improvement in localization success when compared with a state-of-the-art approach (Cylinder + GVP) [Chen11]. Indoor recognition systems benefit from the use of GVP: robustness against perceptual aliasing (spatial layout matters). (Figure panels: BoV Top 1, GVP Top 1, Re-ranking Top 5.)
  • Slide 17
  • Take home messages: Interpolation artifacts affect image retrieval. cylSIFT offers better retrieval performance in hybrid imaging systems at a marginal computational cost when compared to the standard SIFT algorithm. cylSIFT can be useful for applications other than localization, e.g. hybrid fundamental matrix estimation [Puig08]. First work that uses a hybrid imaging system for image retrieval without the need for rectification.
  • Slide 18
  • Thanks for coming. Questions? Code and dataset releases: http://arthronav.isr.uc.pt/~mlourenco/OmniSearch/