Autonomous Mobile Robots
Autonomous Systems Lab, Zürich
Perception
Sensors
Vision
Uncertainties, Line extraction from laser scans
"Position"
Global Map
Perception Motion Control
Cognition
Real WorldEnvironment
Localization
PathEnvironment Model
Local Map
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Lecture 7 - Perception
Today's lecture
From object recognition to scene/place recognition (Section 4.6)
Omnidirectional cameras (today's exercise) (Section 4.2.4)
Uncertainties: Section 4.7
Representation + Propagation
Line extraction from laser scans
Object Recognition
Q: Is this Book present in the Scene?
Extract keypoints in both images
Look for corresponding matches
Most of the Book's keypoints are present in the Scene
A: The Book is present in the Scene
Taking this a step further…
Find an object in an image
Find an object in multiple images
Find multiple objects in multiple images
Bag of Words
Extension to scene/place recognition:
Is this image in my database?
Robot: Have I been to this place before? The 'loop closure' problem, the 'kidnapped robot' problem
Use analogies from text retrieval:
Visual Words
Vocabulary of Visual Words
“Bag of Words” (BOW) approach
Building the Visual Vocabulary
Pipeline: Image Collection → Extract Features → Cluster Descriptors (in descriptor space) → examples of Visual Words
(Images for this slide are courtesy of Mark Cummins)
Video Google: one image a thousand words?
“Video Google” [J.Sivic and A. Zisserman, ICCV 2003]
Demo:
These features map to the same visual word
Image courtesy of Andrew Zisserman
Efficient Place/Object Recognition
Same principle: we can describe a scene as a collection of words and look up images with a similar collection of words in the database
What if we need to find an object/scene in a database of millions of images?
Build Vocabulary Tree via hierarchical clustering
Use the Inverted File system: each node in the tree is associated with a list of images containing an instance of this node
[D. Nister and H. Stewénius, CVPR 2006]
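Below is a minimal sketch of this idea, using a flat k-means vocabulary plus an inverted file instead of the hierarchical vocabulary tree of Nister and Stewénius; the random "descriptors", the vocabulary size and the voting score are illustrative assumptions, not the lecture's implementation.

```python
# A minimal sketch: flat visual vocabulary (k-means) + inverted file for lookup.
import numpy as np
from collections import defaultdict
from scipy.cluster.vq import kmeans2, vq

def build_vocabulary(all_descriptors, k=50):
    """Cluster descriptors from the model images into k visual words."""
    centroids, _ = kmeans2(all_descriptors.astype(float), k, minit='++')
    return centroids

def build_inverted_file(image_descriptors, vocabulary):
    """Map each visual word to the set of image ids in which it occurs."""
    inverted = defaultdict(set)
    for image_id, desc in enumerate(image_descriptors):
        words, _ = vq(desc.astype(float), vocabulary)
        for w in words:
            inverted[int(w)].add(image_id)
    return inverted

def query(desc, vocabulary, inverted):
    """Rank database images by the number of shared visual words."""
    words, _ = vq(desc.astype(float), vocabulary)
    votes = defaultdict(int)
    for w in set(int(w) for w in words):
        for image_id in inverted[w]:
            votes[image_id] += 1
    return sorted(votes.items(), key=lambda kv: -kv[1])

# Toy usage with random arrays standing in for SIFT-like descriptors
rng = np.random.default_rng(0)
db = [rng.normal(size=(200, 32)) for _ in range(5)]      # 5 model images
vocab = build_vocabulary(np.vstack(db), k=50)
inv = build_inverted_file(db, vocab)
print(query(db[2][:50], vocab, inv)[:3])                  # image 2 should rank high
```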
Building and using the vocabulary tree (slides based on D. Nister's slides):
Extract features from the model images (points in descriptor space)
Vocabulary Tree: hierarchical partitioning of the descriptor space
Populate the Vocabulary Tree with the features of the model images
Look up a test image against the model images
Robust object/scene recognition
Visual Vocabulary holds appearance information but discards the spatial relationships between features
Two images with the same features shuffled around in the image will be a 100% match according to this technique.
Usually appearance alone is enough; however, if different arrangements of the same features are expected, one might use geometric verification:
Test the k most similar images to the query image for geometric consistency
(e.g. using RANSAC)
Image Histograms
Aim: achieve compact representation of each scene
Smooth the input image (using a Gaussian operator)
Image representation = a set of 1D histograms describing the R,G,B components of the current image and its {Hue, Saturation, Luminance}
Image Similarity Criterion: divergence of the histograms between the query image and the database images (see the sketch below)
An earlier approach to place recognition:
Make use of the entire image (instead of salient keypoints as in BoW approaches)
To maximize the camera's field of view → use omni-directional cameras
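A minimal sketch of the histogram-based similarity described above; the lecture does not fix a specific divergence here, so a symmetric KL (Jeffrey) divergence between normalized per-channel histograms is assumed for illustration, and random images stand in for real (smoothed) camera views.

```python
# A minimal sketch: per-channel RGB histograms + a symmetric KL divergence.
import numpy as np

def channel_histograms(image, bins=32):
    """One normalized 1D histogram per colour channel (R, G, B)."""
    hists = []
    for c in range(3):
        h, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        hists.append(h / max(h.sum(), 1))
    return np.stack(hists)

def jeffrey_divergence(h1, h2, eps=1e-9):
    """Symmetric Kullback-Leibler divergence between two histogram sets."""
    p, q = h1 + eps, h2 + eps
    return float(np.sum(p * np.log(p / q) + q * np.log(q / p)))

def most_similar(query_img, database_imgs):
    """Index of the database image with the smallest divergence to the query."""
    hq = channel_histograms(query_img)
    scores = [jeffrey_divergence(hq, channel_histograms(im)) for im in database_imgs]
    return int(np.argmin(scores))

# Toy usage with random images standing in for (smoothed) omnidirectional views
rng = np.random.default_rng(1)
db = [rng.integers(0, 256, size=(64, 64, 3)) for _ in range(4)]
print(most_similar(db[2], db))   # -> 2
```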
Short intro to Omni-directional Vision
Omni-directional Cameras
Wide-FOV dioptric camera (e.g. fisheye): ~180° FOV
Catadioptric camera (camera + shaped mirror): >180° FOV
Polydioptric camera (multiple cameras with overlapping FOV): ~360° FOV
Omni-cam: example image
Omni-directional Vision: Why?
Very popular in the robotics community
Omni-vision offers:
Tracking of landmarks in all directions
Tracking of landmarks for longer periods of time → better accuracy in motion estimation and map building
Google Street View uses omni-directional snapshots to build photo-realistic 3D maps
Central Omnidirectional Cameras
A vision system is central when all optical rays intersect at a single point in 3D: the "projection center" or "single effective viewpoint"
A perspective camera is a central projection system: its single effective viewpoint is the camera's optical center
[Figures: perspective camera (image plane / CCD), central catadioptric camera (camera + mirror), non-central catadioptric camera (camera + mirror)]
Central catadioptric cameras can be built by an appropriate choice of:
the mirror shape, and
the distance between camera and mirror
Why is a single effective viewpoint so desirable?
Every pixel in the captured image measures the irradiance of light in one particular direction
Allows the generation of geometrically correct perspective images
Today's Exercise 3
How to Build a Range Finder using an Omnidirectional Camera
Guidance notes on: www.asl.ethz.ch/education/master/mobile_robotics
Uncertainties: Representation and Propagation
& Line Extraction from Range data
Uncertainty Representation
Section 4.1.3 of the book
Sensing in the real world is always uncertain
How can uncertainty be represented or quantified?
How does uncertainty propagate?
When fusing uncertain inputs into a system, what is the resulting uncertainty?
What is the merit of all this for mobile robotics?
Uncertainty Representation (2)
Use a Probability Density Function (PDF) to characterize the statistical properties of a variable X:
Expected value of variable X: $E[X] = \mu = \int_{-\infty}^{\infty} x\, f(x)\, dx$; Variance of variable X: $\mathrm{Var}[X] = \sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\, dx$
Gaussian Distribution
Probability mass within ±1σ: 68.26%, within ±2σ: 95.44%, within ±3σ: 99.72%
Most common PDF for characterizing uncertainties: the Gaussian, $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
The Error Propagation Law
Imagine extracting a line based on point measurements with uncertainties.
Model parameters in polar coordinates: $(r, \alpha)$ uniquely identifies a line
The question:
What is the uncertainty of the extracted line knowing the uncertainties of the measurement points that contribute to it ?
The Error Propagation Law
Error propagation in a multiple-input, multiple-output system with n inputs and m outputs.
[System diagram: inputs $X_1, \ldots, X_i, \ldots, X_n$ → System → outputs $Y_1, \ldots, Y_i, \ldots, Y_m$]
$Y_j = f_j(X_1, \ldots, X_n)$
The Error Propagation Law
1D case of a nonlinear error propagation problem
It can be shown that the output covariance matrix $C_Y$ is given by the error propagation law: $C_Y = F_X \, C_X \, F_X^T$
where
CX: covariance matrix representing the input uncertainties
CY: covariance matrix representing the propagated uncertainties for the outputs.
$F_X$ is the Jacobian matrix, defined as $F_X = \nabla_X f = \left[\dfrac{\partial f_i}{\partial X_j}\right]_{i=1..m,\; j=1..n}$,
which is the transpose of the gradient of f(X).
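A minimal numerical sketch of the law $C_Y = F_X C_X F_X^T$ for a toy system: converting a range-bearing measurement $(\rho, \theta)$ to Cartesian coordinates $(x, y)$. The numerical values are illustrative, not from the lecture.

```python
# A minimal sketch of first-order error propagation C_Y = F_X C_X F_X^T.
import numpy as np

def propagate(f_jacobian, C_X):
    """Linearized propagation of the input covariance through a system."""
    return f_jacobian @ C_X @ f_jacobian.T

rho, theta = 2.0, np.deg2rad(30.0)            # measured range and bearing
C_X = np.diag([0.02**2, np.deg2rad(1.0)**2])  # input uncertainties (sigma_rho, sigma_theta)

# Jacobian of f(rho, theta) = (rho*cos(theta), rho*sin(theta))
F_X = np.array([[np.cos(theta), -rho * np.sin(theta)],
                [np.sin(theta),  rho * np.cos(theta)]])

C_Y = propagate(F_X, C_X)
print(C_Y)        # 2x2 covariance of the Cartesian point (x, y)
```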
Example: line extraction from laser scans
Line Extraction (1)
Point-line distance: $d_i = \rho_i \cos(\theta_i - \alpha) - r$ for a measurement point $x_i = (\rho_i, \theta_i)$
If each measurement is equally uncertain, then the sum of squared errors is $S = \sum_i d_i^2 = \sum_i \left(\rho_i \cos(\theta_i - \alpha) - r\right)^2$
Goal: minimize S when selecting $(r, \alpha)$ → solve the system $\frac{\partial S}{\partial \alpha} = 0$, $\frac{\partial S}{\partial r} = 0$
"Unweighted Least Squares"
Line Extraction (2)
Point-line distance: $d_i = \rho_i \cos(\theta_i - \alpha) - r$
Each sensor measurement may have its own, unique uncertainty → Weighted Least Squares: $S = \sum_i w_i \, d_i^2$
Line Extraction (2)
Weighted least squares: solving the system $\frac{\partial S}{\partial \alpha} = 0$, $\frac{\partial S}{\partial r} = 0$ gives the line parameters $(\alpha, r)$ in closed form (see the sketch below).
If $\rho_i \sim N(\hat{\rho}_i, \sigma_{\rho_i}^2)$ and $\theta_i \sim N(\hat{\theta}_i, \sigma_{\theta_i}^2)$ for $i = 1, \ldots, N$, what is the uncertainty in the line $(r, \alpha)$?
The uncertainty $\sigma_i$ of each measurement is proportional to the measured distance $\rho_i$.
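A minimal sketch of a weighted least-squares line fit. It uses a standard Cartesian closed form for $\partial S/\partial\alpha = \partial S/\partial r = 0$ (equivalent in intent to the lecture's polar-form expressions, which are not reproduced here); the weights $w_i$ are assumed to be given, e.g. $w_i = 1/\sigma_i^2$, and the toy data are illustrative.

```python
# A minimal sketch of weighted least-squares fitting of the line model
# x*cos(alpha) + y*sin(alpha) = r from polar range measurements (rho_i, theta_i).
import numpy as np

def fit_line_polar(rho, theta, w=None):
    """Return (alpha, r) minimizing sum_i w_i * (rho_i*cos(theta_i - alpha) - r)^2."""
    rho, theta = np.asarray(rho, float), np.asarray(theta, float)
    w = np.ones_like(rho) if w is None else np.asarray(w, float)
    x, y = rho * np.cos(theta), rho * np.sin(theta)
    xm, ym = np.average(x, weights=w), np.average(y, weights=w)   # weighted centroid
    u, v = x - xm, y - ym
    alpha = 0.5 * np.arctan2(-2.0 * np.sum(w * u * v),
                             np.sum(w * (v**2 - u**2)))
    r = xm * np.cos(alpha) + ym * np.sin(alpha)
    if r < 0:                       # keep r >= 0 by flipping the normal direction
        r, alpha = -r, alpha + np.pi
    return alpha, r

# Toy usage: noisy points on the line x*cos(0.5) + y*sin(0.5) = 1
rng = np.random.default_rng(2)
t = np.linspace(-1, 1, 50)
pts = np.array([np.cos(0.5), np.sin(0.5)]) + np.outer(t, [-np.sin(0.5), np.cos(0.5)])
pts += 0.01 * rng.normal(size=pts.shape)
rho = np.hypot(pts[:, 0], pts[:, 1]); theta = np.arctan2(pts[:, 1], pts[:, 0])
print(fit_line_polar(rho, theta))   # approximately (0.5, 1.0)
```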
Error Propagation: Line extraction
The uncertainty of each measurement $x_i = (\rho_i, \theta_i)$ is described by the covariance matrix (assuming that $\rho_i$ and $\theta_i$ are independent):
$C_{x_i} = \begin{bmatrix} \sigma_{\rho_i}^2 & 0 \\ 0 & \sigma_{\theta_i}^2 \end{bmatrix}$
The uncertainty in the line $(\alpha, r)$ is described by the covariance matrix:
$C_{\alpha r} = \begin{bmatrix} \sigma_{\alpha}^2 & \sigma_{\alpha r} \\ \sigma_{\alpha r} & \sigma_{r}^2 \end{bmatrix}$
Define the stacked input covariance over all N measurements:
$C_X = \mathrm{diag}\left(\sigma_{\rho_1}^2, \ldots, \sigma_{\rho_N}^2, \sigma_{\theta_1}^2, \ldots, \sigma_{\theta_N}^2\right)$
Jacobian:
$F_{\rho\theta} = \begin{bmatrix} \dfrac{\partial \alpha}{\partial \rho_1} & \cdots & \dfrac{\partial \alpha}{\partial \rho_N} & \dfrac{\partial \alpha}{\partial \theta_1} & \cdots & \dfrac{\partial \alpha}{\partial \theta_N} \\ \dfrac{\partial r}{\partial \rho_1} & \cdots & \dfrac{\partial r}{\partial \rho_N} & \dfrac{\partial r}{\partial \theta_1} & \cdots & \dfrac{\partial r}{\partial \theta_N} \end{bmatrix}$
From the Error Propagation Law:
$C_{\alpha r} = F_{\rho\theta} \, C_X \, F_{\rho\theta}^T$
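A minimal sketch of this propagation. Instead of the analytic Jacobian $F_{\rho\theta}$, it approximates the derivatives numerically (central differences) around the measured scan; this is slower but illustrates the same $C_{\alpha r} = F\,C_X\,F^T$ computation, and the noise model in the usage comment is an illustrative assumption.

```python
# A minimal sketch: covariance of the fitted line (alpha, r) via a numerical Jacobian.
import numpy as np

def line_covariance(rho, theta, sigma_rho, sigma_theta, fit, eps=1e-6):
    """Propagate C_X = diag(sigma_rho_i^2, sigma_theta_i^2) through the line fit."""
    rho, theta = np.asarray(rho, float), np.asarray(theta, float)
    params = np.concatenate([rho, theta])          # stacked inputs X = (rho_1..N, theta_1..N)
    n = len(rho)

    def f(p):                                      # (alpha, r) as a function of all inputs
        return np.array(fit(p[:n], p[n:]))

    F = np.zeros((2, 2 * n))                       # Jacobian F_rho_theta (2 x 2N)
    for j in range(2 * n):
        dp = np.zeros_like(params); dp[j] = eps
        F[:, j] = (f(params + dp) - f(params - dp)) / (2 * eps)

    C_X = np.diag(np.concatenate([np.asarray(sigma_rho)**2, np.asarray(sigma_theta)**2]))
    return F @ C_X @ F.T                           # C_alpha_r = F C_X F^T

# Usage (with fit_line_polar from the previous sketch; range noise growing with rho):
# C_ar = line_covariance(rho, theta, 0.01 * rho, np.full_like(rho, 1e-3), fit_line_polar)
```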
Feature Extraction from Range Data:
Line extraction
Split and merge
Linear regression
RANSAC
Hough-Transform
Extracting Features from Range Data
[Figures: photograph of a corridor at ASL; raw 3D scan; extracted planes for every cube; plane segmentation result]
Extracting Features from Range Data
goal: extract planar features from a dense point cloud
Example: a scene showing part of a corridor of the lab, and the same scene represented as a dense point cloud generated by a rotating laser scanner (labelled: left wall, ceiling, right wall, floor, back wall)
Map of the ASL hallway built using line segments
Map of the ASL hallway built using orthogonal planes constructed from line segments
Features from Range Data: Motivation
Point cloud → extract lines / planes
Why features?
Raw data: huge amount of data to be stored
Compact features require less storage
Provide rich and accurate information
Basis for high level features (e.g. more abstract features, objects)
Here, we will study line segments
The simplest geometric structure
Suitable for most office-like environments
Line Extraction: The Problem
Extract lines from a Range scan (i.e. a point cloud)
Three main problems:
How many lines are there?
Segmentation: Which points belong to which line ?
Line Fitting/Extraction: Given points that belong to a line, how to estimate the line parameters ?
Algorithms we will see:
1. Split and merge
2. Linear regression
3. RANSAC
4. Hough-Transform
Algorithm 1: Split-and-Merge (standard)
The most popular algorithm; it originated in computer vision.
A recursive procedure of fitting and splitting.
A slightly different version, called Iterative-End-Point-Fit, simply connects the end points for line fitting.
Initialise set S to contain all points
Split
• Fit a line to points in current set S
• Find the most distant point to the line
• If distance > threshold split & repeat with left and right point sets
Merge
• If two consecutive segments are close/collinear enough, obtain the common line and find the most distant point
• If distance <= threshold, merge both segments
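A minimal sketch of the procedure above, in the Iterative-End-Point-Fit variant (the fitted "line" is simply the segment joining the first and last point of the current set); the thresholds and the toy scan are illustrative assumptions.

```python
# A minimal sketch of Split-and-Merge (Iterative-End-Point-Fit variant).
import numpy as np

def point_line_distance(pts, a, b):
    """Perpendicular distance of each point in pts to the line through a and b."""
    d = b - a
    n = np.array([-d[1], d[0]]) / (np.linalg.norm(d) + 1e-12)   # unit normal
    return np.abs((pts - a) @ n)

def split(pts, threshold):
    """Recursively split the ordered point set; return a list of point subsets."""
    dist = point_line_distance(pts, pts[0], pts[-1])
    i = int(np.argmax(dist))
    if dist[i] > threshold and 0 < i < len(pts) - 1:
        return split(pts[:i + 1], threshold) + split(pts[i:], threshold)
    return [pts]

def merge(segments, threshold):
    """Merge consecutive segments that are nearly collinear (endpoint-fit test)."""
    out = [segments[0]]
    for seg in segments[1:]:
        candidate = np.vstack([out[-1], seg])
        if point_line_distance(candidate, candidate[0], candidate[-1]).max() <= threshold:
            out[-1] = candidate
        else:
            out.append(seg)
    return out

# Toy usage: an L-shaped scan (two walls) should yield two segments
wall1 = np.column_stack([np.linspace(0, 1, 20), np.zeros(20)])
wall2 = np.column_stack([np.ones(20), np.linspace(0, 1, 20)])
segments = merge(split(np.vstack([wall1, wall2]), threshold=0.02), threshold=0.02)
print(len(segments))   # -> 2
```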
[Example figures: Split-and-Merge (Iterative-End-Point-Fit) applied to a range scan]
Algorithm 2: Line-Regression
Uses a “sliding window” of size Nf
The points within each “sliding window” are fitted by a segment
Then adjacent segments are merged if their line parameters are close
Nf = 3
Line-Regression
• Initialize sliding window size Nf
• Fit a line to every Nf consecutive points (i.e. in each window)
• Compute a Line Fidelity Array: each element contains the sum of Mahalanobis distances between 3 consecutive windows (Mahalanobis distance used as a measure of similarity)
• Scan Fidelity array for consecutive elements < threshold (using a clustering algorithm).
• For every Fidelity Array element < Threshold, construct a new line segment
• Merge overlapping line segments + recompute line parameters for each segment
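A simplified sketch of the line-regression idea above. It replaces the Mahalanobis-based fidelity measure with a plain distance between the $(\alpha, r)$ parameters of consecutive window fits, so it is only an approximation of the algorithm; the window size and threshold are illustrative.

```python
# A simplified sketch of Line-Regression with a sliding window of size Nf.
import numpy as np

def fit_alpha_r(pts):
    """Total-least-squares line fit x*cos(a) + y*sin(a) = r to a point set."""
    c = pts.mean(axis=0)
    u = pts - c
    a = 0.5 * np.arctan2(-2 * np.sum(u[:, 0] * u[:, 1]),
                         np.sum(u[:, 1]**2 - u[:, 0]**2))
    return a, c[0] * np.cos(a) + c[1] * np.sin(a)

def line_regression(points, nf=3, thresh=0.05):
    """Fit lines to sliding windows, then group consecutive similar windows."""
    fits = [fit_alpha_r(points[i:i + nf]) for i in range(len(points) - nf + 1)]
    segments, start = [], 0
    for i in range(1, len(fits)):
        da = abs(np.arctan2(np.sin(fits[i][0] - fits[i - 1][0]),
                            np.cos(fits[i][0] - fits[i - 1][0])))
        dr = abs(fits[i][1] - fits[i - 1][1])
        if da + dr > thresh:                       # window i starts a new segment
            segments.append(points[start:i - 1 + nf])
            start = i
    segments.append(points[start:])
    return [fit_alpha_r(seg) for seg in segments], segments

# Toy usage: two perpendicular walls -> two line segments
wall1 = np.column_stack([np.linspace(0, 1, 15), np.zeros(15)])
wall2 = np.column_stack([np.ones(15), np.linspace(0, 1, 15)])
params, segs = line_regression(np.vstack([wall1, wall2]), nf=3)
print(len(segs), [np.round(p, 2) for p in params])
```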
Algorithm 3: RANSAC
RANSAC = RANdom SAmple Consensus.
It is a generic and robust algorithm for fitting models in the presence of outliers (i.e. points that do not satisfy the model)
It is applicable to any problem where the goal is to identify the inliers that satisfy a predefined model.
Typical applications in robotics are: line extraction from 2D range data, plane extraction from 3D range data, feature matching, structure from motion, …
RANSAC is an iterative method and is non-deterministic: the probability of finding a set free of outliers increases as more iterations are used. Drawback: because it is non-deterministic, the results differ between runs.
Algorithm 3: RANSAC (for line extraction from 2D range data)
• Select a sample of 2 points at random
• Calculate the model parameters that fit the data in the sample
• Calculate the error function for each data point
• Select the data that support the current hypothesis
• Repeat the sampling
The set with the maximum number of inliers obtained within k iterations is selected.
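A minimal sketch of the steps above for line fitting on 2D range data (Cartesian points); the iteration count, inlier threshold and toy data are illustrative choices, not part of the lecture.

```python
# A minimal sketch of RANSAC line fitting on 2D points.
import numpy as np

def ransac_line(points, k=100, inlier_thresh=0.02, rng=None):
    """Return (best line as (a, b, c) with a*x + b*y + c = 0, inlier mask)."""
    rng = np.random.default_rng() if rng is None else rng
    best_inliers = np.zeros(len(points), dtype=bool)
    best_line = None
    for _ in range(k):
        i, j = rng.choice(len(points), size=2, replace=False)   # 1. random sample of 2 points
        p, q = points[i], points[j]
        d = q - p
        norm = np.hypot(d[0], d[1])
        if norm < 1e-12:
            continue
        a, b = -d[1] / norm, d[0] / norm                        # 2. model: unit line normal
        c = -(a * p[0] + b * p[1])
        err = np.abs(points @ np.array([a, b]) + c)             # 3. error for every point
        inliers = err < inlier_thresh                           # 4. support of the hypothesis
        if inliers.sum() > best_inliers.sum():                  # keep the best set so far
            best_inliers, best_line = inliers, (a, b, c)
    return best_line, best_inliers                              # 5. best model after k samples

# Toy usage: points on y = 0.5*x + 1 plus gross outliers
rng = np.random.default_rng(3)
x = np.linspace(0, 2, 80)
inline = np.column_stack([x, 0.5 * x + 1 + 0.005 * rng.normal(size=x.size)])
outliers = rng.uniform(0, 3, size=(40, 2))
line, mask = ransac_line(np.vstack([inline, outliers]), k=200, rng=rng)
print(mask.sum(), np.round(line, 3))
```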
How many iterations does RANSAC need?
We cannot know in advance whether the observed set contains the maximum number of inliers; ideally, we would need to check all possible combinations of 2 points in a dataset of N points.
The number of all pairwise combinations is N(N−1)/2, which is computationally infeasible if N is too large. Example: for a laser scan of 360 points we would need to check 360·359/2 = 64,620 possibilities!
Do we really need to check all possibilities, or can we stop RANSAC after k iterations? Checking a subset of combinations is enough if we have a rough estimate of the percentage of inliers in our dataset.
This can be done in a probabilistic way
How many iterations does RANSAC need?
w = number of inliers / N
where N : tot. no. data points
w : fraction of inliers in the dataset = probability of selecting an inlier-point
Let p : probability of finding a set of points free of outliers
Assumption: the 2 points necessary to estimate a line are selected independently
w² = prob. that both points are inliers
1 − w² = prob. that at least one of these two points is an outlier
Let k : no. of RANSAC iterations executed so far
(1 − w²)^k = prob. that RANSAC never selects two points that are both inliers
1 − p = (1 − w²)^k, and therefore: $k = \frac{\log(1-p)}{\log(1-w^2)}$
The number of iterations is $k = \frac{\log(1-p)}{\log(1-w^2)}$: knowing the fraction of inliers w, after k RANSAC iterations we will have a probability p of finding a set of points free of outliers.
Example: if we want a probability of success p = 99% and we know that w = 50%, then k ≈ 16 iterations, which is far fewer than the number of all possible combinations!
In practice we need only a rough estimate of w. More advanced implementations of RANSAC estimate the fraction of inliers and adaptively update it on every iteration.
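A quick numerical check of the iteration formula, using the values from the example above:

```python
# Verify k = log(1-p) / log(1-w^2) for the example p = 0.99, w = 0.5.
import math

def ransac_iterations(p, w):
    """Iterations needed for probability p of an outlier-free sample (inlier fraction w)."""
    return math.log(1 - p) / math.log(1 - w**2)

print(round(ransac_iterations(0.99, 0.50), 1))   # ~16 iterations, as in the example
```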
Algorithm 4: Hough-Transform
Hough Transform uses a voting scheme
A line y = m₀x + b₀ in image space corresponds to a single point (m₀, b₀) in Hough (parameter) space
What does a point (x₀, y₀) in image space map to in Hough space?
All lines through (x₀, y₀) satisfy y₀ = m·x₀ + b, so the point maps to the line b = −x₀·m + y₀ in Hough space
Where is the line that contains both (x0, y0) and (x1, y1)?
It is the intersection of the lines b = –x0m + y0 and b = –x1m + y1
Each point in image space votes for line parameters in Hough parameter space
Problems with the (m,b) space:
Unbounded parameter domain
Vertical lines require infinite m
Alternative: polar representation
Each point in image space will map to a sinusoid in the (θ,ρ) parameter space
1. Initialize accumulator H to all zeros
2. for each edge point (x,y) in the image
for all θ in [0,180]
• Compute ρ = x cos θ + y sin θ
• H(θ, ρ) = H(θ, ρ) + 1
end
end
3. Find the values of (θ, ρ) where H(θ, ρ) is a local maximum
4. The detected line in the image is given by ρ = x cos θ + y sin θ
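A minimal sketch of the accumulator loop above, vectorized over θ for each point; the bin counts and the toy data are illustrative choices.

```python
# A minimal sketch of the Hough accumulator for rho = x*cos(theta) + y*sin(theta).
import numpy as np

def hough_lines(points, n_theta=180, n_rho=200):
    """Vote in (theta, rho) space and return the accumulator plus the bin axes."""
    thetas = np.deg2rad(np.arange(0, 180, 180 / n_theta))
    rho_max = np.hypot(*np.abs(points).max(axis=0)) + 1e-9
    rhos = np.linspace(-rho_max, rho_max, n_rho)
    H = np.zeros((n_theta, n_rho), dtype=int)
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)          # one sinusoid per point
        bins = np.digitize(rho, rhos) - 1
        H[np.arange(n_theta), np.clip(bins, 0, n_rho - 1)] += 1
    return H, thetas, rhos

# Toy usage: points on the line x*cos(60 deg) + y*sin(60 deg) = 1
t = np.linspace(-1, 1, 100)
theta0 = np.deg2rad(60)
pts = np.array([np.cos(theta0), np.sin(theta0)]) + np.outer(t, [-np.sin(theta0), np.cos(theta0)])
H, thetas, rhos = hough_lines(pts)
i, j = np.unravel_index(np.argmax(H), H.shape)
print(np.rad2deg(thetas[i]), rhos[j])          # ~60 degrees, ~1.0
```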
[Figure: Hough Transform, effect of noise]
[Figure: Hough Transform application, lane detection]
Example – Door detection using Hough Transform
[Figures: door image and its Hough Transform accumulator in (θ, ρ) space]
Comparison of Line Extraction Algorithms
Algorithm                 | Complexity         | Speed (Hz) | False positives | Precision
Split-and-Merge           | N log N            | 1500       | 10%             | +++
Incremental               | S·N                | 600        | 6%              | +++
Line-Regression           | N·Nf               | 400        | 10%             | +++
RANSAC                    | S·N·k              | 30         | 30%             | ++++
Hough-Transform           | S·N·NC + S·NR·NC   | 10         | 30%             | ++++
Expectation Maximization  | S·N1·N2·N          | 1          | 50%             | ++++
Split-and-merge, Incremental and Line-Regression: fastest
Deterministic, and they make use of the sequential ordering of the raw scan points (points captured according to the rotation direction of the laser beam)
If applied to randomly captured points, only the last three algorithms would segment all the lines
RANSAC, Hough-Transform and Expectation Maximization: produce greater precision and are more robust to outliers