Autonomous Mobile Robots
Autonomous Systems Lab, Zürich
Perception
Sensors
Vision
Uncertainties, Line extraction from laser scans
"Position"
Global Map
Perception Motion Control
Cognition
Real WorldEnvironment
Localization
PathEnvironment Model
Local Map
© R. Siegwart, D. Scaramuzza and M. Chli, ETH Zurich - ASL
Lecture 7 - Perception
Today's lecture
From object recognition to scene/place recognition (Section 4.6)
Omnidirectional cameras (today's exercise) (Section 4.2.4)
Uncertainties: Section 4.7
Representation + Propagation
Line extraction from laser scans
Object Recognition
Q: Is this Book present in the Scene?
Extract keypoints in both images
Look for corresponding matches
Most of the Book's keypoints are present in the Scene
A: The Book is present in the Scene
Taking this a step further…
Find an object in an image
Find an object in multiple images
Find multiple objects in multiple images
Bag of Words
Extension to scene/place recognition:
Is this image in my database?
Robot: Have I been to this place before? The 'loop closure' problem, the 'kidnapped robot' problem
Use analogies from text retrieval:
Visual Words
Vocabulary of Visual Words
“Bag of Words” (BOW) approach
Building the Visual Vocabulary
Pipeline: Image Collection → Extract Features → Cluster Descriptors (in descriptor space) → examples of Visual Words
(Images for this slide are courtesy of Mark Cummins)
Video Google: one image a thousand words?
“Video Google” [J.Sivic and A. Zisserman, ICCV 2003]
Demo:
These features map to the same visual word
Image courtesy of Andrew Zisserman
Efficient Place/Object Recognition
Same principle: we can describe a scene as a collection of words and look up images with a similar collection of words in the database
What if we need to find an object/scene in a database of millions of images?
Build Vocabulary Tree via hierarchical clustering
Use the Inverted File system: each node in the tree is associated with a list of images containing an instance of this node
[D. Nister and H. Stewénius, CVPR 2006]
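Below is a minimal sketch of this idea, using a flat k-means vocabulary plus an inverted file instead of the hierarchical vocabulary tree of Nister and Stewénius; the random "descriptors", the vocabulary size and the voting score are illustrative assumptions, not the lecture's implementation.

```python
# A minimal sketch: flat visual vocabulary (k-means) + inverted file for lookup.
import numpy as np
from collections import defaultdict
from scipy.cluster.vq import kmeans2, vq

def build_vocabulary(all_descriptors, k=50):
    """Cluster descriptors from the model images into k visual words."""
    centroids, _ = kmeans2(all_descriptors.astype(float), k, minit='++')
    return centroids

def build_inverted_file(image_descriptors, vocabulary):
    """Map each visual word to the set of image ids in which it occurs."""
    inverted = defaultdict(set)
    for image_id, desc in enumerate(image_descriptors):
        words, _ = vq(desc.astype(float), vocabulary)
        for w in words:
            inverted[int(w)].add(image_id)
    return inverted

def query(desc, vocabulary, inverted):
    """Rank database images by the number of shared visual words."""
    words, _ = vq(desc.astype(float), vocabulary)
    votes = defaultdict(int)
    for w in set(int(w) for w in words):
        for image_id in inverted[w]:
            votes[image_id] += 1
    return sorted(votes.items(), key=lambda kv: -kv[1])

# Toy usage with random arrays standing in for SIFT-like descriptors
rng = np.random.default_rng(0)
db = [rng.normal(size=(200, 32)) for _ in range(5)]      # 5 model images
vocab = build_vocabulary(np.vstack(db), k=50)
inv = build_inverted_file(db, vocab)
print(query(db[2][:50], vocab, inv)[:3])                  # image 2 should rank high
```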
Building and using the vocabulary tree (slides based on D. Nister's slides):
Extract features from the model images (points in descriptor space)
Vocabulary Tree: hierarchical partitioning of the descriptor space
Populate the Vocabulary Tree with the features of the model images
Look up a test image against the model images
Robust object/scene recognition
Visual Vocabulary holds appearance information but discards the spatial relationships between features
Two images with the same features shuffled around in the image will be a 100% match according to this technique.
Usually appearance alone is enough; however, if different arrangements of the same features are expected, one might use geometric verification:
Test the k most similar images to the query image for geometric consistency
(e.g. using RANSAC)
Image Histograms
Aim: achieve compact representation of each scene
Smooth the input image (using a Gaussian operator)
Image representation = a set of 1D histograms describing the R,G,B components of the current image and its {Hue, Saturation, Luminance}
Image Similarity Criterion: divergence of the histograms between the query image and the database images (see the sketch below)
An earlier approach to place recognition:
Make use of the entire image (instead of salient keypoints as in BoW approaches)
To maximize the camera's field of view → use omni-directional cameras
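A minimal sketch of the histogram-based similarity described above; the lecture does not fix a specific divergence here, so a symmetric KL (Jeffrey) divergence between normalized per-channel histograms is assumed for illustration, and random images stand in for real (smoothed) camera views.

```python
# A minimal sketch: per-channel RGB histograms + a symmetric KL divergence.
import numpy as np

def channel_histograms(image, bins=32):
    """One normalized 1D histogram per colour channel (R, G, B)."""
    hists = []
    for c in range(3):
        h, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        hists.append(h / max(h.sum(), 1))
    return np.stack(hists)

def jeffrey_divergence(h1, h2, eps=1e-9):
    """Symmetric Kullback-Leibler divergence between two histogram sets."""
    p, q = h1 + eps, h2 + eps
    return float(np.sum(p * np.log(p / q) + q * np.log(q / p)))

def most_similar(query_img, database_imgs):
    """Index of the database image with the smallest divergence to the query."""
    hq = channel_histograms(query_img)
    scores = [jeffrey_divergence(hq, channel_histograms(im)) for im in database_imgs]
    return int(np.argmin(scores))

# Toy usage with random images standing in for (smoothed) omnidirectional views
rng = np.random.default_rng(1)
db = [rng.integers(0, 256, size=(64, 64, 3)) for _ in range(4)]
print(most_similar(db[2], db))   # -> 2
```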
Short intro to Omni-directional Vision
Omni-directional Cameras
Wide-FOV dioptric camera (e.g. fisheye): ~180° FOV
Catadioptric camera (camera + shaped mirror): >180° FOV
Polydioptric camera (multiple cameras with overlapping FOV): ~360° FOV
Omni-cam: example image
Omni-directional Vision: Why?
Very popular in the robotics community
Omni-vision offers:
Tracking of landmarks in all directions
Tracking of landmarks for longer periods of time → better accuracy in motion estimation and map building
Google Street View uses omni-directional snapshots to build photo-realistic 3D maps
Central Omnidirectional Cameras
A vision system is central when all optical rays intersect at a single point in 3D: the "projection center" or "single effective viewpoint"
A perspective camera is a central projection system: its single effective viewpoint is the camera's optical center
[Figures: perspective camera (image plane / CCD), central catadioptric camera (camera + mirror), non-central catadioptric camera (camera + mirror)]
Central catadioptric cameras can be built by an appropriate choice of:
the mirror shape, and
the distance between camera and mirror
Why is a single effective viewpoint so desirable?
Every pixel in the captured image measures the irradiance of light in one particular direction
Allows the generation of geometrically correct perspective images
Today's Exercise 3
How to Build a Range Finder using an Omnidirectional Camera
Guidance notes on: www.asl.ethz.ch/education/master/mobile_robotics
Uncertainties: Representation and Propagation
& Line Extraction from Range data
Uncertainty Representation
Section 4.1.3 of the book
Sensing in the real world is always uncertain
How can uncertainty be represented or quantified?
How does uncertainty propagate?
When fusing uncertain inputs into a system, what is the resulting uncertainty?
What is the merit of all this for mobile robotics?
Uncertainty Representation (2)
Use a Probability Density Function (PDF) to characterize the statistical properties of a variable X:
Expected value of variable X: $E[X] = \mu = \int_{-\infty}^{\infty} x\, f(x)\, dx$; Variance of variable X: $\mathrm{Var}[X] = \sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\, dx$
Gaussian Distribution
Probability mass within ±1σ: 68.26%, within ±2σ: 95.44%, within ±3σ: 99.72%
Most common PDF for characterizing uncertainties: the Gaussian, $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
The Error Propagation Law
Imagine extracting a line based on point measurements with uncertainties.
Model parameters in polar coordinates: $(r, \alpha)$ uniquely identifies a line
The question:
What is the uncertainty of the extracted line knowing the uncertainties of the measurement points that contribute to it ?
The Error Propagation Law
Error propagation in a multiple-input, multiple-output system with n inputs and m outputs.
[System diagram: inputs $X_1, \ldots, X_i, \ldots, X_n$ → System → outputs $Y_1, \ldots, Y_i, \ldots, Y_m$]
$Y_j = f_j(X_1, \ldots, X_n)$
The Error Propagation Law
1D case of a nonlinear error propagation problem
It can be shown that the output covariance matrix $C_Y$ is given by the error propagation law: $C_Y = F_X \, C_X \, F_X^T$
where
CX: covariance matrix representing the input uncertainties
CY: covariance matrix representing the propagated uncertainties for the outputs.
$F_X$ is the Jacobian matrix, defined as $F_X = \nabla_X f = \left[\dfrac{\partial f_i}{\partial X_j}\right]_{i=1..m,\; j=1..n}$,
which is the transpose of the gradient of f(X).
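A minimal numerical sketch of the law $C_Y = F_X C_X F_X^T$ for a toy system: converting a range-bearing measurement $(\rho, \theta)$ to Cartesian coordinates $(x, y)$. The numerical values are illustrative, not from the lecture.

```python
# A minimal sketch of first-order error propagation C_Y = F_X C_X F_X^T.
import numpy as np

def propagate(f_jacobian, C_X):
    """Linearized propagation of the input covariance through a system."""
    return f_jacobian @ C_X @ f_jacobian.T

rho, theta = 2.0, np.deg2rad(30.0)            # measured range and bearing
C_X = np.diag([0.02**2, np.deg2rad(1.0)**2])  # input uncertainties (sigma_rho, sigma_theta)

# Jacobian of f(rho, theta) = (rho*cos(theta), rho*sin(theta))
F_X = np.array([[np.cos(theta), -rho * np.sin(theta)],
                [np.sin(theta),  rho * np.cos(theta)]])

C_Y = propagate(F_X, C_X)
print(C_Y)        # 2x2 covariance of the Cartesian point (x, y)
```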
Example: line extraction from laser scans
Line Extraction (1)
Point-line distance: $d_i = \rho_i \cos(\theta_i - \alpha) - r$ for a measurement point $x_i = (\rho_i, \theta_i)$
If each measurement is equally uncertain, then the sum of squared errors is $S = \sum_i d_i^2 = \sum_i \left(\rho_i \cos(\theta_i - \alpha) - r\right)^2$
Goal: minimize S when selecting $(r, \alpha)$ → solve the system $\frac{\partial S}{\partial \alpha} = 0$, $\frac{\partial S}{\partial r} = 0$
"Unweighted Least Squares"
Line Extraction (2)
Point-line distance: $d_i = \rho_i \cos(\theta_i - \alpha) - r$
Each sensor measurement may have its own, unique uncertainty → Weighted Least Squares: $S = \sum_i w_i \, d_i^2$
Line Extraction (2)
Weighted least squares: solving the system $\frac{\partial S}{\partial \alpha} = 0$, $\frac{\partial S}{\partial r} = 0$ gives the line parameters $(\alpha, r)$ in closed form (see the sketch below).
If $\rho_i \sim N(\hat{\rho}_i, \sigma_{\rho_i}^2)$ and $\theta_i \sim N(\hat{\theta}_i, \sigma_{\theta_i}^2)$ for $i = 1, \ldots, N$, what is the uncertainty in the line $(r, \alpha)$?
The uncertainty $\sigma_i$ of each measurement is proportional to the measured distance $\rho_i$.
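A minimal sketch of a weighted least-squares line fit. It uses a standard Cartesian closed form for $\partial S/\partial\alpha = \partial S/\partial r = 0$ (equivalent in intent to the lecture's polar-form expressions, which are not reproduced here); the weights $w_i$ are assumed to be given, e.g. $w_i = 1/\sigma_i^2$, and the toy data are illustrative.

```python
# A minimal sketch of weighted least-squares fitting of the line model
# x*cos(alpha) + y*sin(alpha) = r from polar range measurements (rho_i, theta_i).
import numpy as np

def fit_line_polar(rho, theta, w=None):
    """Return (alpha, r) minimizing sum_i w_i * (rho_i*cos(theta_i - alpha) - r)^2."""
    rho, theta = np.asarray(rho, float), np.asarray(theta, float)
    w = np.ones_like(rho) if w is None else np.asarray(w, float)
    x, y = rho * np.cos(theta), rho * np.sin(theta)
    xm, ym = np.average(x, weights=w), np.average(y, weights=w)   # weighted centroid
    u, v = x - xm, y - ym
    alpha = 0.5 * np.arctan2(-2.0 * np.sum(w * u * v),
                             np.sum(w * (v**2 - u**2)))
    r = xm * np.cos(alpha) + ym * np.sin(alpha)
    if r < 0:                       # keep r >= 0 by flipping the normal direction
        r, alpha = -r, alpha + np.pi
    return alpha, r

# Toy usage: noisy points on the line x*cos(0.5) + y*sin(0.5) = 1
rng = np.random.default_rng(2)
t = np.linspace(-1, 1, 50)
pts = np.array([np.cos(0.5), np.sin(0.5)]) + np.outer(t, [-np.sin(0.5), np.cos(0.5)])
pts += 0.01 * rng.normal(size=pts.shape)
rho = np.hypot(pts[:, 0], pts[:, 1]); theta = np.arctan2(pts[:, 1], pts[:, 0])
print(fit_line_polar(rho, theta))   # approximately (0.5, 1.0)
```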
Error Propagation: Line extraction
The uncertainty of each measurement $x_i = (\rho_i, \theta_i)$ is described by the covariance matrix (assuming that $\rho_i$ and $\theta_i$ are independent):
$C_{x_i} = \begin{bmatrix} \sigma_{\rho_i}^2 & 0 \\ 0 & \sigma_{\theta_i}^2 \end{bmatrix}$
The uncertainty in the line $(\alpha, r)$ is described by the covariance matrix:
$C_{\alpha r} = \begin{bmatrix} \sigma_{\alpha}^2 & \sigma_{\alpha r} \\ \sigma_{\alpha r} & \sigma_{r}^2 \end{bmatrix}$
Define the stacked input covariance over all N measurements:
$C_X = \mathrm{diag}\left(\sigma_{\rho_1}^2, \ldots, \sigma_{\rho_N}^2, \sigma_{\theta_1}^2, \ldots, \sigma_{\theta_N}^2\right)$
Jacobian:
$F_{\rho\theta} = \begin{bmatrix} \dfrac{\partial \alpha}{\partial \rho_1} & \cdots & \dfrac{\partial \alpha}{\partial \rho_N} & \dfrac{\partial \alpha}{\partial \theta_1} & \cdots & \dfrac{\partial \alpha}{\partial \theta_N} \\ \dfrac{\partial r}{\partial \rho_1} & \cdots & \dfrac{\partial r}{\partial \rho_N} & \dfrac{\partial r}{\partial \theta_1} & \cdots & \dfrac{\partial r}{\partial \theta_N} \end{bmatrix}$
From the Error Propagation Law:
$C_{\alpha r} = F_{\rho\theta} \, C_X \, F_{\rho\theta}^T$
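A minimal sketch of this propagation. Instead of the analytic Jacobian $F_{\rho\theta}$, it approximates the derivatives numerically (central differences) around the measured scan; this is slower but illustrates the same $C_{\alpha r} = F\,C_X\,F^T$ computation, and the noise model in the usage comment is an illustrative assumption.

```python
# A minimal sketch: covariance of the fitted line (alpha, r) via a numerical Jacobian.
import numpy as np

def line_covariance(rho, theta, sigma_rho, sigma_theta, fit, eps=1e-6):
    """Propagate C_X = diag(sigma_rho_i^2, sigma_theta_i^2) through the line fit."""
    rho, theta = np.asarray(rho, float), np.asarray(theta, float)
    params = np.concatenate([rho, theta])          # stacked inputs X = (rho_1..N, theta_1..N)
    n = len(rho)

    def f(p):                                      # (alpha, r) as a function of all inputs
        return np.array(fit(p[:n], p[n:]))

    F = np.zeros((2, 2 * n))                       # Jacobian F_rho_theta (2 x 2N)
    for j in range(2 * n):
        dp = np.zeros_like(params); dp[j] = eps
        F[:, j] = (f(params + dp) - f(params - dp)) / (2 * eps)

    C_X = np.diag(np.concatenate([np.asarray(sigma_rho)**2, np.asarray(sigma_theta)**2]))
    return F @ C_X @ F.T                           # C_alpha_r = F C_X F^T

# Usage (with fit_line_polar from the previous sketch; range noise growing with rho):
# C_ar = line_covariance(rho, theta, 0.01 * rho, np.full_like(rho, 1e-3), fit_line_polar)
```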
Feature Extraction from Range Data:
Line extraction
Split and merge
Linear regression
RANSAC
Hough-Transform
Extracting Features from Range Data
[Figures: photograph of a corridor at ASL; raw 3D scan; extracted planes for every cube; plane segmentation result]
Extracting Features from Range Data
goal: extract planar features from a dense point cloud
Example: a scene showing part of a corridor of the lab, and the same scene represented as a dense point cloud generated by a rotating laser scanner (labelled: left wall, ceiling, right wall, floor, back wall)
Map of the ASL hallway built using line segments
Map of the ASL hallway built using orthogonal planes constructed from line segments
Features from Range Data: Motivation
Point cloud → extract lines / planes
Why features?
Raw data: huge amount of data to be stored
Compact features require less storage
Provide rich and accurate information
Basis for high level features (e.g. more abstract features, objects)
Here, we will study line segments
The simplest geometric structure
Suitable for most office-like environments
Line Extraction: The Problem
Extract lines from a Range scan (i.e. a point cloud)
Three main problems:
How many lines are there?
Segmentation: Which points belong to which line ?
Line Fitting/Extraction: Given points that belong to a line, how to estimate the line parameters ?
Algorithms we will see:
1. Split and merge
2. Linear regression
3. RANSAC
4. Hough-Transform
Algorithm 1: Split-and-Merge (standard)
The most popular algorithm; it originated in computer vision.
A recursive procedure of fitting and splitting.
A slightly different version, called Iterative-End-Point-Fit, simply connects the end points for line fitting.
Initialise set S to contain all points
Split
• Fit a line to points in current set S
• Find the most distant point to the line
• If distance > threshold split & repeat with left and right point sets
Merge
• If two consecutive segments are close/collinear enough, obtain the common line and find the most distant point
• If distance <= threshold, merge both segments
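A minimal sketch of the procedure above, in the Iterative-End-Point-Fit variant (the fitted "line" is simply the segment joining the first and last point of the current set); the thresholds and the toy scan are illustrative assumptions.

```python
# A minimal sketch of Split-and-Merge (Iterative-End-Point-Fit variant).
import numpy as np

def point_line_distance(pts, a, b):
    """Perpendicular distance of each point in pts to the line through a and b."""
    d = b - a
    n = np.array([-d[1], d[0]]) / (np.linalg.norm(d) + 1e-12)   # unit normal
    return np.abs((pts - a) @ n)

def split(pts, threshold):
    """Recursively split the ordered point set; return a list of point subsets."""
    dist = point_line_distance(pts, pts[0], pts[-1])
    i = int(np.argmax(dist))
    if dist[i] > threshold and 0 < i < len(pts) - 1:
        return split(pts[:i + 1], threshold) + split(pts[i:], threshold)
    return [pts]

def merge(segments, threshold):
    """Merge consecutive segments that are nearly collinear (endpoint-fit test)."""
    out = [segments[0]]
    for seg in segments[1:]:
        candidate = np.vstack([out[-1], seg])
        if point_line_distance(candidate, candidate[0], candidate[-1]).max() <= threshold:
            out[-1] = candidate
        else:
            out.append(seg)
    return out

# Toy usage: an L-shaped scan (two walls) should yield two segments
wall1 = np.column_stack([np.linspace(0, 1, 20), np.zeros(20)])
wall2 = np.column_stack([np.ones(20), np.linspace(0, 1, 20)])
segments = merge(split(np.vstack([wall1, wall2]), threshold=0.02), threshold=0.02)
print(len(segments))   # -> 2
```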
[Example figures: Split-and-Merge (Iterative-End-Point-Fit) applied to a range scan]
Algorithm 2: Line-Regression
Uses a “sliding window” of size Nf
The points within each “sliding window” are fitted by a segment
Then adjacent segments are merged if their line parameters are close
Nf = 3
Line-Regression
• Initialize sliding window size Nf
• Fit a line to every Nf consecutive points (i.e. in each window)
• Compute a Line Fidelity Array: each element contains the sum of Mahalanobis distances between 3 consecutive windows (Mahalanobis distance used as a measure of similarity)
• Scan Fidelity array for consecutive elements < threshold (using a clustering algorithm).
• For every Fidelity Array element < Threshold, construct a new line segment
• Merge overlapping line segments + recompute line parameters for each segment
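A simplified sketch of the line-regression idea above. It replaces the Mahalanobis-based fidelity measure with a plain distance between the $(\alpha, r)$ parameters of consecutive window fits, so it is only an approximation of the algorithm; the window size and threshold are illustrative.

```python
# A simplified sketch of Line-Regression with a sliding window of size Nf.
import numpy as np

def fit_alpha_r(pts):
    """Total-least-squares line fit x*cos(a) + y*sin(a) = r to a point set."""
    c = pts.mean(axis=0)
    u = pts - c
    a = 0.5 * np.arctan2(-2 * np.sum(u[:, 0] * u[:, 1]),
                         np.sum(u[:, 1]**2 - u[:, 0]**2))
    return a, c[0] * np.cos(a) + c[1] * np.sin(a)

def line_regression(points, nf=3, thresh=0.05):
    """Fit lines to sliding windows, then group consecutive similar windows."""
    fits = [fit_alpha_r(points[i:i + nf]) for i in range(len(points) - nf + 1)]
    segments, start = [], 0
    for i in range(1, len(fits)):
        da = abs(np.arctan2(np.sin(fits[i][0] - fits[i - 1][0]),
                            np.cos(fits[i][0] - fits[i - 1][0])))
        dr = abs(fits[i][1] - fits[i - 1][1])
        if da + dr > thresh:                       # window i starts a new segment
            segments.append(points[start:i - 1 + nf])
            start = i
    segments.append(points[start:])
    return [fit_alpha_r(seg) for seg in segments], segments

# Toy usage: two perpendicular walls -> two line segments
wall1 = np.column_stack([np.linspace(0, 1, 15), np.zeros(15)])
wall2 = np.column_stack([np.ones(15), np.linspace(0, 1, 15)])
params, segs = line_regression(np.vstack([wall1, wall2]), nf=3)
print(len(segs), [np.round(p, 2) for p in params])
```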
Algorithm 3: RANSAC
RANSAC = RANdom SAmple Consensus.
It is a generic and robust algorithm for fitting models in the presence of outliers (i.e. points that do not satisfy the model)
It is applicable to any problem where the goal is to identify the inliers that satisfy a predefined model.
Typical applications in robotics are: line extraction from 2D range data, plane extraction from 3D range data, feature matching, structure from motion, …
RANSAC is an iterative method and is non-deterministic: the probability of finding a set free of outliers increases as more iterations are used. Drawback: because it is non-deterministic, the results differ between runs.
Algorithm 3: RANSAC (for line extraction from 2D range data)
• Select a sample of 2 points at random
• Calculate the model parameters that fit the data in the sample
• Calculate the error function for each data point
• Select the data that support the current hypothesis
• Repeat the sampling
The set with the maximum number of inliers obtained within k iterations is selected.
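A minimal sketch of the steps above for line fitting on 2D range data (Cartesian points); the iteration count, inlier threshold and toy data are illustrative choices, not part of the lecture.

```python
# A minimal sketch of RANSAC line fitting on 2D points.
import numpy as np

def ransac_line(points, k=100, inlier_thresh=0.02, rng=None):
    """Return (best line as (a, b, c) with a*x + b*y + c = 0, inlier mask)."""
    rng = np.random.default_rng() if rng is None else rng
    best_inliers = np.zeros(len(points), dtype=bool)
    best_line = None
    for _ in range(k):
        i, j = rng.choice(len(points), size=2, replace=False)   # 1. random sample of 2 points
        p, q = points[i], points[j]
        d = q - p
        norm = np.hypot(d[0], d[1])
        if norm < 1e-12:
            continue
        a, b = -d[1] / norm, d[0] / norm                        # 2. model: unit line normal
        c = -(a * p[0] + b * p[1])
        err = np.abs(points @ np.array([a, b]) + c)             # 3. error for every point
        inliers = err < inlier_thresh                           # 4. support of the hypothesis
        if inliers.sum() > best_inliers.sum():                  # keep the best set so far
            best_inliers, best_line = inliers, (a, b, c)
    return best_line, best_inliers                              # 5. best model after k samples

# Toy usage: points on y = 0.5*x + 1 plus gross outliers
rng = np.random.default_rng(3)
x = np.linspace(0, 2, 80)
inline = np.column_stack([x, 0.5 * x + 1 + 0.005 * rng.normal(size=x.size)])
outliers = rng.uniform(0, 3, size=(40, 2))
line, mask = ransac_line(np.vstack([inline, outliers]), k=200, rng=rng)
print(mask.sum(), np.round(line, 3))
```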
How many iterations does RANSAC need?
We cannot know in advance whether the observed set contains the maximum number of inliers; ideally, we would need to check all possible combinations of 2 points in a dataset of N points.
The number of all pairwise combinations is N(N−1)/2, which is computationally infeasible if N is too large. Example: for a laser scan of 360 points we would need to check 360·359/2 = 64,620 possibilities!
Do we really need to check all possibilities, or can we stop RANSAC after k iterations? Checking a subset of combinations is enough if we have a rough estimate of the percentage of inliers in our dataset.
This can be done in a probabilistic way
How many iterations does RANSAC need?
w = number of inliers / N
where N : tot. no. data points
w : fraction of inliers in the dataset = probability of selecting an inlier-point
Let p : probability of finding a set of points free of outliers
Assumption: the 2 points necessary to estimate a line are selected independently
w² = prob. that both points are inliers
1 − w² = prob. that at least one of these two points is an outlier
Let k : no. of RANSAC iterations executed so far
(1 − w²)^k = prob. that RANSAC never selects two points that are both inliers
1 − p = (1 − w²)^k, and therefore: $k = \frac{\log(1-p)}{\log(1-w^2)}$
The number of iterations is $k = \frac{\log(1-p)}{\log(1-w^2)}$: knowing the fraction of inliers w, after k RANSAC iterations we will have a probability p of finding a set of points free of outliers.
Example: if we want a probability of success p = 99% and we know that w = 50%, then k ≈ 16 iterations, which is far fewer than the number of all possible combinations!
In practice we need only a rough estimate of w. More advanced implementations of RANSAC estimate the fraction of inliers and adaptively update it on every iteration.
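A quick numerical check of the iteration formula, using the values from the example above:

```python
# Verify k = log(1-p) / log(1-w^2) for the example p = 0.99, w = 0.5.
import math

def ransac_iterations(p, w):
    """Iterations needed for probability p of an outlier-free sample (inlier fraction w)."""
    return math.log(1 - p) / math.log(1 - w**2)

print(round(ransac_iterations(0.99, 0.50), 1))   # ~16 iterations, as in the example
```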
Algorithm 4: Hough-Transform
Hough Transform uses a voting scheme
A line y = m₀x + b₀ in image space corresponds to a single point (m₀, b₀) in Hough (parameter) space
What does a point (x₀, y₀) in image space map to in Hough space?
All lines through (x₀, y₀) satisfy y₀ = m·x₀ + b, so the point maps to the line b = −x₀·m + y₀ in Hough space
Where is the line that contains both (x0, y0) and (x1, y1)?
It is the intersection of the lines b = –x0m + y0 and b = –x1m + y1
Each point in image space votes for line parameters in Hough parameter space
Problems with the (m,b) space:
Unbounded parameter domain
Vertical lines require infinite m
Alternative: polar representation
Each point in image space will map to a sinusoid in the (θ,ρ) parameter space
1. Initialize accumulator H to all zeros
2. for each edge point (x,y) in the image
for all θ in [0,180]
• Compute ρ = x cos θ + y sin θ
• H(θ, ρ) = H(θ, ρ) + 1
end
end
3. Find the values of (θ, ρ) where H(θ, ρ) is a local maximum
4. The detected line in the image is given by ρ = x cos θ + y sin θ
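A minimal sketch of the accumulator loop above, vectorized over θ for each point; the bin counts and the toy data are illustrative choices.

```python
# A minimal sketch of the Hough accumulator for rho = x*cos(theta) + y*sin(theta).
import numpy as np

def hough_lines(points, n_theta=180, n_rho=200):
    """Vote in (theta, rho) space and return the accumulator plus the bin axes."""
    thetas = np.deg2rad(np.arange(0, 180, 180 / n_theta))
    rho_max = np.hypot(*np.abs(points).max(axis=0)) + 1e-9
    rhos = np.linspace(-rho_max, rho_max, n_rho)
    H = np.zeros((n_theta, n_rho), dtype=int)
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)          # one sinusoid per point
        bins = np.digitize(rho, rhos) - 1
        H[np.arange(n_theta), np.clip(bins, 0, n_rho - 1)] += 1
    return H, thetas, rhos

# Toy usage: points on the line x*cos(60 deg) + y*sin(60 deg) = 1
t = np.linspace(-1, 1, 100)
theta0 = np.deg2rad(60)
pts = np.array([np.cos(theta0), np.sin(theta0)]) + np.outer(t, [-np.sin(theta0), np.cos(theta0)])
H, thetas, rhos = hough_lines(pts)
i, j = np.unravel_index(np.argmax(H), H.shape)
print(np.rad2deg(thetas[i]), rhos[j])          # ~60 degrees, ~1.0
```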
[Figure: Hough Transform, effect of noise]
[Figure: Hough Transform application, lane detection]
Example – Door detection using Hough Transform
[Figures: door image and its Hough Transform accumulator in (θ, ρ) space]
Comparison of Line Extraction Algorithms
Algorithm                 | Complexity         | Speed (Hz) | False positives | Precision
Split-and-Merge           | N log N            | 1500       | 10%             | +++
Incremental               | S·N                | 600        | 6%              | +++
Line-Regression           | N·Nf               | 400        | 10%             | +++
RANSAC                    | S·N·k              | 30         | 30%             | ++++
Hough-Transform           | S·N·NC + S·NR·NC   | 10         | 30%             | ++++
Expectation Maximization  | S·N1·N2·N          | 1          | 50%             | ++++
Split-and-merge, Incremental and Line-Regression: fastest
Deterministic, and they make use of the sequential ordering of the raw scan points (points captured according to the rotation direction of the laser beam)
If applied to randomly captured points, only the last three algorithms would segment all the lines
RANSAC, Hough-Transform and Expectation Maximization: produce greater precision and are more robust to outliers