CS5320/6320 Computer Vision Class Project
Team Members: Eric Johnson, Randy Hamburger
Weekly Report – March 12, 2007

Accomplishments:

Both: Image to World Algorithm

Method 1
Map all the image points to world coordinates (gives a non-uniform grid), then interpolate in the world domain to get the brightness for a regularly spaced grid of points.
Method 2
Map a regularly spaced grid of world points to a perspective-warped grid of image points, then interpolate inside the image to get the brightness for each of the original world points.

Assuming we can deal with interpolation on the irregular grid, Method 1 should be more accurate because the distances used for the interpolation will be physical distances on the plane of the road, not stretched and skewed distances between pixels in the image. So, the order of operations is:
1. Define the range in the image which is useful. There might be interesting stuff
anywhere below the horizon.
2. Map each of those (u,v) pixels to (x,y) to generate the matrices below. The size of these is (m - rHorizon + 1) x n.
This mapping (u,v) to (x,y) only needs to be done once. The Xvis and Yvis matrices then contain the coordinates (x,y) where each point (u,v) in the image maps to in the real world. Then for each incoming image, just pull out the right area (rHorizon:end) to get the image area of interest and you already know where each pixel maps.
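This precompute-once workflow can be sketched as follows. This is a Python/NumPy translation (the project code is MATLAB); all camera parameter values below are made up, and the mapping uses the image-to-world equations developed later in this report.

```python
import numpy as np

# Made-up camera parameters matching the symbols used later in the report.
m, n = 480, 640               # image rows x columns
alpha = np.deg2rad(20.0)      # half of the camera's angular aperture
theta0 = np.deg2rad(5.0)      # tilt of the optical axis below horizontal
gamma0 = 0.0                  # pan of the optical axis
dx, dy, dz = 0.0, 0.0, 1.5    # camera offsets from the world origin (meters)

def image_to_world(u, v):
    """Map pixel indices (u, v) to road-plane coordinates (x, y) at z = 0."""
    theta = theta0 - alpha + u * (2 * alpha / (m - 1))
    gamma = gamma0 - alpha + v * (2 * alpha / (n - 1))
    x = dz / np.tan(theta) * np.sin(gamma) + dx
    y = dz / np.tan(theta) * np.cos(gamma) + dy
    return x, y

# Rows at or above the horizon map to infinity, so start just below it.
u_horizon = (m - 1) / (2 * alpha) * (alpha - theta0)
u0 = int(np.ceil(u_horizon))

# Done once: every incoming frame reuses Xvis and Yvis.
uu, vv = np.meshgrid(np.arange(u0, m), np.arange(n), indexing="ij")
Xvis, Yvis = image_to_world(uu, vv)
```

After this, each incoming frame only needs the crop `frame[u0:, :]`; the world coordinates of every pixel in that crop are already sitting in Xvis and Yvis.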
[Figure: the full m x n image matrix I, with rows above rHorizon discarded; the cropped visible region lines up with the static Xvis and Yvis matrices.]

To get rHorizon: row = u + 1, so rHorizon = lim u(x,y) as y → ∞, plus 1, which works out to (m-1)/(2α) · (α - θ0) + 1 (see the limit derivation below).
3. Create a regular grid in the world coordinates (Xg, Yg) which roughly covers the visible region (Xvis, Yvis).
[Figure: regular world grid (x, y) overlaid on the visible region (Xvis, Yvis); the grid corners outside the visible wedge contain no useful stuff.]
To find where the horizon falls, take the limit of u(x,y) as y → ∞:

lim u(x,y) = (m-1)/(2α) · [ atan( dz · sin(atan((x-dx)/(y-dy))) / (x-dx) ) - θ0 + α ]

As y → ∞, the inner angle atan((x-dx)/(y-dy)) → 0, so sin(·) → 0 and the outer atan(·) → 0, leaving

lim u(x,y) = (m-1)/(2α) · (α - θ0)   as y → ∞

This is independent of x, as a horizon row should be.
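The limit can be sanity-checked numerically. A Python sketch (the project code is MATLAB) with made-up values for the symbols in the derivation:

```python
import math

# Made-up values for the symbols in the derivation above.
m = 480
alpha = math.radians(20.0)
theta0 = math.radians(5.0)
dx, dy, dz = 0.0, 0.0, 1.5

def u_of(x, y):
    """u(x, y, 0) from the world-to-image equation."""
    inner = math.atan((x - dx) / (y - dy))
    theta = math.atan(dz * math.sin(inner) / (x - dx))
    return (m - 1) / (2 * alpha) * (theta - theta0 + alpha)

u_limit = (m - 1) / (2 * alpha) * (alpha - theta0)   # closed-form limit
u_far = u_of(1.0, 1.0e9)                             # y pushed toward infinity
r_horizon = u_limit + 1                              # row = u + 1

print(abs(u_far - u_limit) < 1e-3)  # True
```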
(This grid also only needs to be created once.)

4. Use Iw = griddata(Xvis, Yvis, I, Xg, Yg, ‘method’)
The ‘method’ is one of: nearest, linear, spline, or cubic. In MATLAB, we found that this griddata function works for non-uniformly spaced data. (The only trouble is that it takes a while to run.)

Inverse-Perspective Mapping Functions

The Original Broggi Rendition
• Broggi 1995 “Robust + Real-Time Lane and Road Detection ………”
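SciPy's griddata mirrors the MATLAB call closely, so step 4 can be sketched in Python with toy stand-in data (all names and values below are illustrative, not from the project):

```python
import numpy as np
from scipy.interpolate import griddata

# Scattered world coordinates of the visible pixels (toy stand-ins for
# Xvis, Yvis) and their brightness values I.
rng = np.random.default_rng(0)
Xvis = rng.uniform(-5, 5, 500)
Yvis = rng.uniform(5, 50, 500)
I = np.cos(Xvis) + 0.1 * Yvis          # fake brightness

# Regular world grid (Xg, Yg) roughly covering the visible region.
Xg, Yg = np.meshgrid(np.linspace(-4, 4, 40), np.linspace(10, 40, 60))

# Equivalent of MATLAB's griddata(Xvis, Yvis, I, Xg, Yg, 'linear').
Iw = griddata((Xvis, Yvis), I, (Xg, Yg), method="linear")
print(Iw.shape)  # (60, 40)
```

As in MATLAB, points of the query grid that fall outside the convex hull of the scattered data come back as NaN.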
γ = angle between optical axis and forward axis, viewed from above
θ = angle between optical axis and forward axis, viewed from the side
[Figure: world axes (x, y, z) and image axes (u, v). Note: left-handed coordinate system!]
u, v = 0, 1, …, n-1, so to use (row, column) indices: u = r - 1, v = c - 1
2α = camera “angular aperture” (viewing angle)
n = camera resolution (image is assumed square, n x n)
l = offset along x between world and camera coordinates
d = offset along y between world and camera coordinates
h = offset along z between world and camera coordinates

• Image to World
x(u,v) = h · cot[ θ - α + u·(2α/(n-1)) ] · sin[ γ - α + v·(2α/(n-1)) ] + l

y(u,v) = h · cot[ θ - α + u·(2α/(n-1)) ] · cos[ γ - α + v·(2α/(n-1)) ] + d

z(u,v) = 0 by definition of the x, y, z frame
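Transcribed directly into Python (the project code is MATLAB; the camera parameter values are invented for illustration):

```python
import math

# Invented camera parameters for Broggi's original square-image setup.
n = 256                       # square n x n image
alpha = math.radians(15.0)    # half of the angular aperture
theta = math.radians(8.0)     # tilt of optical axis (side view)
gamma = 0.0                   # rotation of optical axis (top view)
l, d, h = 0.0, 0.0, 1.2       # camera offsets along x, y, z

def broggi_image_to_world(u, v):
    """Broggi's image-to-world mapping; returns (x, y, z) on the road plane."""
    ang_u = theta - alpha + u * (2 * alpha / (n - 1))
    ang_v = gamma - alpha + v * (2 * alpha / (n - 1))
    x = h / math.tan(ang_u) * math.sin(ang_v) + l
    y = h / math.tan(ang_u) * math.cos(ang_v) + d
    return x, y, 0.0          # z = 0 by definition of the road plane

# A pixel in the middle column (v = (n-1)/2) looks straight ahead when
# gamma = 0, so it should land on the x = l line.
x, y, z = broggi_image_to_world(200, (n - 1) / 2)
print(abs(x - l) < 1e-9)  # True
```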
[Figure: top view showing γ and side view showing θ and α, with the offsets l, d, h between the world and camera frames.]
• World to Image

u(x,y,0) = ( θ(x,y,0) - (θ - α) ) / (2α/(n-1)),   θ(x,y,0) = ????
v(x,y,0) = ( γ(x,y,0) - (γ - α) ) / (2α/(n-1)),   γ(x,y,0) = ????

Improved Method
From a Combination of Bertozzi & Broggi 1998 with Jiang 2000

Coordinate Systems (note right-handedness):
[Figure: image plane with axes (u, v) over an m x n image, and right-handed world axes (x, y, z) with camera offsets dx, dy, dz; top view showing γ0 and side view showing θ0 and the angular half-aperture α.]

u = 0, …, m-1 (rows), v = 0, …, n-1 (columns), so row = u + 1 and col = v + 1.
World to Image:

u(x,y,0) = (m-1)/(2α) · [ atan( dz · sin(atan((x-dx)/(y-dy))) / (x-dx) ) - θ0 + α ]

v(x,y,0) = (n-1)/(2α) · [ atan( (x-dx)/(y-dy) ) - γ0 + α ]

Image to World:

x(u,v) = dz · cot[ θ0 - α + u·(2α/(m-1)) ] · sin[ γ0 - α + v·(2α/(n-1)) ] + dx

y(u,v) = dz · cot[ θ0 - α + u·(2α/(m-1)) ] · cos[ γ0 - α + v·(2α/(n-1)) ] + dy

z(u,v) = 0
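A useful check is that the two directions are inverses of each other. A Python sketch (the project code is MATLAB) with invented parameter values, round-tripping a world point through both mappings:

```python
import math

# Invented parameters for the improved-method equations.
m, n = 480, 640
alpha = math.radians(20.0)
theta0 = math.radians(5.0)
gamma0 = 0.0
dx, dy, dz = 0.3, -0.1, 1.5

def world_to_image(x, y):
    """Forward mapping: road-plane point (x, y, 0) -> pixel (u, v)."""
    u = (m - 1) / (2 * alpha) * (
        math.atan(dz * math.sin(math.atan((x - dx) / (y - dy))) / (x - dx))
        - theta0 + alpha)
    v = (n - 1) / (2 * alpha) * (
        math.atan((x - dx) / (y - dy)) - gamma0 + alpha)
    return u, v

def image_to_world(u, v):
    """Inverse mapping: pixel (u, v) -> road-plane point (x, y)."""
    rho = dz / math.tan(theta0 - alpha + u * (2 * alpha / (m - 1)))
    g = gamma0 - alpha + v * (2 * alpha / (n - 1))
    return rho * math.sin(g) + dx, rho * math.cos(g) + dy

x0, y0 = 2.0, 20.0
u, v = world_to_image(x0, y0)
x1, y1 = image_to_world(u, v)
print(abs(x1 - x0) < 1e-9 and abs(y1 - y0) < 1e-9)  # True
```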
Results of inverse-perspective mapping to get the “birds-eye view” for our MATLAB-generated test image:
Results on DARPA test image (just giving it a shot with the same camera parameters used for the MATLAB image):
(Obviously, we’ll need to play with the camera parameters a little.)
Eric:
1. Explored the output of the mapping functions (particularly the unexpected curving of horizontal lines) and generated the following document:
http://www.eng.utah.edu/~hamburge/BertozziAndBroggi_InversePerspEqnsTest.pdf
Randy:
1. Typed up meeting notes into the weekly report.
2. Updated the web page.
Next Steps
1. Finish figuring out the image to world coordinate transformation. We have the basics accomplished but need to resolve two issues:
a. With the current equations (adapted mostly from Broggi) horizontal lines are getting curved, which shouldn’t happen. Need to figure out how to fix this.
b. The interpolation in the non-uniform grid using griddata takes a long time. Once we get the horizontal lines straightened out, we may be able to write a much simpler function to speed this up.
2. Pick out some representative test images and create the ground truth answers for those.
a. Next week or the one after depending on how hard step 1 continues to be.
3. Take a good look at the list of processing techniques we’ve played with and those mentioned in the two survey articles we’ve read (see reference list at the end of last week’s report) and come up with a plan for a first attempt at a meaningful lane detection algorithm.
4. Future from here: implement it, then check performance measures.
5. Come up with some utility to display algorithm results for convenient visual feedback.
Recap of some of the lane-finding building-blocks we have experimented with to-date. See http://www.eng.utah.edu/~hamburge/CVprojectCode/
1. imagePatch_RMSerror.m – finds the RMS error between the image and a small horizontal or vertical lane marker template. Seemed to give promising results, but very slow to run.
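The basic sliding-window RMS computation can be sketched in Python on toy data (the real script is MATLAB); the doubly nested loop also shows why it runs slowly:

```python
import numpy as np

# Toy image: random background plus a bright horizontal stripe that
# stands in for a lane marker.  All values here are invented.
rng = np.random.default_rng(1)
img = rng.uniform(0, 1, (60, 80))
img[28:32, 10:40] = 1.0             # fake marker, rows 28-31, cols 10-39
template = np.ones((4, 6))          # small bright-patch template

th, tw = template.shape
H, W = img.shape
rms = np.full((H - th + 1, W - tw + 1), np.inf)
for r in range(H - th + 1):          # slow: one window per pixel
    for c in range(W - tw + 1):
        diff = img[r:r + th, c:c + tw] - template
        rms[r, c] = np.sqrt(np.mean(diff ** 2))

best = np.unravel_index(np.argmin(rms), rms.shape)
print(best)  # (28, 10) -- top-left corner of the best match, on the stripe
```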
[Figures: two test cases, each showing the Original Image, the Test Patch, the Scaled RMS Errors, and the Thresholded RMS Errors, followed by the Combined Horizontal and Vertical Lane Marker Search Results.]
2. testing_2007_02_08.m – Tried using dot products with expected lane marker
orientation to boost edges of lane markers which are in shadows. (More comments in the m file.)
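The core of the idea, sketched in Python on a toy image (the real script is MATLAB): project the gradient onto the expected edge direction, so correctly oriented edges keep their sign and strength even at low contrast.

```python
import numpy as np

# Toy image: a dim (shadowed) vertical stripe standing in for a marker.
img = np.zeros((50, 50))
img[:, 20:24] = 0.3

# Gradient components: np.gradient returns d/d(rows) then d/d(cols).
gy, gx = np.gradient(img)

# Expected edge direction (assumed here): a vertical marker edge has a
# purely horizontal gradient, so the unit vector is (1, 0) in (gx, gy).
expected = np.array([1.0, 0.0])
score = gx * expected[0] + gy * expected[1]

# The stripe's left edge gives a positive response, the right edge a
# negative one, even though the stripe itself is dim.
print(score[25, 20] > 0, score[25, 23] < 0)  # True True
```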
[Figures: Original Image Slice, Smoothed Gradient Magnitude, Raw Dot Product Result, and Thresholded Dot Product.]
3. testing_2007_02_15.m – Tried using different gray-levels based on a histogram
analysis to threshold an image to help isolate potential lane markers. (More comments in the m file.)
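A Python sketch of the bin-edge thresholding idea on toy data (the real script is MATLAB; the image content and bin choices here are illustrative):

```python
import numpy as np

# Toy image: dark road surface plus one bright lane-marker stripe.
rng = np.random.default_rng(2)
img = rng.uniform(0.0, 0.3, (100, 100))
img[40:44, :] = 0.9

counts, edges = np.histogram(img, bins=20)

# "Bin 3 of 20 as threshold": keep everything above the 3rd bin edge.
mask_bin = img > edges[3]

# "Mean as threshold" for comparison.
mask_mean = img > img.mean()

# Both masks retain the bright marker pixels; they differ in how much
# of the background survives.
print(mask_bin[41].all(), mask_mean[41].all())  # True True
```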
[Figures: histograms and thresholded results for the Raw Image (mean and bin 3 of 20 as thresholds), the Shifted and Scaled image (mean and bin 3 of 20), and the Equalized Histogram image (mean and bin 14 of 20).]
4. testing_2007_02_27.m – Explored the steerable filters used in McCall 2006 and found that a correctly-sized Laplacian may be able to find the centers of lane markers directly and distinguish between dark-bright-dark (lane marker) patterns and bright-dark-bright (tar strip) patterns. (More comments in the m file.)
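The sign argument can be sketched in Python on idealized cross sections (the real script is MATLAB; the stripe width and sigma below are invented):

```python
import numpy as np

# 1-D Gaussian second derivative (a 1-D Laplacian-of-Gaussian kernel).
xs = np.arange(-30, 31)
sigma = 5.0
g2 = (xs**2 / sigma**4 - 1 / sigma**2) * np.exp(-xs**2 / (2 * sigma**2))

# Idealized cross sections: dark-bright-dark (lane marker) versus
# bright-dark-bright (tar strip), same width and position.
profile = np.zeros(121)
marker = profile.copy();  marker[55:66] = 1.0
tar = profile.copy();     tar[55:66] = -1.0

resp_marker = np.convolve(marker, g2, mode="same")
resp_tar = np.convolve(tar, g2, mode="same")

# The marker's center gives a strong negative response, the tar strip's
# center a strong positive one, so the sign separates the two patterns.
print(resp_marker[60] < 0, resp_tar[60] > 0)  # True True
```

Keeping only the negative responses, as in the last figure below, is then exactly the "retain markers, reject tar strips" rule.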
[Figures: an idealized lane marker (dark-bright-dark) and an idealized tar stripe (bright-dark-bright) intensity cross section, each convolved with a Gaussian smoothing kernel, a Gaussian first derivative, and a Gaussian second derivative, showing the smoothed intensity, first-derivative, and second-derivative responses. Final panels: Original Image, Magnitude of Laplacian of Gaussian Output, and Only Negative Values Retained.]