
CS5320/6320 Computer Vision Class Project

Team Members: Eric Johnson, Randy Hamburger
Weekly Report – March 12, 2007

Accomplishments:

Both: Image to World Algorithm

Method 1

Map all the image points to world coordinates (gives a non-uniform grid), then interpolate in the world domain to get the brightness for a regularly spaced grid of points.


Method 2

Map a regularly spaced grid of world points to a perspective-warped grid of image points, then interpolate inside the image to get the brightness for each of the original world points.

Assuming we can deal with interpolation in the irregular grid, Method 1 should be more accurate because the distances used for the interpolation will be physical distances on the plane of the road, not stretched and skewed distances between pixels in the image. So, the order of operations is:


1. Define the range in the image which is useful. There might be interesting stuff anywhere below the horizon.

2. Map each of those (u,v) pixels to (x,y) to generate the matrices below. The size of these is (m − (rHorizon + 1)) × n.

This mapping from (u,v) to (x,y) only needs to be done once. The Xvis and Yvis matrices then contain the real-world coordinates (x,y) that each point (u,v) in the image maps to. Then, for each incoming image, just pull out the right rows (rHorizon:end) to get the image area of interest; you already know where each pixel maps.

[Figure: the cropped visible image matrix I (rows at and below rHorizon) alongside the static Xvis and Yvis coordinate matrices.]


To get the rHorizon, take the limit of u(x,y) as y → ∞:

lim (y→∞) u(x,y) = ((m−1)/(2α)) · [ atan( dz · sin(atan((x−dx)/(y−dy))) / (x−dx) ) − θ0 + α ]
                 = ((m−1)/(2α)) · [ atan( dz · sin(atan(0)) / (x−dx) ) − θ0 + α ]
                 = ((m−1)/(2α)) · [ atan( dz · sin(0) / (x−dx) ) − θ0 + α ]
                 = ((m−1)/(2α)) · [ atan(0) − θ0 + α ]
                 = ((m−1)/(2α)) · (α − θ0)

Then row = u + 1, so rHorizon = ((m−1)/(2α)) · (α − θ0) + 1.

3. Create a regular grid in the world coordinates (Xg, Yg) which roughly covers the visible region (Xvis, Yvis).

[Figure: sketch of the visible region (Xvis, Yvis) in the world x–y plane; the area outside it contains no useful stuff.]
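As a quick sanity check on that formula (using made-up numbers, not our actual camera parameters): with m = 480 rows, α = 20°, and θ0 = 5°, rHorizon = (479 / 40°) · (20° − 5°) + 1 ≈ 180.6, so the horizon would sit at about row 181 and only rows 181 and below would need to be mapped.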


(This only needs to be done once.)

4. Use Iw = griddata(Xvis, Yvis, I, Xg, Yg, 'method')

The 'method' is 'nearest', 'linear', 'cubic', or 'v4' (biharmonic spline). In MATLAB, we found that this griddata function works for non-uniformly spaced data. (The only trouble is that it takes a while to run.)
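Putting steps 1–4 together, here is a minimal MATLAB sketch of the one-time setup versus the per-frame work. This is only a sketch: the camera parameters (alpha, theta0, gamma0, dx, dy, dz, angles in radians) are assumed to already be defined, the 400×600 grid size is arbitrary, and imageToWorld is a placeholder for the improved image-to-world equations given later in this report.

```matlab
% ----- One-time setup -----
[m, n] = size(I0);                                        % I0: a sample grayscale frame
rHorizon = ceil((m-1)/(2*alpha)*(alpha - theta0) + 1);    % horizon row (1-based), from the limit above

% 0-based pixel indices (u,v) for everything at/below the horizon (rows rHorizon:end)
[V, U] = meshgrid(0:n-1, (rHorizon-1):(m-1));
[Xvis, Yvis] = imageToWorld(U, V, dx, dy, dz, theta0, gamma0, alpha, m, n);

% Regular world grid (Xg, Yg) roughly covering the visible region
[Xg, Yg] = meshgrid(linspace(min(Xvis(:)), max(Xvis(:)), 400), ...
                    linspace(min(Yvis(:)), max(Yvis(:)), 600));

% ----- Per incoming frame I -----
Ivis = double(I(rHorizon:end, :));                        % crop to the image area of interest
Iw   = griddata(Xvis, Yvis, Ivis, Xg, Yg, 'linear');      % interpolate onto the regular grid
```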

Inverse-Perspective Mapping Functions

The Original Broggi Rendition
• Broggi 1995, “Robust Real-Time Lane and Road Detection …”

γ = angle between the optical axis and the forward axis, viewed from above
θ = angle between the optical axis and the forward axis, viewed from the side

[Figure: camera and world axes x, y, z for the original equations. Note: left-handed coordinate system!]

[Figure: image axes u and v.]

u, v = 0, 1, …, n−1, so to use (row, column) indexing: u = r − 1, v = c − 1.


2α = camera “angular aperture” (viewing angle)

n = camera resolution (image is assumed square, n×n)
l = offset along x between world and camera coordinates
d = offset along y between world and camera coordinates
h = offset along z between world and camera coordinates

• Image to World

x(u,v) = h · cot( (θ − α) + u·(2α/(n−1)) ) · sin( (γ − α) + v·(2α/(n−1)) ) + l

y(u,v) = h · cot( (θ − α) + u·(2α/(n−1)) ) · cos( (γ − α) + v·(2α/(n−1)) ) + d

z(u,v) = 0 by definition of the x,y,z frame

[Figure: geometry for the original equations, showing the world axes, the offset d, and the angles γ, θ, and α.]


• World to Image

u(x,y,0) = ( θ(x,y,0) − (θ − α) ) / ( 2α/(n−1) ),  where θ(x,y,0) = ????
v(x,y,0) = ( γ(x,y,0) − (γ − α) ) / ( 2α/(n−1) ),  where γ(x,y,0) = ????

Improved Method From a Combination of Bertozzi & Broggi 1998 with Jiang 2000

Coordinate Systems (note right-handedness):

[Figure: right-handed coordinate systems for the improved method: world axes (x, y, z), camera offsets (dx, dy, dz), and an m×n image with pixel axes (u, v); top and side views show the angles γ0, θ0, and α.]

u = 0…m−1 (rows), v = 0…n−1 (columns), so row = u + 1 and col = v + 1.


World to Image:

u(x,y,0) = ((m−1)/(2α)) · [ atan( dz · sin(atan((x−dx)/(y−dy))) / (x−dx) ) − θ0 + α ]

v(x,y,0) = ((n−1)/(2α)) · [ atan( (x−dx)/(y−dy) ) − γ0 + α ]

Image to World:

x(u,v) = dz · cot( (θ0 − α) + u·(2α/(m−1)) ) · sin( (γ0 − α) + v·(2α/(n−1)) ) + dx

y(u,v) = dz · cot( (θ0 − α) + u·(2α/(m−1)) ) · cos( (γ0 − α) + v·(2α/(n−1)) ) + dy

z(u,v) = 0
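The improved equations translate almost directly into MATLAB. Below is a rough sketch using our own function names (the imageToWorld placeholder from the pipeline sketch earlier); angles are in radians, and element-wise operators are used so that u, v (or x, y) can be whole matrices. The original Broggi form is the same code with (l, d, h, θ, γ) in place of (dx, dy, dz, θ0, γ0).

```matlab
% (Each function would live in its own .m file, e.g. imageToWorld.m and worldToImage.m.)

function [x, y] = imageToWorld(u, v, dx, dy, dz, theta0, gamma0, alpha, m, n)
% Improved image-to-world mapping onto the road plane z = 0.
thetaRay = (theta0 - alpha) + u .* (2*alpha/(m-1));   % vertical ray angle for pixel row u
gammaRay = (gamma0 - alpha) + v .* (2*alpha/(n-1));   % horizontal ray angle for pixel column v
x = dz .* cot(thetaRay) .* sin(gammaRay) + dx;
y = dz .* cot(thetaRay) .* cos(gammaRay) + dy;
end

function [u, v] = worldToImage(x, y, dx, dy, dz, theta0, gamma0, alpha, m, n)
% Improved world-to-image mapping for points on the road plane z = 0.
gammaRay = atan((x - dx) ./ (y - dy));
thetaRay = atan(dz .* sin(gammaRay) ./ (x - dx));
u = ((m-1)/(2*alpha)) .* (thetaRay - theta0 + alpha);
v = ((n-1)/(2*alpha)) .* (gammaRay - gamma0 + alpha);
end
```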


Results of inverse-perspective mapping to get the “bird's-eye view” for our MATLAB-generated test image:

[Inverse-perspective-mapped output image.]

Results on DARPA test image (just giving it a shot with the same camera parameters used for the MATLAB image):

[Inverse-perspective-mapped output image.]

(Obviously, we’ll need to play with the camera parameters a little.)


Eric:

1. Explored the output of the mapping functions (particularly the unexpected curving of horizontal lines) and generated the following document:

http://www.eng.utah.edu/~hamburge/BertozziAndBroggi_InversePerspEqnsTest.pdf

Randy:

1. Typed up the meeting notes into the weekly report.
2. Updated the web page.

Next Steps

1. Finish figuring out the image-to-world coordinate transformation. We have the basics working but need to resolve two issues:

a. With the current equations (adapted mostly from Broggi), horizontal lines are getting curved, which shouldn't happen. We need to figure out how to fix this.

b. The interpolation in the non-uniform grid using griddata takes a long time. Once we get the horizontal lines straightened out, we may be able to write a much simpler function to speed this up.

2. Pick out some representative test images and create the ground truth answers for those.

a. Next week or the one after, depending on how hard step 1 continues to be.

3. Take a good look at the list of processing techniques we've played with and those mentioned in the two survey articles we've read (see the reference list at the end of last week's report) and come up with a plan for a first attempt at a meaningful lane detection algorithm.

4. Future from here: implement it, then check performance measures.
5. Come up with some utility to display algorithm results for convenient visual feedback.


Recap of some of the lane-finding building blocks we have experimented with to date. See http://www.eng.utah.edu/~hamburge/CVprojectCode/

1. imagePatch_RMSerror.m – finds the RMS error between the image and a small horizontal or vertical lane-marker template. Seemed to give promising results, but very slow to run.
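The underlying idea is plain sliding-window template matching scored with RMS error. A rough sketch of what imagePatch_RMSerror.m presumably does (the variable names and the 0.2 threshold are illustrative, not taken from the actual file):

```matlab
% I: grayscale image (double), T: small lane-marker template (double), same intensity scale as I
[m, n]   = size(I);
[tm, tn] = size(T);
E = zeros(m - tm + 1, n - tn + 1);               % RMS error for each valid top-left position
for r = 1:size(E, 1)                             % the nested loops are why this is so slow
    for c = 1:size(E, 2)
        patch  = I(r:r+tm-1, c:c+tn-1);
        E(r,c) = sqrt(mean((patch(:) - T(:)).^2));
    end
end
Escaled = (E - min(E(:))) ./ (max(E(:)) - min(E(:)));   % "Scaled RMS Errors"
mask    = Escaled < 0.2;                                % "Thresholded RMS Errors"
```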

[Figure: Original Image, Test Patch, Scaled RMS Errors, and Thresholded RMS Errors for one lane-marker template.]


[Figure: Original Image, Test Patch, Scaled RMS Errors, and Thresholded RMS Errors for the other lane-marker template.]

[Figure: Combined Horizontal and Vertical Lane Marker Search Results.]


[Figure: Combined Horizontal and Vertical Lane Marker Search Results.]

2. testing_2007_02_08.m – Tried using dot products with the expected lane-marker orientation to boost edges of lane markers which are in shadows. (More comments in the m file.)
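A rough sketch of that idea (only a guess at what testing_2007_02_08.m actually does; the expected-orientation angle and the threshold are placeholders): smooth the slice, take the image gradient, dot it with a unit vector in the expected lane-marker edge direction, and keep only strong responses.

```matlab
% Islice: grayscale image slice (double); fspecial/imfilter are from the Image Processing Toolbox
G  = fspecial('gaussian', 9, 2);
Is = imfilter(Islice, G, 'replicate');          % smoothed slice

[Gx, Gy] = gradient(Is);                        % image gradient components
gradMag  = sqrt(Gx.^2 + Gy.^2);                 % "Smoothed Gradient Magnitude"

expectedDir = [cos(pi/3); sin(pi/3)];           % assumed lane-marker edge direction (placeholder angle)
dotProd = Gx .* expectedDir(1) + Gy .* expectedDir(2);    % "Raw Dot Product Result"
mask    = dotProd > 0.5 * max(dotProd(:));                % "Thresholded Dot Product" (placeholder threshold)
```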

[Figure: Original Image Slice, Smoothed Gradient Magnitude, Raw Dot Product Result, and Thresholded Dot Product.]

3. testing_2007_02_15.m – Tried using different gray levels based on a histogram analysis to threshold an image to help isolate potential lane markers. (More comments in the m file.)
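A minimal sketch of that kind of histogram-driven thresholding (the 20-bin count and the bin-3 choice mirror the figure labels below; the rest is an assumption about how testing_2007_02_15.m works):

```matlab
% I: grayscale image scaled to [0,1]
nBins = 20;
[counts, centers] = hist(I(:), nBins);    % 20-bin intensity histogram (plotted in the figure)

maskMean = I > mean(I(:));                % "Mean as Threshold"
edges    = linspace(min(I(:)), max(I(:)), nBins + 1);
maskBin3 = I > edges(4);                  % "Bin 3 of 20 as Threshold" (upper edge of bin 3)
```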


[Figure: thresholding results for the Raw Image, a Shifted and Scaled version, and an Equalized-Histogram version. Each row shows the 20-bin histogram plus the Mean-as-Threshold and Bin-3-of-20-as-Threshold results (Bin 14 of 20 for the equalized case).]

4. testing_2007_02_27.m – Explored the steerable filters used in McCall 2006 and found that a correctly-sized Laplacian may be able to find the centers of lane markers directly and distinguish between dark-bright-dark (lane marker) patterns and bright-dark-bright (tar strip) patterns. (More comments in the m file.)
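A rough 1-D sketch of that finding (the cross-section widths, sigma, and amplitudes are made-up illustration values, not the ones in testing_2007_02_27.m): convolve idealized lane-marker and tar-stripe cross sections with a second derivative of a Gaussian and look at the sign of the response at the stripe center.

```matlab
x     = -60:60;
sigma = 8;                                                        % assumed kernel width
g2 = (x.^2 - sigma^2) ./ (sigma^4) .* exp(-x.^2/(2*sigma^2));     % 1-D second derivative of Gaussian
g2 = g2 / sum(abs(g2));                                           % rough normalization

lane = double(abs(x) < 6);                    % dark-bright-dark: bright stripe on dark road
tar  = -double(abs(x) < 6);                   % bright-dark-bright: dark tar strip

laneResp = conv(lane, g2, 'same');            % strongly negative at the stripe center
tarResp  = conv(tar,  g2, 'same');            % strongly positive at the stripe center

markerScore = -laneResp .* (laneResp < 0);    % keep magnitude of negative values only, as in the last figure
```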


[Figure: the idealized lane-marker and tar-stripe intensity cross sections convolved (*) with a Gaussian smoothing kernel, a Gaussian first derivative, and a Gaussian second derivative, giving the smoothed intensity, smoothed first derivative, and smoothed second derivative for each. The lane marker's smoothed second derivative is strongly negative at the stripe center, while the tar stripe's is strongly positive.]


[Figure: Original Image and the Magnitude of the Laplacian of Gaussian Output, with only negative values retained.]