Motion and Depth from Optical Flow
Patrizia Baraldi†, Enrico De Micheli* & Sergio Uras*
† Lab. di Bioingegneria, Facoltà di Medicina, Università di Modena, Modena, Italy
* Dipartimento di Fisica dell'Università di Genova, Via Dodecaneso 33, 16146 Genova, Italy.
Passive navigation of mobile robots is one of the challenging goals of machine vision. This note demonstrates the use of optical flow, which encodes the visual information in a sequence of time varying images [1], for the recovery of motion and the understanding of the three dimensional structure of the viewed scene. By using a modified version of an algorithm which has recently been proposed to compute optical flow, it is possible to obtain dense and accurate estimates of the true 2D motion field. Then these estimates are used to recover the angular velocity of the viewed rigid objects. Finally it is shown that, when the camera translation is known, a coarse depth map of the scene can be extracted from the optical flow of real time varying images.
The navigation of a robot in any environment requires the knowledge of the motion of the robot relative to the environment, the three dimensional structure of the scene and the motion parameters of the objects moving in the scene. This information can be obtained by using active sensors and/or by passive vision, provided by cameras mounted on the robot.
In this note it is shown how passive vision can be used to recover depth when the camera on the robot is translating, and angular velocity when the camera looks at rotating objects. The proposed technique first computes the optical flow from a sequence of time varying images by using a modification of the algorithm recently proposed [2,3]. The angular velocity can be obtained by exploiting mathematical properties of the 2D motion field [5]. Depth is obtained from the computed optical flow, using an equation already proposed by many authors (Horn 1987, Tommasi personal communication).
THE COMPUTATION OF OPTICAL FLOW
The motion of objects in the viewed scene at every time t defines a 3D velocity field, which is transformed by the imaging device of a TV camera into a 2D vector field v = (v1(x,y), v2(x,y)), usually called the 2D motion field.
It has been shown [3] that a 2D vector field close to v, usually called the optical flow u, can be obtained by solving the vector equation
$$\frac{d}{dt}\,\mathrm{grad}\,E = 0 \qquad (1)$$
where d/dt indicates the total time derivative (or the Eulerian derivative) and grad E is the spatial gradient of the image brightness E(x,y,t). From eq. 1 the optical flow u can be computed as

$$u = -H^{-1}\,\mathrm{grad}\,\frac{\partial E}{\partial t} \qquad (2)$$

where H is the Hessian matrix of E(x,y).
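As an illustration, the per-pixel solution of eq. 2 can be sketched with numpy. This is a minimal sketch, not the authors' implementation: all function and variable names are ours, the derivatives are plain finite differences, and the input is assumed to be two consecutive grayscale frames stored as 2D float arrays.

```python
import numpy as np

def optical_flow_second_order(E_prev, E_next, dt=1.0):
    """Sketch of the second-order estimate u = -H^{-1} grad(dE/dt) of eq. 2.
    E_prev, E_next: two consecutive grayscale frames (2D float arrays).
    Names and finite-difference choices are illustrative, not from the paper."""
    Et = (E_next - E_prev) / dt            # temporal derivative of brightness
    E = 0.5 * (E_prev + E_next)            # brightness at the mid frame
    Ex, Ey = np.gradient(E)                # spatial gradient of E
    Exx, Exy = np.gradient(Ex)             # first row of the Hessian H
    Eyx, Eyy = np.gradient(Ey)             # second row of the Hessian H
    Etx, Ety = np.gradient(Et)             # grad of the time derivative
    det = Exx * Eyy - Exy * Eyx
    valid = np.abs(det) > 1e-9
    safe_det = np.where(valid, det, 1.0)   # avoid division by zero
    u = np.zeros(E.shape + (2,))
    # u = -H^{-1} grad(Et), with the 2x2 inverse written out explicitly
    u[..., 0] = np.where(valid, -(Eyy * Etx - Exy * Ety) / safe_det, 0.0)
    u[..., 1] = np.where(valid, -(-Eyx * Etx + Exx * Ety) / safe_det, 0.0)
    return u
```

On a quadratic brightness pattern translating rigidly, the estimate is exact in the interior of the image, since central differences recover the derivatives of a quadratic exactly.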
The relation between the optical flow u and the true 2D motion field v is:
$$u = v + H^{-1}\left[(Jv)^{T}\,\mathrm{grad}\,E - \mathrm{grad}\,\frac{dE}{dt}\right] \qquad (4)$$

where (Jv)^T indicates the transpose of the Jacobian matrix of v.
It is evident from eq. 2 that numerical stability of the computation of the optical flow requires a robust inversion of the matrix H, which is guaranteed when det H is large and the condition number C_H of the matrix H is close to one [7]. From eq. 4 it is evident that, when the term in brackets is bounded, the optical flow u is close to the true 2D motion field v when the entries of the matrix H^{-1} are small. Since H is symmetric, the condition number C_H is equal to λ_max/λ_min, where λ_max and λ_min are the largest and smallest of the two real eigenvalues of H. As a consequence, when det H is large and C_H is close to one, the two eigenvalues λ_max and λ_min will both be large, ensuring that:

i. the computation of the optical flow from eq. 2 is numerically robust;

ii. the optical flow u is usually close to the true 2D motion field v, with the exception of those locations in the image where (Jv)^T grad E or grad(dE/dt) are very large.

Therefore the conditions det H large and C_H ≈ 1 usually guarantee good recovery of the 2D motion field.
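The two reliability conditions can be sketched as a simple mask over the image, using the closed-form eigenvalues of a symmetric 2×2 matrix. The default thresholds follow those quoted in the figure captions (det H > 0.05, C_H < 10); the function name and the exact formulation are ours, not the authors'.

```python
import numpy as np

def reliable_mask(Exx, Exy, Eyy, det_min=0.05, cond_max=10.0):
    """Keep only locations where H is safely invertible:
    det H large and condition number C_H = lambda_max/lambda_min close to one.
    Exx, Exy, Eyy: second spatial derivatives of E (H is symmetric, Exy == Eyx)."""
    det = Exx * Eyy - Exy**2
    trace = Exx + Eyy
    # eigenvalues of a symmetric 2x2 matrix: trace/2 +- sqrt((trace/2)^2 - det)
    disc = np.sqrt(np.maximum((trace / 2.0)**2 - det, 0.0))
    lam_max = np.abs(trace / 2.0 + disc)
    lam_min = np.abs(trace / 2.0 - disc)
    cond = np.where(lam_min > 0, lam_max / np.where(lam_min > 0, lam_min, 1.0), np.inf)
    return (np.abs(det) > det_min) & (cond < cond_max)
```

A well-conditioned Hessian (e.g. diag(2, 2)) passes both tests, while a nearly singular one (e.g. diag(2, 0.01)) fails both.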
We now describe the procedure used to compute the optical flow from a sequence of time-varying images, such as the ones shown in Figures 1A and 3A.
205 AVC 1989 doi:10.5244/C.3.35
Figure 1. Computation of optical flow for a rotating object. A) One frame of a sequence of an indoor scene. B) Raw optical flow obtained with spatial gaussian smoothing with mask size = 5 pixels, temporal gaussian smoothing with mask size = 9 pixels, det H > 0.05 and C_H < 10.
The optical flow obtained by solving eq. 2 is first computed at each location. Many vectors with an erroneous magnitude or direction are clearly present. By choosing only those vectors obtained when the computation of H^{-1} is numerically robust (det H large and C_H ≈ 1), a sparse, but almost exact optical flow is obtained (Figures 1B and 3B).
ANGULAR VELOCITY FROM OPTICAL FLOW

In order to recover a dense and meaningful optical flow it is desirable to fill in the empty areas of the optical flow by using a filling-in procedure. The optical flow is then smoothed by the convolution with a gaussian symmetrical filter and the results are shown in Figures 2A and 4A. In Figure 1A a scene in which the rotation axis was about parallel with the optical axis of the viewing camera is displayed. In Figure 2A the immobile point of the rotation, i.e. the perspective projection of a point which lies on the rotation axis, is clearly present and the angular velocity ω is equal to |λ|, where λ is an eigenvalue of the Jv matrix computed at the immobile point (if the optical axis is parallel to the rotation axis we have |λ1| = |λ2|) [3]. Figure 2B compares the true angular velocity (solid straight line) and the computed angular velocity for the sequence of images of Figure 1A. The agreement between the true and computed angular velocity depends on the texture of the scene. When the

Figure 2. Computation of angular velocity. A) Optical flow of Figure 1B after filling-in and smoothing with gaussian filter with mask size = 21. B) Angular velocity computed from the optical flow of Figure 2A versus time (in degrees/frame).
Figure 3. Computation of optical flow for ego-motion. A) One frame of a sequence representing two book stacks acquired by a camera moving towards them. B) Raw optical flow obtained with a gaussian smoothing with mask size = 7 pixels, no temporal smoothing, det H > 0.1 and condition number C_H < 5. The focus of expansion is visibly located at the left side of the image plane; the angle between the direction of translation and the optical axis was about 10 degrees.
viewed scene is densely textured the accuracy can be as high as 95% (the mean accuracy in the sequence of Figure 1A is 95.4%).
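The eigenvalue criterion for the angular velocity can be sketched as follows; this is an illustrative fragment under the rotation-about-the-optical-axis case, with names of our own choosing, not the authors' code.

```python
import numpy as np

def angular_velocity_at_point(u, i, j):
    """Estimate the angular velocity as |lambda|, the modulus of an eigenvalue
    of the flow Jacobian Jv evaluated at the immobile point (i, j).
    u: (H, W, 2) optical flow field; finite differences approximate Jv."""
    du0 = np.gradient(u[..., 0])   # [d u0 / d axis0, d u0 / d axis1]
    du1 = np.gradient(u[..., 1])
    J = np.array([[du0[0][i, j], du0[1][i, j]],
                  [du1[0][i, j], du1[1][i, j]]])
    lam = np.linalg.eigvals(J)
    # for a rotation the two (complex) eigenvalues have equal modulus omega
    return np.abs(lam[0])
```

For a synthetic rotational flow about a known centre, Jv at the centre is [[0, -ω], [ω, 0]] with eigenvalues ±iω, so the returned modulus equals ω.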
DEPTH FROM OPTICAL FLOW

Optical flow can also be used to recover depth from motion by using the formula
Figure 4. Computation of a depth map. A) Optical flow of Figure 3B after filling-in and smoothing with a gaussian filter with mask size = 17. B) Depth map obtained from the optical flow of Figure 4A. The true distances from the image plane were 37 cm, 56 cm and 72 cm for the book stack in the lower left side, for the one on the right side and for the background respectively. Computed mean values for the three regions were 35 ± 5 cm, 61 ± 9 cm and 62 ± 13 cm.
$$Z = \frac{v_{camera}\,D}{V} \qquad (5)$$
where v_camera is the velocity of the moving camera, D is the distance of the point (x,y) on the image plane from the focus of expansion F_e, V is the amplitude of the flow in (x,y), and Z is the depth of the point in the scene projected in (x,y). Figure 3A shows an image from a sequence taken by a camera translating towards
two book stacks at different depths. The books and the background were covered with newspaper sheets in order to increase the texture of the scene. To avoid the computation of depth near the focus of expansion, where the optical flow values are noisy, the angle between the direction of translation and the optical axis was set to about 10 degrees. Consequently the focus of expansion lay near the image boundary. Figure 3B reproduces the computed optical flow and Figure 4A the optical flow after the filling-in and smoothing procedures. When depth is averaged over the results obtained from a few frames we obtain the map shown in Figure 4B. The results obtained for the higher book stack of the scene are in good agreement with the true depth, whereas parts of the 3D structure of the lower one are noisy and almost indistinguishable from the background (see legend for numerical details).
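The depth recovery of eq. 5 can be sketched as a per-pixel computation; a minimal illustration with names of our own choosing, assuming the focus of expansion and the camera speed are known.

```python
import numpy as np

def depth_from_flow(u, foe, v_camera):
    """Depth from eq. 5: Z = v_camera * D / V, where D is the distance of each
    pixel from the focus of expansion and V the flow magnitude there.
    u: (H, W, 2) optical flow; foe: (x0, y0) in pixel coordinates;
    v_camera: camera speed in world units per frame (assumed known)."""
    H, W = u.shape[:2]
    x, y = np.meshgrid(np.arange(H, dtype=float), np.arange(W, dtype=float),
                       indexing='ij')
    D = np.hypot(x - foe[0], y - foe[1])   # distance from the FOE
    V = np.hypot(u[..., 0], u[..., 1])     # amplitude of the flow
    # near the FOE the flow vanishes and depth is undefined; return inf there
    return np.where(V > 1e-6, v_camera * D / np.maximum(V, 1e-6), np.inf)
```

On a synthetic expanding flow generated from a frontoparallel plane at constant depth, the formula recovers that depth exactly away from the focus of expansion, which is consistent with the noisy-near-FOE behaviour noted above.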
This note presents an algorithm which is adequate to compute a dense optical flow from which it is possible to recover motion information and depth. The computation of optical flow occurs in two steps: in the first step, by using eq. 2, a raw optical flow is obtained out of which only the reliable displacement vectors are selected; in the second step a dense optical flow is obtained by filling in holes of the optical flow produced in the first step. Different filling-in procedures can be used with different results. The proposed algorithm seems very suitable to solve the problem of motion analysis in machine vision.
We wish to thank Vincent Torre for his continuous encouragement and help and Marco Campani who has been able to modify LaTeX to meet the editor's requirements. This research has been supported by the ESPRIT II project No. 2502.
REFERENCES
1. Gibson, J.J. 1950. The Perception of the Visual World. Boston: Houghton Mifflin.
2. Uras, S., Girosi, F., Verri, A., & V. Torre. 1988. "A computational approach to motion perception". Biological Cybernetics, 60: 69-87.
3. Girosi, F., Verri, A., & V. Torre. 1989. "Constraints in the Computation of Optical Flow". Proceedings of the IEEE Workshop on Visual Motion, Irvine CA.
4. Koenderink, J.J. & A.J. van Doorn. 1977. "How an ambulant observer can construct a model of the environment from the geometrical structure of the visual inflow". In G. Hauske and E. Butenandt (Eds.) Kybernetik 1977 (Oldenbourg, München).
5. Verri, A., Girosi, F. & V. Torre. 1989. "Mathematical Properties of the 2D Motion Fields: from Singular Points to Motion Parameters". Proceedings of the IEEE Workshop on Visual Motion, Irvine CA.
6. Dreschler, L. & H.-H. Nagel. 1982. "Volumetric model and 3D trajectory of a moving car derived from monocular TV frame sequences of a street scene". Comput. Vision Graph. Image Process. 20: 199-228.
7. Lanczos, C. 1961. Linear Differential Operators.London: D. Van Nostrand Company.
8. Ullman, S. 1983. "Recent computational studies in the interpretation of structure from motion". In Human and Machine Vision, A. Rosenfeld & J. Beck (Eds.), Academic Press, New York.