CSP03-04 - Visual input processing
Visual input processing
Lecturer: Smilen Dimitrov
Cross-sensorial processing – MED7
Introduction
• The immobot base exercise
• Work on the visual input
• Goal – object localization in 3D
• Setup:
  – PC
  – Two Logitech QC Zoom webcams
Setup
• Setup for a PC:
  1. Logitech QuickCam (QC) drivers
  2. QuickTime
  3. WinVDig (the version corresponding to the installed version of QuickTime)
  4. Max/MSP/Jitter
Setup
• Camera parameters
  – Image sensor: 1/4" color 640 × 480 pixel CMOS
  – Lens type: 3P
  – F-number: F/2.4
  – Effective focal length: 5.0 mm
Setup
• Low-tech configuration – stereo imaging not guaranteed (frame delays)
• Other options:
  – Bumblebee camera
    • true stereo camera
    • FireWire (power issues, drivers)
  – Axis 206 camera
    • IP camera (drivers)
Goal of the vision processing algorithm
• Object detection:
  – the application needs to detect the presence of a new object whenever it enters the monitored environment.
• Object recognition:
  – once a new object is detected, it needs to be classified to determine its type (e.g., a car versus a truck, a tiger versus a deer).
• Object tracking:
  – assuming the new object is of interest to the application, it can be tracked as it moves through the environment. Tracking involves computing the current location of the object and its trajectory.
• Here: color tracking, with estimation of 3D location through two-view geometry (stereopsis)
Color tracking
• Using an algorithm provided by Max/MSP/Jitter – jit.findbounds
• Input – the min and max of the color range to react to, and the video
• Output – min and max (x, y) coordinates of the rectangle where the color has been found
Color tracking
• jit.findbounds output – rectangle
• Center coordinate:
  x0 = (xmin + xmax) / 2
  y0 = (ymin + ymax) / 2
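The center calculation can be sketched in the JavaScript dialect that Max's js/jsui objects run. The argument order (xmin, ymin, xmax, ymax) is an assumption about how the jit.findbounds output list is unpacked, not something stated on the slide:

```javascript
// Sketch: blob center from the min/max corners reported by
// jit.findbounds (assumed list order: xmin, ymin, xmax, ymax).
function blobCenter(xmin, ymin, xmax, ymax) {
    return {
        x: (xmin + xmax) / 2,  // x0 = (xmin + xmax) / 2
        y: (ymin + ymax) / 2   // y0 = (ymin + ymax) / 2
    };
}

var c = blobCenter(10, 20, 30, 60);
// c.x === 20, c.y === 40
```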
Color tracking – example code
• Can be performed in Max/MSP JavaScript using jsui – slow!
Color tracking - background
• Video tracking – the process of locating a moving object (or several) over time using a camera. An algorithm analyses the video frames and outputs the location of moving targets within the video frame.
  – Video tracking systems usually employ a motion model which describes how the image of the target might change for different possible motions of the object to track.
• Video tracking approaches:
  – Blob tracking: segmentation of the object interior (for example blob detection, block-based correlation, or optical flow)
  – Contour tracking: detection of the object boundary (e.g. active contours or the Condensation algorithm)
  – Visual feature matching: registration
• Color tracking is a type of blob tracking
Color tracking - background
• Blob detection refers to visual modules aimed at detecting points and/or regions in the image that are either brighter or darker than their surroundings. There are two main classes of blob detectors:
  (i) differential methods based on derivative expressions, and
  (ii) methods based on local extrema in the intensity landscape.
• A blob (binary large object) is an area of touching pixels with the same logical state.
• A group of pixels organized into a structure is commonly called a blob. Problems related to blobs:
  1. Where are the edges?
  2. Where is the center?
  3. How many pixels does it contain?
  4. What is the average pixel intensity?
  5. What is the blob's orientation (angle)?
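A minimal sketch of how some of these questions can be answered for a small binary image (a 2D array of 0/1), assuming the whole foreground forms a single blob; function and field names are illustrative:

```javascript
// Sketch: basic blob statistics on a binary image with one blob.
function blobStats(img) {
    var n = 0, sumX = 0, sumY = 0;
    var minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
    for (var y = 0; y < img.length; y++) {
        for (var x = 0; x < img[y].length; x++) {
            if (img[y][x]) {
                n++; sumX += x; sumY += y;           // for pixel count and centroid
                if (x < minX) minX = x;
                if (x > maxX) maxX = x;
                if (y < minY) minY = y;
                if (y > maxY) maxY = y;              // for the edges
            }
        }
    }
    return {
        pixels: n,                                   // how many pixels it contains
        center: { x: sumX / n, y: sumY / n },        // where the center is
        bounds: { minX: minX, minY: minY, maxX: maxX, maxY: maxY }  // where the edges are
    };
}

var s = blobStats([
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0]
]);
// s.pixels === 4, s.center.x === 1.5, s.center.y === 1.5
```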
Color tracking - background
• Blob center calculation – simple method
Color tracking - background
• A blob (binary large object) is an area of touching pixels with the same logical state.
  – All pixels in an image that belong to a blob are in a foreground state.
  – All other pixels are in a background state.
  – In a binary image, pixels in the background have values equal to zero, while every nonzero pixel is part of a binary object.
• For jit.findbounds, this logical test of belonging to the blob is whether the color of the currently tested pixel falls within the range set to be detected.
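The per-pixel membership test can be sketched as follows. The RGB plane names and the normalised 0..1 range are assumptions for illustration; jit.findbounds itself works on whatever planes the incoming matrix carries:

```javascript
// Sketch of the logical test: a pixel "belongs" when every colour
// component lies inside the configured [min, max] range.
function inColorRange(pixel, min, max) {
    return pixel.r >= min.r && pixel.r <= max.r &&
           pixel.g >= min.g && pixel.g <= max.g &&
           pixel.b >= min.b && pixel.b <= max.b;
}

var lo = { r: 0.7, g: 0.0, b: 0.0 };                 // min of the range
var hi = { r: 1.0, g: 0.3, b: 0.3 };                 // max of the range
var hit  = inColorRange({ r: 0.9, g: 0.1, b: 0.1 }, lo, hi);  // reddish pixel
var miss = inColorRange({ r: 0.1, g: 0.9, b: 0.1 }, lo, hi);  // greenish pixel
// hit === true, miss === false
```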
Color tracking - background
• What the human eye easily identifies as several distinct but touching blobs may be interpreted by software as a single blob.
• A reliable software package will tell you how touching blobs are defined. For example, you can define touching pixels as only those adjacent along the vertical or horizontal axis, or also include diagonally adjacent pixels.
• Segmentation of the image – separating the good blobs from the background and from each other, as well as eliminating everything else that is not of interest.
• Segmentation usually involves a binarization operation – a black-and-white image results
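The choice between the two definitions of "touching" above directly changes how many blobs are counted. A sketch using a flood fill over a binarized image (function names are mine):

```javascript
// Sketch: count connected blobs in a binary image, with a flag
// choosing 4-connectivity (horizontal/vertical neighbours only)
// or 8-connectivity (diagonal neighbours also count as touching).
function countBlobs(img, useDiagonals) {
    var h = img.length, w = img[0].length;
    var seen = img.map(function (row) { return row.map(function () { return false; }); });
    var dirs4 = [[1, 0], [-1, 0], [0, 1], [0, -1]];
    var dirs = useDiagonals ? dirs4.concat([[1, 1], [1, -1], [-1, 1], [-1, -1]]) : dirs4;
    var blobs = 0;
    for (var y = 0; y < h; y++) {
        for (var x = 0; x < w; x++) {
            if (!img[y][x] || seen[y][x]) continue;
            blobs++;                            // new blob found: flood-fill it
            var stack = [[x, y]];
            seen[y][x] = true;
            while (stack.length) {
                var p = stack.pop();
                for (var d = 0; d < dirs.length; d++) {
                    var nx = p[0] + dirs[d][0], ny = p[1] + dirs[d][1];
                    if (nx >= 0 && nx < w && ny >= 0 && ny < h &&
                        img[ny][nx] && !seen[ny][nx]) {
                        seen[ny][nx] = true;
                        stack.push([nx, ny]);
                    }
                }
            }
        }
    }
    return blobs;
}

// Two pixels touching only diagonally: one blob or two, depending
// on the definition of "touching".
var img = [[1, 0],
           [0, 1]];
var n4 = countBlobs(img, false);  // 2 blobs: diagonals don't touch
var n8 = countBlobs(img, true);   // 1 blob: diagonals touch
```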
Color tracking - background
• Blob analysis – logical – (generally) performed on a black-and-white image
• Brightness – rectangle algorithm
  – The rectangle algorithm keeps track of four points in each frame: the topmost, leftmost, rightmost, and bottommost points where the brightness exceeds a certain threshold value.
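The rectangle algorithm described above can be sketched over a greyscale frame represented as a 2D array of 0..255 values (representation and names are my assumptions):

```javascript
// Sketch of the brightness rectangle algorithm: track the extreme
// pixels whose brightness exceeds a threshold.
function brightRect(frame, threshold) {
    var minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
    for (var y = 0; y < frame.length; y++) {
        for (var x = 0; x < frame[y].length; x++) {
            if (frame[y][x] > threshold) {
                if (x < minX) minX = x;   // leftmost
                if (x > maxX) maxX = x;   // rightmost
                if (y < minY) minY = y;   // topmost
                if (y > maxY) maxY = y;   // bottommost
            }
        }
    }
    if (minX === Infinity) return null;   // nothing exceeded the threshold
    return { left: minX, top: minY, right: maxX, bottom: maxY };
}

var r = brightRect([
    [ 10,  10, 10],
    [ 10, 250, 10],
    [200,  10, 10]
], 128);
// r: { left: 0, top: 1, right: 1, bottom: 2 }
```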
Color tracking - background
• Tracking types:
  (I) objects of a given nature, e.g., cars, people, faces
  (II) objects of a given nature with a specific attribute, e.g., moving cars, walking people, talking heads, the face of a given person
  (III) objects of a priori unknown nature but of specific interest, e.g., moving objects, objects of semantic interest manually picked in the first frame
• (I) and (II) – part of the input video frame is searched against a reference model (image patches, or overall shape [geometry]) describing the appearance of the object.
• (III) – the reference can be extracted from the first frame and kept frozen – color tracking
• Recent color tracking algorithms:
  – MeanShift
  – Continuously Adaptive Mean Shift (CamShift)
Color tracking - background
• Advanced application of tracking in stereo – matching
• Starting from a collection of images or a video sequence, the first step consists in relating the different images to each other.
• Two images are shown with the extracted corners. Note that it is not possible to find the corresponding corner for every corner, but for many of them it is.
• In our example, we have only one 3D point to deal with – we assume the data obtained from the two cameras are matched.
Camera parameters
• Extrinsic and intrinsic parameters
• Extrinsic parameters
  – the orientation and position of the camera Euclidean coordinate system with respect to the world Euclidean coordinate system. This relation is given by the matrices R and t.
  – Thus there are six extrinsic camera parameters: three rotations and three translations.
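In the standard textbook notation (not spelled out on the slide), R and t relate world and camera coordinates as:

```latex
% X_cam: point in camera coordinates; X_world: the same point in world
% coordinates. R is a 3x3 rotation (3 parameters), t a 3x1 translation
% (3 parameters) -- together the six extrinsic parameters.
\mathbf{X}_{\mathrm{cam}} = R\,\mathbf{X}_{\mathrm{world}} + \mathbf{t}
```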
Camera parameters
• Extrinsic and intrinsic parameters
• Intrinsic parameters – coefficients of calibration matrix K
• px and py are the width and the height of the pixels, c = [cx cy 1]^T is the principal point (defined as the intersection of the optical axis and the retinal [image] plane – the center of the image plane), and a is the skew angle, as indicated
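One common textbook form of K consistent with these symbols; this is a sketch, and the slide's exact convention for the skew entry may differ:

```latex
% f: focal length; p_x, p_y: pixel width and height; (c_x, c_y):
% principal point; s is a skew term determined by the skew angle a
% (s = 0 for rectangular pixels, the usual case).
K = \begin{pmatrix}
      f/p_x & s      & c_x \\
      0     & f/p_y  & c_y \\
      0     & 0      & 1
    \end{pmatrix}
```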
Stereo 3D localization algorithm
• Problem:
Stereo 3D localization algorithm
• Writing the system for the two cameras
Stereo 3D localization algorithm
• Special case – canonical configuration – binocular
  – The model has two identical cameras separated only in the X direction by a baseline distance b. The image planes are coplanar in this model.
  – The baseline is aligned to the horizontal coordinate axis, the optical axes of the cameras are parallel, the epipoles move to infinity, and the epipolar lines in the image planes are parallel.
• Rotation matrices are identity.
• b – baseline distance, f – focal length
• Extrinsic parameters
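In this canonical configuration, depth follows from the disparity between the matched image x-coordinates of the same point in the left and right views. A sketch of the standard triangulation result Z = b·f / (xL − xR); the numeric values are purely illustrative:

```javascript
// Sketch: depth from disparity in the canonical (parallel-axis)
// stereo configuration. b: baseline, f: focal length in pixels,
// xL/xR: matched x-coordinates in the left and right images.
function depthFromDisparity(b, f, xL, xR) {
    var disparity = xL - xR;      // shrinks as the point moves away
    return (b * f) / disparity;
}

// Illustrative numbers: b = 100 mm, f = 500 px, disparity = 10 px
// -> Z = 5000 mm (Z comes out in the units of b).
var z = depthFromDisparity(100, 500, 110, 100);
// z === 5000
```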
Stereo 3D localization algorithm
• Intrinsic parameters are ignored here – no calibration!
• We will try to scale the coordinates manually until we get something meaningful.
Stereo 3D localization algorithm
• Intersection of the lines in 3D is not guaranteed
• Derivation using the principle behind CPA (closest points of approach)
  – Looking for the closest points on the lines
  – Solution using parametric equations
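The parametric solution can be sketched as follows, using the standard closed-form expressions for the closest points on two 3D lines P(s) = P0 + s·u and Q(t) = Q0 + t·v; the helper names are mine:

```javascript
// Sketch: closest points of approach (CPA) between two 3D lines,
// returning the midpoint of the connecting segment (the estimate
// the slides call C_MID).
function dot(a, b) { return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]; }
function sub(a, b) { return [a[0]-b[0], a[1]-b[1], a[2]-b[2]]; }
function addScaled(p, v, s) { return [p[0]+s*v[0], p[1]+s*v[1], p[2]+s*v[2]]; }

function cpaMidpoint(P0, u, Q0, v) {
    var w0 = sub(P0, Q0);
    var a = dot(u, u), b = dot(u, v), c = dot(v, v);
    var d = dot(u, w0), e = dot(v, w0);
    var denom = a * c - b * b;        // 0 when the lines are parallel
    var s = (b * e - c * d) / denom;
    var t = (a * e - b * d) / denom;
    var P = addScaled(P0, u, s);      // closest point on line 1
    var Q = addScaled(Q0, v, t);      // closest point on line 2
    return [(P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2, (P[2] + Q[2]) / 2];
}

// Two skew lines whose closest segment runs from z = 0 to z = 1:
var mid = cpaMidpoint([0, 0, 0], [1, 0, 0],   // line 1 along x at z = 0
                      [0, 0, 1], [0, 1, 0]);  // line 2 along y at z = 1
// mid === [0, 0, 0.5]
```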
Stereo 3D localization algorithm
• Finally, we obtain the estimated point CMID, which we declare to be our object location O(X, Y, Z)
• We will use this in code to calculate the vector location from the coordinates obtained from color tracking
• It will be programmed in JavaScript and called from Max/MSP/Jitter
Problems with the approach
• No calibration – no intrinsic parameters taken into account
• Low-end cameras – aberrations
• Low-end cameras – radial distortions
• No guarantee of time sync between left and right images
• In general – approximate/illustrative
Implementation in Max/MSP/Jitter