
Page 1: Object Removal in Multi-View Photos

Object Removal in Multi-View Photos

Aaron McClennon-Sowchuk, Michail Greshischev

Page 2: Object Removal in Multi-View Photos

Objectives

Remove an object from a set of images by using information (pixels) from other images in the set.

The images must be of the same scene but can vary in the time they were taken and/or in their perspective of the scene.

The allowed variance in time means objects may change location from one image to the next.

Applications: stock photography, video surveillance, etc.

Michail
The first slide should never be a wall of text.
Page 3: Object Removal in Multi-View Photos

Steps

1. Read Images

2. Project images in same perspective

3. Align the images

4. Identify differences

5. Infill objects

Michail
Image Registration
Michail
Image Rectification
Michail
Not needed, part of rectification. This can be deleted.
Michail
Change Detection
Page 4: Object Removal in Multi-View Photos

Reading Images

How are images represented?
– Matrices (M x N x P)
– M is the number of rows (image height), N is the number of columns (image width)
– P is 1 or 3 depending on the type of image:
  – 1: binary (strictly black/white) or gray-scale images
  – 3: coloured images (3 colour components: R, G, B)

What tools are capable of processing images?
– Many to choose from, but MatLab is ideal for matrices.
– Hence the name Mat(rix) Lab(oratory).

Michail
Represented where?
Michail
Examples? Also, we didn't choose MatLab because it can do matrices; Maple, Excel, and 'R' can do that too. We chose it because of its large image-processing foundation.
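A minimal MatLab sketch of this reading step (the file name is a placeholder, not one of the project's actual test images):

    % Read an image into an M x N x P matrix and inspect its dimensions.
    img = imread('scene1.jpg');        % placeholder file name
    [m, n, p] = size(img);             % rows, columns, colour planes
    if p == 3
        gray = rgb2gray(img);          % collapse R,G,B to a single gray plane if needed
    end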
Page 5: Object Removal in Multi-View Photos

Object Removal in Multi-View Photos

Image Rectification

Page 6: Object Removal in Multi-View Photos

Image Rectification

Figure 1: Example rectification of source images (1) to common image plane (2). 1

Transformation process used to project two-or-more images onto a common image plane.

Corrects image distortion by transforming the image into a standard coordinate system. 1

Page 7: Object Removal in Multi-View Photos

Image Rectification

To perform a transform...

Cameras are calibrated and provide internal parameters, resulting in an essential matrix representing the relationship between the cameras.
– We don't have access to the cameras' internal parameters.
– What if a single camera was used?

The more general case (without camera calibration) is represented by the fundamental matrix. 2

Page 8: Object Removal in Multi-View Photos

Fundamental Matrix

Algebraic representation of epipolar geometry.

3×3 matrix which relates corresponding points in stereo images.

7 degrees of freedom (nine entries, minus one for overall scale and one for the rank-2 constraint det F = 0), therefore at least 7 correspondences are required to compute the fundamental matrix. 3

Page 9: Object Removal in Multi-View Photos

Corresponding Points

Figure out which parts of an image correspond to which parts of another image.
– But what is a ‘part’ of an image?

A ‘part’ of an image is a Spatial Feature.

Spatial Feature Detection is the process of identifying spatial features in images.

Page 10: Object Removal in Multi-View Photos

Spatial Feature Detection - Edges

Canny, Prewitt, Sobel, Difference of Gaussians...

Figure 2: Example application of Canny Edge Detection 4
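A minimal MatLab sketch of this step (the image name is a placeholder; any of the listed operators can be substituted for 'canny'):

    % Detect edges; the result is a binary edge map.
    gray  = rgb2gray(imread('scene1.jpg'));   % placeholder image
    edges = edge(gray, 'canny');              % Canny edge detection
    imshow(edges);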

Page 11: Object Removal in Multi-View Photos

Spatial Feature Detection - Corners

Harris, FAST, SUSAN

Figure 2: Example application of Harris Corner Detection 5
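Similarly, a hedged MatLab sketch of Harris corner detection (corner() is an Image Processing Toolbox call; the image name and corner count are placeholders):

    % Detect up to 200 of the strongest Harris corners and overlay them.
    gray    = rgb2gray(imread('scene1.jpg'));   % placeholder image
    corners = corner(gray, 'Harris', 200);      % each row is an [x y] corner location
    imshow(gray); hold on;
    plot(corners(:,1), corners(:,2), 'r+');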

Page 12: Object Removal in Multi-View Photos

Feature Description

Simply identifying a feature point is not in itself useful.
– Consider how one would attempt to match detected feature points between multiple images.

Scale-invariant feature transform (SIFT) offers robust feature description. 6

– Invariant to scale
– Invariant to orientation
– Partially invariant to illumination changes

Page 13: Object Removal in Multi-View Photos

SIFT

Uses Difference of Gaussians along with multiple smoothing and resampling filters to detect key points (Feature Points with descriptor data)

Key point specifies 2D location, scale, and orientation.

Page 14: Object Removal in Multi-View Photos

SIFT

Figure 3: Sample image for SIFT application. 7

Page 15: Object Removal in Multi-View Photos

SIFT – Feature Points

Figure 4: Detected feature points via SIFT. 7

Page 16: Object Removal in Multi-View Photos

SIFT – Key Point

Figure 5: A SIFT key point in detail. 7

Page 17: Object Removal in Multi-View Photos

SIFT - Matching

Matches key points by identifying the nearest neighbour with the minimum Euclidean distance.

Ensures robustness via...
– Cluster identification by Hough transform voting
– Model verification by linear least squares
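A hedged sketch of this matching step using the VLFeat toolbox cited in reference 7 (image names are placeholders):

    % Detect SIFT key points and 128-D descriptors in two grayscale images.
    Ia = single(rgb2gray(imread('view1.jpg')));   % placeholder images
    Ib = single(rgb2gray(imread('view2.jpg')));
    [fa, da] = vl_sift(Ia);                       % frames: x, y, scale, orientation
    [fb, db] = vl_sift(Ib);
    % Match descriptors by nearest neighbour in Euclidean distance,
    % keeping only matches that pass the distance-ratio test.
    [matches, scores] = vl_ubcmatch(da, db);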

Page 18: Object Removal in Multi-View Photos

SIFT - Matching

Figure 5: Example of matched SIFT key points. Note its tolerance to image scale and rotation.

Page 19: Object Removal in Multi-View Photos

SIFT – Suitable for Multi-View?

SIFT fails to accurately match key points between images which vary significantly in perspective.

Figure 7 & 8: Comparison of SIFT accuracy with varying perspective angles.

Left image is 45 degrees with 152 matches.

Right image is 75 degrees with 11 matches. 8

Page 20: Object Removal in Multi-View Photos

SIFT – Suitable for Multi-View?

SIFT fails to accurately match key points between images which undergo non-scalable affine transformation or projection.

Figure 9: SIFT fails to identify any key point matches between rotated images on a cylinder. 8

Page 21: Object Removal in Multi-View Photos

ASIFT

Affine-SIFT (ASIFT) is a new framework for fully affine invariant image comparison.

Uses existing SIFT key point descriptors, but the matching algorithm is improved.

Page 22: Object Removal in Multi-View Photos

ASIFT – Improvements over SIFT

Simulated images are compared by a rotation, translation and zoom-invariant algorithm.
– (SIFT normalizes translation and rotation and simulates zoom.)

Page 23: Object Removal in Multi-View Photos

ASIFT – Improvements over SIFT

Figure 10: ASIFT (left) identifies 165 matches compared to SIFT’s (right) 11 matches on surface rotated 75 degrees. 8

Page 24: Object Removal in Multi-View Photos

ASIFT – Improvements over SIFT

Figure 10: ASIFT identifies 381 matches between rotated surfaces. 8

Page 25: Object Removal in Multi-View Photos

Image Rectification

Quick Review...

1. Given multiple images of the same scene from different perspectives...

2. We have identified & matched feature points using ASIFT.

We now have sufficient matching points to calculate the fundamental matrix.

Page 26: Object Removal in Multi-View Photos

Calculating Fundamental Matrix

Random Sample Consensus (RANSAC) is used to eliminate outliers from matched points.
1. Select 7 points at random.
2. Use them to compute a Fundamental Matrix between the image pair.
3. Project every point in the dataset onto the conjugate image pair using the Fundamental Matrix.
4. If at least 7 points were projected closer to their actual locations than their allowable errors, stop.
5. Use those 7 points to calculate the final Fundamental Matrix.
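A hedged MatLab sketch of this step using the Computer Vision Toolbox's built-in RANSAC estimator rather than a hand-written loop (matchedPts1 and matchedPts2 stand for the ASIFT correspondences as N x 2 [x y] arrays):

    % Estimate the fundamental matrix with RANSAC, rejecting outlier matches.
    [F, inlierIdx] = estimateFundamentalMatrix( ...
        matchedPts1, matchedPts2, ...             % placeholder correspondence arrays
        'Method', 'RANSAC', ...
        'NumTrials', 2000, ...
        'DistanceThreshold', 0.01);
    inliers1 = matchedPts1(inlierIdx, :);         % correspondences kept by RANSAC
    inliers2 = matchedPts2(inlierIdx, :);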

Page 27: Object Removal in Multi-View Photos

Example Image Rectification

Input Images

Page 28: Object Removal in Multi-View Photos

Example Image Rectification

ASIFT Matches

Page 29: Object Removal in Multi-View Photos

Example Image Rectification

RANSAC selection

Page 30: Object Removal in Multi-View Photos

Example Image Rectification

Resulting Rectification
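A hedged sketch of how the final warp could be produced in MatLab from F and the inlier points above (estimateUncalibratedRectification is a Computer Vision Toolbox call; I1 and I2 are placeholders for the two input views):

    % Compute projective transforms that map both views onto a common image plane.
    [t1, t2] = estimateUncalibratedRectification(F, inliers1, inliers2, size(I2));
    I1rect = imwarp(I1, projective2d(t1));   % rectified view 1
    I2rect = imwarp(I2, projective2d(t2));   % rectified view 2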

Page 31: Object Removal in Multi-View Photos

Identifying Image Differences

Possible Methods:
1. Direct subtraction
2. Structural Similarity Index (SSIM)
3. Complex Waveform SSIM

Michail
SSIM is "Structural SIMularity", no IndexIt should be writtenStructural Simularity (SSIM) Index
Page 32: Object Removal in Multi-View Photos

Identifying Image Differences

1. Direct subtraction
– Too good to be true! (way too much noise)

Michail
Why can't the noise be removed with filters, post-processing, or thresholds?
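For reference, a minimal MatLab sketch of direct subtraction with a simple threshold and blob clean-up (the file names, threshold, and blob size are arbitrary placeholders; in practice the residual noise mentioned above still dominates):

    % Per-pixel absolute difference of two aligned views, then thresholding.
    A    = rgb2gray(imread('rectified1.jpg'));   % placeholder rectified views
    B    = rgb2gray(imread('rectified2.jpg'));
    d    = imabsdiff(A, B);                      % absolute intensity difference
    mask = bwareaopen(d > 30, 50);               % threshold, then drop blobs under 50 px
    imshow(mask);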
Page 33: Object Removal in Multi-View Photos

Identifying differences

2. Structural Similarity Index (SSIM)
– Number 0-1 indicating how “similar” two pixels are.
– 1 indicates a perfect match, 0 indicates no similarities at all.
– Number calculated based on:
  – Luminance, a function of the mean intensity for a gray-scale image
  – Contrast, a function of the std. dev. of intensity for a gray-scale image

Michail
Structural Similarity (SSIM) Index
Michail
This would instantly get a "wrong!" from our advisor. SSIM does not operate on or represent a pixel-to-pixel similarity measurement. It measures over a window.
Michail
Wrong..."The resultant SSIM index is a decimal value between -1 and 1"1 is only reachable in the case of two identical sets of data.
Page 34: Object Removal in Multi-View Photos

Object Removal in Multi-View Photos

Complex Waveform SSIM

Page 35: Object Removal in Multi-View Photos

Complex Waveform SSIM

SSIM vs Complex Waveform SSIM (CWSSIM)

SSIM: Sensitive to pixel shifting (spatial shifts).
CWSSIM: Tolerates small amounts of pixel shifting (spatial shifts).

SSIM: Equal weight given to low-resolution and high-resolution differences.
CWSSIM: Bands are scalable.

SSIM: Reports an incorrect magnitude of error in blurred images.
CWSSIM: Correctly identifies the level of error in blur.

Page 36: Object Removal in Multi-View Photos

CWSSIM - Implementation

Steerable Pyramid constructed for each image.
– (A Steerable Pyramid is a linear multi-scale, multi-orientation image decomposition.)

SSIM value calculated for each band, from high to low frequency.

SSIM values for each band are scaled and summed.

Page 37: Object Removal in Multi-View Photos

CWSSIM - Implementation

Analyzing bands with multi-scale, multi-orientation image decomposition instead of direct pixel comparison provides tolerance for Spatial Shifts.

By reducing the contribution of SSIM indices belonging to high-frequency bands, we can reduce noise.
– ...but we lose recognition of changes in those frequencies.
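A simplified MatLab sketch of the weighted multi-band idea (this uses a plain Gaussian pyramid via impyramid rather than a true steerable pyramid, and the band weights are arbitrary placeholders):

    % Weighted multi-scale SSIM: score each pyramid level, scale, and sum.
    A = rgb2gray(imread('rectified1.jpg'));      % placeholder rectified views
    B = rgb2gray(imread('rectified2.jpg'));
    weights = [0.2 0.3 0.5];                     % de-emphasise the finest (noisiest) level
    total = 0;
    for level = 1:numel(weights)
        total = total + weights(level) * ssim(A, B);
        A = impyramid(A, 'reduce');              % move to the next, lower-frequency level
        B = impyramid(B, 'reduce');
    end
    disp(total)                                  % combined multi-band similarity score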

Page 38: Object Removal in Multi-View Photos

CWSSIM - Example

Input Images

Page 39: Object Removal in Multi-View Photos

CWSSIM - Example

Application of CWSSIM with equal frequency weights.

Page 40: Object Removal in Multi-View Photos

CWSSIM - Example

Input Images

Page 41: Object Removal in Multi-View Photos

CWSSIM - Example

Application of CWSSIM with decreased low frequency weight.

Page 42: Object Removal in Multi-View Photos

Identifying differences

Once again, way too much noise.

SSIM map: 0 = black pixel, 1 = white pixel.

Page 43: Object Removal in Multi-View Photos

Infilling the objects

Concerns:
– Identify regions to copy
  • Calculate a bounding box (smallest area surrounding the entire blob)
– How to distinguish noise from actual objects?
  • Area - blobs with area below a threshold are ignored
  • Location - blobs along an edge of the image are ignored
– Copying method
  • Direct - images from the same perspective
  • Manipulated pixels - images from different perspectives

Michail
This isn't ideal, we want the exact shape. Consider a case where the change is a diagonal line across the whole image.
Michail
Explain why there is noise around edge of image
Michail
How are these 2 different? In 1 case you transform the pixels then copy them. In the other case you copy them then transform them. The end result is identical...
Page 44: Object Removal in Multi-View Photos

Infilling the objects

Original bounding box results:

MatLab returns the left position, top position, width, and height of each box.

Michail
Formatting..
Page 45: Object Removal in Multi-View Photos

Infilling the objects

Result with small blobs and blobs along edges ignored:

Left: 119 Top: 52 Width: 122 Height: 264
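A hedged MatLab sketch of this filtering step ('mask' stands for the binary difference map from the previous step; the area threshold is an arbitrary placeholder):

    % Keep only blobs that are large enough and do not touch the image border,
    % then report a bounding box [left, top, width, height] for each survivor.
    mask  = bwareaopen(mask, 500);               % drop blobs below an area threshold
    mask  = imclearborder(mask);                 % drop blobs touching the image edge
    stats = regionprops(mask, 'BoundingBox', 'Area');
    for k = 1:numel(stats)
        bb = stats(k).BoundingBox;               % [left, top, width, height]
        fprintf('Left: %.0f Top: %.0f Width: %.0f Height: %.0f\n', bb);
    end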

Page 46: Object Removal in Multi-View Photos

Infilling the objects

Once regions identified, how can pixels be copied?
– Same perspective – direct copy is possible.

Michail
Once regions 'are' identified
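A minimal sketch of the same-perspective direct copy, assuming the 'stats' bounding boxes from the earlier sketch and placeholder 'source'/'target' images of identical size:

    % Replace the object region in the target image with the same region
    % from another image of the scene in which the object is absent.
    bb   = round(stats(1).BoundingBox);          % [left, top, width, height] of the blob
    rows = bb(2):bb(2) + bb(4) - 1;
    cols = bb(1):bb(1) + bb(3) - 1;
    target(rows, cols, :) = source(rows, cols, :);   % direct pixel copy between aligned views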
Page 47: Object Removal in Multi-View Photos

Infilling the objects

Result of direct copying

Page 48: Object Removal in Multi-View Photos

Infilling the objects

Different perspectives
– Goal: remove black trophy from left image

Page 49: Object Removal in Multi-View Photos

Infilling the objects

Direct copying produces horrendous results!

Rectified image (left); result (right).

Michail
This doesn't make sense. Why is the left "rectified image" shown? You're copying the blank wall from the right image, not the left. Show the right image rectified, not the left.
Page 50: Object Removal in Multi-View Photos

Work to come...

Copying techniques
– Need a better method for infilling objects between images in different perspectives. Perhaps use the same alignment matrix.

Anti-aliasing
– Method to smooth the edges around pixels copied from one image to another.
– The example looks alright, but other test cases could be improved.

User-friendly interface
– Current state: a dozen different MatLab scripts.
– In a perfect world, we'd have a nice interface to let the user load images and clearly display the results.

Michail
Fundamental Matrix. If you're copying from a rectified image, the Fundamental Matrix has already been applied to the region you're copying.
Page 51: Object Removal in Multi-View Photos

Thank You! Questions?

Page 52: Object Removal in Multi-View Photos

References

1. Oram, Daniel (2001). "Rectification for Any Epipolar Geometry"

2. Fusiello, Andrea (2000-03-17). "Epipolar Rectification". http://profs.sci.univr.it/~fusiello/rectif_cvol/rectif_cvol.html.

3. Richard Hartley and Andrew Zisserman (2004). “Multiple View Geometry in Computer Vision Second Edition”

4. Ma,Yi. (1996) Basic Image Processing Demos (for EECS20) http://robotics.eecs.berkeley.edu/~sastry/ee20/index.html

5. Mark Nixon & Alberto Aguado (2002), Feature Extraction & Image Processing, Newnes

6. Lowe, D. G., “Distinctive Image Features from Scale-Invariant Keypoints”, International Journal of Computer Vision, 60, 2, pp. 91-110, 2004.

7. Andrea Vedaldi and Brian Fulkerson (2005), “VL_SIFT” http://www.vlfeat.org/overview/sift.html

8. Jean-Michel Morel and Guoshen Yu (2010), “SIFT and ASIFT”,  ASIFT: A New Framework for Fully Affine Invariant Image Comparison

Page 53: Object Removal in Multi-View Photos

References

Z. Wang and A. C. Bovik, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. Image Processing, vol. 13, pp. 600 – 612, Apr. 2004. www.ece.uwaterloo.ca/~z70wang/publications/ssim.html