
International Journal of Advance Foundation and Research in Computer (IJAFRC)

Volume 1, Issue 5, May 2014. ISSN 2348 - 4853

44 | © 2014, IJAFRC All Rights Reserved www.ijafrc.org

An Intelligent Object Inpainting Approach For Video Repairing

Mr. Abhijeet A. Chincholkar, Prof. Salim A. Chavan

PG Student, M.E. Digital Electronics, DBNCOET Yavatmal, Maharashtra, India.
Associate Professor and Vice Principal, DBNCOET Yavatmal, Maharashtra, India.
[email protected], [email protected]

A B S T R A C T

This paper improves a technique to speed up video inpainting for video repairing. The approach reduces the search area by using 2D slices rather than 3D patches. The method uses the RANSAC algorithm for the transformation-estimation block, the KLT algorithm for feature tracking, and the Max compiler to compile the C program, which reduces processing time. To overcome the over-smoothing problem, the method finds the best match, giving good results in less time, while trying to maintain spatial consistency and temporal motion continuity of an object simultaneously. Home-made video includes jitter (unintended motion) and some kind of intentional motion such as pan or zoom. Viewers always prefer jitter-free videos created by a smoothly moving camera. In video analysis, by contrast, alignment to a fixed, stable background is sometimes preferred. This paper presents a simple algorithm that can remove both types of motion by efficiently tracking background points while ignoring moving foreground points. The approach is similar to image mosaicing, but the result is a video rather than an enlarged image. It is related to multiple-object tracking but is considerably simpler because moving objects need not be tracked. The algorithm receives one video as input and returns one or more stabilized videos; the video is broken into parts when the algorithm detects a background change, and it then fixes upon a new background.

Index Terms: Feature Extraction, Image Mosaicing, Feature Selection, Video Stabilization, Matrix Transformation, Degrees of Freedom.

I. INTRODUCTION

Two types of motion dominate videos taken with a hand-held camera. The first is intentional motion such as panning or zooming; the second is unintentional motion, i.e., unwanted shakes and jitters. Recent work on video stabilization tries to remove shakes and jitters while preserving a smoothed version of the most probably intended motion. This is done by first identifying common features in the scene and estimating their frame-to-frame movement. The algorithms then proceed as follows: 1) estimate the original camera motion, 2) fit a proposed model capturing a smoothed version of the most probably intended camera motion, and 3) solve for per-frame transformation matrices and merge them into a new stabilized video without jitter and with apparently smooth camera motion.

In recent work following the same approach, Grundmann et al. [6] demonstrate, in the context of YouTube, a robust algorithm for stabilizing video to produce a final result that appears as if shot by a professional cameraman. In some situations associated with video analysis, the goal is instead to stabilize a video with respect to its background and thereby remove observable camera motion entirely. There are many reasons to remove camera motion: stabilization enables background modeling [16] and makes an object's motion clearer to both human and machine observers. In some settings, however, the goal of removing camera motion is not practical: where cameras are mounted on moving vehicles, more complication arises, and multiple-object tracking with camera modeling is


required. This work on stabilization likewise incorporates detection of salient features [6], Lucas-Kanade tracking [6, 11], RANSAC [6] to test potential feature correspondences, and finally motion modeling in terms of frame-to-frame alignment transforms [6]. The two most notable aspects of the algorithm are that it introduces a staged multi-frame mechanism for admitting new salient features into the tracking procedure, and that it decides whether salient points are part of the stable background based on measurements over multiple frames. The transformation matrix that registers frame t to the background is computed using features initially detected in frame t-2. These features are tracked to frame t-1 using the Lucas-Kanade-Tomasi algorithm, and their motion is compared with the camera motion at t-1. Features that move consistently with the camera, and are therefore potential background points, are forwarded to frame t, whereas features with inconsistent motion are presumed to be foreground features and are dropped. As a result, the transformation matrix at time t is estimated by applying the RANSAC algorithm and motion modeling to a set of features that lie on background points and have persisted for two time steps. Likewise, the transformation at time t+1 is estimated from features first extracted at time t-1. This feature selection process adds robustness to the procedure, and results show that it handles registration errors better than previous techniques. Most prior work differs from ours as a direct consequence of our goal of stabilizing relative to a fixed background: in our approach, the final step of generating the stabilized video establishes a common reference frame and maps all frames back to it. This mapping of videos to a common reference frame for video repairing is mostly done as in image mosaicing [3]. The remainder of this paper reviews related work on video stabilization and image mosaicing, presents our video stabilization algorithm, and gives an experimental evaluation comparing this method with alternatives based on prior video stabilization and image mosaicing algorithms in the literature.

II. RELATED WORK

The main aim of video stabilization is to remove unwanted motion. It is almost impossible for a person holding a camera not to introduce small but rapid movements such as jitter. Video stabilization algorithms therefore typically estimate a smooth camera motion path from the video frame data [11]. Grundmann et al. [6] presented a robust method for finding an L1-optimal camera path to generate stabilized videos. The algorithm is based on a linear programming framework that finds an optimal partition of the smooth camera path, modeling the path by fitting its portions to constant, linear, and parabolic motion models. A cropped window of fixed aspect ratio is moved along that optimal path so as to include salient points or regions while minimizing an L1 smoothness constraint. The goal is to stabilize the video so that it appears to have been shot by a professional cinematographer using expensive, physically stabilized cameras; such video is pleasant for human viewers.

III. OVERVIEW OF COMMON APPROACHES

Previous work on image alignment and image mosaicing is more closely related to the goal of creating stabilized backgrounds than the recent work on video stabilization for smooth camera-motion trajectories. Both lines of work share common aspects, such as extracting matching features in successive frames. Features such as Harris corners [3], SIFT features, or Good Features [12, 4] can be used for feature selection. Feature-matching techniques, such as correlation of a window centered about a feature point [3], or tracking methods, such as the Lucas-Kanade tracker [12] or Kalman filters [4], are most often used for tracking features from frame to frame. Frame-to-frame motion models also play a vital role in these algorithms. Transformations with several degrees of freedom (DOF) are common, including the similarity


transform (4 DOF) [13], the affine transform (6 DOF) [7], and the homography (8 DOF) [4, 3, 12]. Such transformations are typically estimated from matching features between pairs of video frames. Experiments show that a balance must be struck between the ability to model more complex motions and the need to find more matching features to constrain the additional DOF: higher-DOF motion models have a much larger space of possible solutions and are sometimes susceptible to settling on unrealistic frame-to-frame motions. A major problem for such algorithms is deciding which matched features to use. If the scene contains independently moving objects, then not all feature matches between frames are associated with the dominant motion, and if feature points on independently moving objects are included in the calculation of the dominant motion, the estimate will be thrown off. It is therefore necessary to classify feature pairs as inliers or outliers relative to the motion estimate. To separate inliers from outliers, RANSAC [3] or LMedS (Least Median of Squares) [12] is mainly used.
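To illustrate how LMedS separates inliers from outliers, the following minimal sketch (not from the paper) fits a toy 2-D translation model by Least Median of Squares; the point sets, trial count, and 2.5-sigma inlier cutoff are illustrative assumptions.

```python
import numpy as np

def lmeds_translation(src, dst, trials=100, seed=0):
    """LMedS estimate of a pure 2-D translation: try random single-point
    hypotheses and keep the one whose squared residuals have the
    smallest median."""
    rng = np.random.default_rng(seed)
    best_t, best_med = None, np.inf
    for _ in range(trials):
        i = rng.integers(len(src))
        t = dst[i] - src[i]                          # hypothesis from one pair
        r2 = np.sum((src + t - dst) ** 2, axis=1)    # squared residuals
        med = np.median(r2)
        if med < best_med:
            best_med, best_t = med, t
    # inliers: residual below 2.5x a robust scale estimate from the median
    scale = 1.4826 * np.sqrt(best_med) + 1e-9
    r = np.sqrt(np.sum((src + best_t - dst) ** 2, axis=1))
    return best_t, r < 2.5 * scale

# toy data: 8 background points translated by (5, -3), plus 2 outliers
src = np.array([[0, 0], [1, 0], [2, 1], [3, 3], [4, 1],
                [5, 2], [6, 0], [7, 4], [1, 1], [2, 2]], float)
dst = src + np.array([5.0, -3.0])
dst[8] += [20, 15]       # independently moving points
dst[9] += [-12, 9]
t, inliers = lmeds_translation(src, dst)
```

Because the median of the squared residuals ignores up to half the data, the two moving points do not perturb the recovered translation.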

IV. SPECIFIC PRIOR EFFORTS

Real-time video stabilization was demonstrated by Hansen et al. [7]. Their VFE-100 system introduced a multi-resolution iterative method using Laplacian pyramid images. In each iteration, optical flow is estimated by cross-correlating the current and previous images at a particular pyramid level; a linear motion model is fit to the optical flow, and the previous image is warped with that model. In the next iteration, optical flow is estimated between the warped previous image and the current image at a higher-resolution level of the pyramid. Morimoto and Chellappa [13] proposed a fast and robust implementation of 2-D electronic image stabilization. Their method selects features on the horizon by thresholding, dividing the Laplacian of the image into vertical zones, and selecting the topmost feature of every zone. The selected features are tracked from frame ft-1 to frame ft by a multi-resolution scheme, also involving Laplacian pyramids. At every pyramid level, each feature point in frame ft is searched for over a window centered about its point in frame ft-1, and the point that returns the minimum SSD is the best match. The estimate obtained at a coarse level is used to guide the search for the minimum SSD at a finer level of the Laplacian pyramid. Finally, a least-squares solution is used to estimate the similarity matrix from the corresponding points. The motion matrices are then composed from the reference frame to the current frame, and the current frame is warped with the accumulated motion model.
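The minimum-SSD window search at the heart of such matching can be sketched as follows; this is a single-level sketch (no pyramid), and the synthetic frames, patch size, and search radius are assumptions for illustration.

```python
import numpy as np

def min_ssd_match(prev, cur, pt, patch=3, search=5):
    """Find the point in `cur` whose patch best matches the patch around
    `pt` in `prev`, by exhaustive SSD over a small search window."""
    y, x = pt
    ref = prev[y - patch:y + patch + 1, x - patch:x + patch + 1].astype(float)
    best, best_pt = np.inf, pt
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = y + dy, x + dx
            cand = cur[yy - patch:yy + patch + 1, xx - patch:xx + patch + 1]
            ssd = np.sum((ref - cand) ** 2)      # sum of squared differences
            if ssd < best:
                best, best_pt = ssd, (yy, xx)
    return best_pt

# synthetic frames: a bright blob shifted down by 2 rows and left by 1 column
prev = np.zeros((40, 40)); prev[18:22, 18:22] = 1.0
cur = np.zeros((40, 40));  cur[20:24, 17:21] = 1.0
```

In the pyramid version, the coarse-level result seeds the search window at the next finer level, which keeps the search radius small at every level.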

An image mosaicing algorithm employing these techniques was presented by Capel and Zisserman [3]. In that system, a window-based localized correlation score is used to match Harris corners between two consecutive frames. RANSAC discards outlier points and estimates a homography that is well defined by the matched inliers. Finally, the estimates of the corresponding points and the homography are refined using a non-linear optimizer that minimizes the Euclidean distance between the original and corrected feature points of the correspondences; this cost is minimized using the Levenberg-Marquardt algorithm. Censi et al. [4] approach image mosaicing through feature tracking. Their algorithm tracks the best features [17] through each subsequent frame of a sequence using a linear Kalman filter. The predicted position of a feature point is taken from the predicted state of the Kalman filter, and the neighborhood of the predicted position is searched for the minimum SSD (sum of squared differences) error to find the corresponding feature point. The system uses the robust rejection rule x84 [5] to identify outliers: the residual of each feature is calculated, and a feature whose residual differs by more than 5.24 MAD from the median residual is discarded as an outlier.
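The x84 rejection rule is simple to state in code: compute the median residual, the median absolute deviation (MAD) about it, and discard features farther than 5.24 MAD away. A minimal sketch, with illustrative residual values:

```python
import numpy as np

def x84_inliers(residuals, k=5.24):
    """x84 rejection rule: keep features whose residual lies within
    k * MAD of the median residual (k = 5.24 per the rule above)."""
    r = np.asarray(residuals, float)
    med = np.median(r)
    mad = np.median(np.abs(r - med))     # median absolute deviation
    return np.abs(r - med) <= k * mad

# seven well-tracked features and one gross outlier (residual 14.0)
res = [0.9, 1.1, 1.0, 0.8, 1.2, 1.05, 14.0, 0.95]
mask = x84_inliers(res)
```

Because both the median and the MAD are robust statistics, a single gross outlier barely shifts the threshold, so it is reliably rejected.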

Another example of image mosaicing is used in wide-area surveillance. Mei et al. [12] presented a background retrieval system that detects Good Features in the reference frame and tracks them over subsequent frames using the Kanade-Lucas-Tomasi tracker [17] to obtain frame-to-frame correspondences. A homography is estimated from those correspondences, and the Least Median of Squares


(LMedS) algorithm is used to remove outliers. A Mixture of Gaussians (MoG) [16] background modeling procedure then separates the stable background from the points of interest.

Heikkila et al. [8] propose an automatic image mosaicing method for wide-area surveillance. The method extracts SIFT features from incoming images and constructs a mosaic image. RANSAC is used to reject outliers, and the homography parameters are refined using the Levenberg-Marquardt algorithm to minimize a geometric cost function defined over the inliers. In another example of image alignment, SIFT Flow, Liu et al. [9] proposed a system that uses a pyramid-based discrete flow estimation algorithm to match SIFT descriptors between two consecutive images and defines a neighborhood for the SIFT flow. Dual-layer loopy belief propagation is used to minimize the objective function at each pyramid level, which constrains SIFT descriptors to match along the flow vectors and constrains the flow vectors of adjacent pixels to be alike. The system primarily estimates correspondences between images of different scene categories; however, it can also be used for registration of satellite images of similar scenes.

V. ALIGNMENT TO A COMMON REFERENCE

Aligning the frames of a video to a reference frame results in a video in which the frames appear motionless except at the borders, where the background content changes [2]. Two issues arise with alignment to a reference frame: 1) accumulating errors in motion estimation can lead to unacceptable errors relative to the reference frame, and 2) the amount by which a new frame overlaps the reference frame can grow too small. Man and Picard [10] solve this problem by splitting the frames into subsets that can each be well registered. Comparing algorithms that align video to a common reference frame involves evaluating both the breaks and the quality of the aligned subsets of frames. Morimoto and Chellappa [14] proposed a fidelity measure corresponding to the Peak Signal to Noise Ratio (PSNR) between stabilized consecutive frames. The PSNR is a function of the mean squared error, which is the average per-pixel departure from the desired stabilized result.
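The PSNR fidelity measure can be computed directly from the per-pixel mean squared error between two stabilized frames; a minimal sketch, assuming 8-bit imagery (peak value 255):

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak Signal-to-Noise Ratio between two stabilized frames,
    computed from the mean squared error per pixel."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

a = np.full((4, 4), 100.0)
b = a.copy()
b[0, 0] += 16.0          # one pixel departs from the stabilized result
```

Higher PSNR between consecutive stabilized frames indicates that the background has been held more nearly fixed.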

VI. STABILIZATION ALGORITHMS

The algorithms described here take one video as input and return one or more videos, each stabilized with respect to a stationary reference frame; in other words, the goal is for the background to remain fixed throughout each returned video. The input video is broken when the camera motion so alters the objects visible in the background that it is no longer possible to align the video with a common reference. The criterion for breaking the video is discussed below. Four algorithms are described in what follows. Each carries out three broad operations: 1) feature extraction and feature matching between consecutive frames, 2) frame-to-frame motion estimation from the correspondences, and 3) motion compensation. These steps are described in the following sections.

VII. FRAME-TO-FRAME MOTION ESTIMATION

In general, camera motion is estimated from feature point correspondences. Feature points are extracted, selected, and matched across frames, and then the affine motions between consecutive frames are calculated from these tracked points. For frames f_t and f_{t-1}, the affine transform may be expressed as:

$$\begin{pmatrix} x_t \\ y_t \\ 1 \end{pmatrix} = \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_{t-1} \\ y_{t-1} \\ 1 \end{pmatrix}$$
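Since each correspondence (x, y) -> (x', y') contributes two linear equations in the six affine parameters, three correspondences determine the transform; a sketch of the linear solve (the sample points are illustrative):

```python
import numpy as np

def affine_from_points(src, dst):
    """Solve the 6 affine parameters from (at least) 3 correspondences.
    Each pair (x, y) -> (x', y') yields two linear equations."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0]); b.append(xp)
        A.append([0, 0, 0, x, y, 1]); b.append(yp)
    p, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.array([[p[0], p[1], p[2]],
                     [p[3], p[4], p[5]],
                     [0.0,  0.0,  1.0]])

src = [(0, 0), (1, 0), (0, 1)]
dst = [(2, 3), (3, 3), (2, 4)]     # a pure translation by (2, 3)
H = affine_from_points(src, dst)
```

With more than three correspondences the same least-squares solve gives the best-fit affine transform, which is how the final matrix is refined over all inliers below.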


This is a standard choice of motion model. The 6-DOF affine transform has six unknown parameters, and three point correspondences generate six linear constraint equations; thus a match between any three features in one frame and the next typically defines an alignment transformation. In experiments with 8-DOF homographies, we found that the results were worse than with the 6-DOF affine transform. The RANSAC algorithm is used to find the affine transformation supported by the majority of features. In each iteration of RANSAC, three point-wise correspondences are selected at random and used to estimate an affine motion matrix. The estimated motion matrix is then checked to see how many points lie in its consensus set. If the percentage of points that fit the estimated affine model exceeds a threshold, the method stops and declares the model good. The points that fit this model are known as inliers, and the remaining points as outliers. After obtaining the inliers from RANSAC, the method performs a linear least-squares fit over the inliers to obtain the final affine matrix.

The RANSAC algorithm presupposes point-wise corresponding features between the two frames. These correspondences can be found in two ways. One is based on feature similarity, i.e., a match measures the similarity between feature points expressed in the feature space; this approach is used by Heikkila et al. [8], where SIFT descriptors describe the SIFT points. The other uses a tracker to move points forward from frame t-1 to frame t; Mei et al. [12] used the Lucas-Kanade-Tomasi feature tracker [18]. The algorithm below uses an iterative pyramidal implementation of KLT based on optical flow, which provides robustness to large displacements [1].
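The RANSAC loop described above can be sketched as follows; the iteration count, 1-pixel tolerance, consensus threshold, and synthetic data are illustrative assumptions, not values from the paper.

```python
import numpy as np

def ransac_affine(src, dst, iters=200, tol=1.0, min_inlier_frac=0.6, seed=1):
    """RANSAC over 3-point samples: fit an affine model, count the points
    in its consensus set, and stop once an acceptable fraction of inliers
    is found; the final model is a least-squares fit over all inliers."""
    rng = np.random.default_rng(seed)
    n = len(src)
    src_h = np.hstack([src, np.ones((n, 1))])        # homogeneous coordinates
    for _ in range(iters):
        idx = rng.choice(n, 3, replace=False)
        M, *_ = np.linalg.lstsq(src_h[idx], dst[idx], rcond=None)
        err = np.linalg.norm(src_h @ M - dst, axis=1)
        inliers = err < tol
        if inliers.mean() >= min_inlier_frac:        # model declared good
            M, *_ = np.linalg.lstsq(src_h[inliers], dst[inliers], rcond=None)
            return M.T, inliers                      # 2x3 matrix [A | t]
    return None, None

rng = np.random.default_rng(0)
src = rng.uniform(0, 100, (20, 2))
A = np.array([[1.0, 0.1], [-0.1, 1.0]])
t = np.array([5.0, -2.0])
dst = src @ A.T + t
dst[:4] += rng.uniform(30, 60, (4, 2))               # 4 gross outliers
M, inliers = ransac_affine(src, dst)
```

The early-stop threshold trades speed against confidence: a contaminated 3-point sample fits its own three pairs exactly but cannot place a large majority of the remaining points within the tolerance, so it fails the consensus test.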

VIII. MOTION COMPENSATION

In this approach, one frame is taken as the reference frame and each subsequent frame is registered to it. If Hi represents the affine transform between frames i and i-1, and H0 is the reference frame's transform, then frame i can be mapped to the first frame by the composition of transformations:

H1...i = H1 H2 H3 ... Hi        (Equation 1)

Given the motion model of frame i with respect to the reference frame, the method merges (i.e., warps) frame i using the inverse of H1...i. Merging a frame means projecting each pixel coordinate of the source image through the motion matrix to a new coordinate in the destination image. To estimate pixel values in the destination image, sampling is done in reverse, going from destination to source: for each pixel (x, y) of the destination image, the function computes the coordinates of the corresponding "donor pixel" in the source image and copies its value:

dst(x, y) = src(fx(x, y), fy(x, y))        (Equation 2)

where fx(x, y) and fy(x, y) are the mapping functions. Since fx(x, y) and fy(x, y) are seldom integers, bilinear interpolation is used to obtain the new interpolated pixel values. When fx(x, y), fy(x, y), or both fall outside the source image, dst(x, y) is set to zero.
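Equation 2 with bilinear sampling can be sketched as an inverse-mapping warp; the tiny test image and half-pixel translation are illustrative.

```python
import numpy as np

def warp_affine_inverse(src, H_inv, out_shape):
    """Inverse-map each destination pixel through H_inv to its 'donor'
    location in the source and sample it bilinearly; destinations whose
    donor falls outside the source are set to zero (Equation 2)."""
    h, w = out_shape
    dst = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            fx, fy, _ = H_inv @ np.array([x, y, 1.0])
            x0, y0 = int(np.floor(fx)), int(np.floor(fy))
            if 0 <= x0 and x0 + 1 < src.shape[1] and 0 <= y0 and y0 + 1 < src.shape[0]:
                ax, ay = fx - x0, fy - y0            # fractional offsets
                dst[y, x] = ((1 - ax) * (1 - ay) * src[y0, x0]
                             + ax * (1 - ay) * src[y0, x0 + 1]
                             + (1 - ax) * ay * src[y0 + 1, x0]
                             + ax * ay * src[y0 + 1, x0 + 1])
    return dst

src = np.arange(25, dtype=float).reshape(5, 5)
# inverse model shifting each destination pixel half a pixel right in the source
H_inv = np.array([[1.0, 0.0, 0.5],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
out = warp_affine_inverse(src, H_inv, (5, 5))
```

Sampling in reverse guarantees that every destination pixel receives exactly one value, which a forward mapping does not.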

IX. CRITERIA FOR BREAKING VIDEOS

The algorithm introduces breaks on two events. First, if an excessive amount of panning is detected, meaning that the majority of pixels are no longer visible in the reference frame, a break is created. This type of break is triggered by monitoring the fraction of pixels in the current frame that lie outside the reference frame when mapped back to it using the accumulated motion model H1...i. By default, the


algorithm breaks the video if more than half of the pixels in the current frame lie outside the reference frame. Second, a break is triggered by excessive scaling in the transformation matrix. This trigger makes sense in this domain because the algorithm handles camera pan but not zooming in or out within a single video. It is implemented by monitoring the determinant of the accumulated motion model H1...i: if the determinant drifts too far from 1.0, the model has started scaling the background relative to the reference frame, and a break is created. In this algorithm, the video is broken if the determinant falls outside the range 0.95 to 1.05. Experiments show that the determinant of the motion matrix lies very close to 1.0 for frame-to-frame estimates, always remaining within 1 ± ε, where ε is on the order of 10^-4. Cases were also observed in which errors accumulated in the motion model H1...i, as discussed above, and starting with a new reference frame became necessary. Experiments further revealed that the homography transform tends to cause large skew and perspective errors, and testing for these errors in turn leads to more breaks; stabilizing videos with the homography-based techniques of [8, 12] therefore becomes lengthy. Because this led to a large number of video breaks, we changed the implementation to use affine models.
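Both break triggers can be sketched together; the frame sizes and test transforms are illustrative, while the 0.95-1.05 determinant range and the half-frame threshold come from the text above.

```python
import numpy as np

def should_break(H_acc, frame_shape, ref_shape,
                 det_range=(0.95, 1.05), max_outside=0.5):
    """Break the video when the accumulated model starts scaling the
    background (determinant drift) or when more than half of the
    current frame maps outside the reference frame."""
    # scale drift: determinant of the 2x2 linear part of the affine model
    d = np.linalg.det(H_acc[:2, :2])
    if not (det_range[0] <= d <= det_range[1]):
        return True
    # fraction of current-frame pixels falling outside the reference frame
    h, w = frame_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    mx, my, _ = H_acc @ pts
    inside = (mx >= 0) & (mx < ref_shape[1]) & (my >= 0) & (my < ref_shape[0])
    return 1.0 - inside.mean() > max_outside

I = np.eye(3)
pan = np.array([[1.0, 0.0, 80.0],       # large pan: most pixels leave the reference
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
zoom = np.diag([1.1, 1.1, 1.0])         # scaling drift: determinant 1.21
```

In practice the determinant check is evaluated first, since it is cheap and catches the zoom case before any per-pixel work.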

X. PROPOSED ALGORITHM

The intelligent algorithm introduced here uses a two-staged process to refresh the set of SIFT feature points that are tracked using the KLT algorithm. The motivation behind this design is a new system for background inpainting for video repairing in which the Lucas-Kanade-Tomasi (KLT) algorithm, the RANSAC algorithm, and the homography transform together speed up the inpainting approach. The algorithm avoids the computational cost of creating pairwise correspondences from scratch in every successive frame. A naive tracker depends on an initialization step in which the points to be tracked are established from the first frames of the video. As tracking proceeds, the set of tracked points shrinks, since points are dropped whenever the KLT algorithm cannot establish their new position in a new frame with confidence, and if the scene changes there is no mechanism for refreshing the tracked points.

Figure 1: Different stages involved in video inpainting for video repairing (Input Video → Preprocessing → Model Sampling → Model Alignment → Motion Completion → Video Inpainting).

Figure 2: Single-frame Robust Staged RANSAC Tracker algorithm (inputs: old inliers and fresh points, processed by KLT and RANSAC with inlier filtering; outputs: motion estimate H, new inliers, and new fresh points).


This algorithm solves both problems through an iterative process that constantly refreshes the group of SIFT features being tracked. Figure 2 shows the algorithm for a single frame. Two sets of feature points enter each frame in every iteration: 1) the set of inliers extracted two frames before the current frame, and 2) the set of fresh points extracted from the previous frame. The inliers from the previous frame are used by RANSAC to estimate the motion of the current frame with respect to the previous frame; once the motion is estimated, these inliers are discarded. The fresh points are then tested for compatibility with the estimated motion between the current and previous frames, and the points consistent with that motion are passed on to the next frame as the new set of inliers. Fresh candidate SIFT features are extracted from the current frame and passed as fresh points to the next frame. We refer to this as the Robust Staged RANSAC Tracking algorithm. To initialize the tracking algorithm of Figure 2, we need a bootstrap frame: SIFT features are extracted from the initial frame, the KLT algorithm updates the positions of these features in the next frame, and RANSAC computes the motion and separates inliers from outliers. The inliers are passed to the second frame as inliers, and a new set of SIFT features is extracted from the first frame and passed as staged fresh points for the second frame.
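The staged refresh loop can be illustrated with a toy simulation; here a median displacement stands in for RANSAC, and synthetic `detect`/`track` functions stand in for SIFT detection and KLT tracking, all of which are illustrative assumptions.

```python
import numpy as np

def staged_refresh_demo(n_frames=6, cam=np.array([2.0, 0.0]), fgm=np.array([5.0, 0.0])):
    """Toy staged tracker: background points follow the camera, one
    foreground point moves on its own.  Each frame, motion is estimated
    from the surviving inliers, fresh points are vetted against it, and
    a new batch of detections is staged for the next frame."""
    rng = np.random.default_rng(0)
    def detect(t):
        # 6 background points plus 1 foreground point (last entry)
        pts = rng.uniform(0, 50, (7, 2))
        is_fg = np.zeros(7, bool); is_fg[-1] = True
        return pts, is_fg
    def track(pts, is_fg):
        # toy KLT: background moves with the camera, foreground differently
        return pts + np.where(is_fg[:, None], fgm, cam)
    inl, inl_fg = detect(0)         # bootstrap inliers from the first frame
    fr, fr_fg = detect(0)           # staged fresh points
    shifts = []
    for t in range(1, n_frames):
        moved = track(inl, inl_fg)
        est = np.median(moved - inl, axis=0)          # stand-in for RANSAC
        shifts.append(est)
        fr_moved = track(fr, fr_fg)
        ok = np.linalg.norm((fr_moved - fr) - est, axis=1) < 0.5
        inl, inl_fg = fr_moved[ok], fr_fg[ok]         # vetted fresh -> inliers
        fr, fr_fg = detect(t)                         # stage new detections
    return np.array(shifts)
```

Because fresh points are vetted against the estimated motion before joining the inlier set, the foreground point never contaminates later motion estimates, and the tracked set is replenished every frame instead of shrinking.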

Figure 3: Robust Staged RANSAC Tracking Algorithm with motion compensation (H).

XI. EXPERIMENTAL RESULTS

For each pair of images, the method sorts the frame-to-frame pixel differences by magnitude and takes the sum of all differences below the median; pixels lying outside the video frame are ignored. This error is normalized by the number of pixels that are part of the video content of the two frames. The measure is slightly complex because the score checks how well the background is reconstructed while ignoring pixel differences caused by moving objects. Normalized error scores are accumulated over consecutive frames, except where there are breaks; pairs of frames across breaks are ignored in the registration score. Different threshold levels are used when creating breaks in the video; circled points indicate the default algorithm configuration. The algorithm checks whether the determinant of the accumulated motion model H1...i falls outside the range 0.95 to 1.05, so that such errors do not accumulate in the model. When the range is narrowed, the number of breaks increases: the frames lying between breaks are finely registered and have lower overall errors. With a broader range, breaks decrease but overall error scores increase.
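The robust registration score described above can be sketched as follows; the frame contents are illustrative.

```python
import numpy as np

def registration_error(a, b, valid=None):
    """Robust per-pair score: sort absolute pixel differences, sum those
    at or below the median, and normalize by the number of valid pixels,
    so differences caused by moving objects are largely ignored."""
    d = np.abs(a.astype(float) - b.astype(float)).ravel()
    if valid is not None:
        d = d[valid.ravel()]             # ignore pixels outside the frame
    d = np.sort(d)
    half = d[: (len(d) + 1) // 2]        # differences up to the median
    return half.sum() / len(d)

ref = np.zeros((10, 10))
cur = ref.copy()
cur[0:3, 0:3] = 50.0                     # a small moving object (9 of 100 pixels)
```

A perfectly registered background scores zero even in the presence of a small moving object, because the object's large differences all fall above the median and are excluded from the sum.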

XII. CONCLUSION AND FUTURE SCOPE

This paper presents an intelligent object-based video inpainting technique for video repairing that stabilizes video relative to a fixed background by means of a staged multi-frame video alignment method.

International Journal of Advance Foundation and Research in Computer (IJAFRC)

Volume 1, Issue 5, May 2014.

© 2014, IJAFRC All Rights Reserved






efficiently tracks the inliers and tries to keep the number of tracked points from decreasing over time. This technique uses the first frame as the reference frame; a future challenge would be to select the reference frame dynamically so as to jointly optimize the number of breaks and the stabilization error. As a result, frame-to-frame motion is estimated and aligned more robustly than in previous techniques, and the algorithm also tries to decrease computation time. However, it is not suitable in domains where the camera is mobile or continuously zooming in and out; under such conditions, the goal of registering frames to a fixed reference frame is not appropriate.

Figure 4: Inlier plots summarizing the result analysis of three sample videos with unintentional camera motion.


Figure 5: Merged frames summarizing the result analysis of three sample videos with unintentional camera motion.

Table 1: Performance analysis of the algorithm results for three sample videos.

XIII. REFERENCES

[1] J.-Y. Bouguet, "Pyramidal implementation of the Lucas-Kanade feature tracker," Technical Report, Intel Corporation, Microprocessor Research Labs, 1999.

[2] M. Bruder, G. A. Roitman, and B. Cernuschi-Frias, "Robust methods for background extraction in video," Argentine Symposium on Artificial Intelligence, pages 83-95, 2012.

[3] D. Capel and A. Zisserman, "Automated mosaicing with super-resolution zoom," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pages 885-891, 1998.

[4] A. Censi, A. Fusiello, and V. Roberto, "Image stabilization by features tracking," in Proceedings of the International Conference on Image Analysis and Processing, pages 665-667, 1999.

[5] F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel, "Robust Statistics: The Approach Based on Influence Functions," Wiley Series in Probability and Mathematical Statistics, 1986.

Calculated Parameters                                    Sample Video 1   Sample Video 2   Sample Video 3
Number of iterations taken by the algorithm                   26               24               23
Maximum inliers found in frames                               18               30               20
Minimum inliers found in frames                                7               14                7
Maximum number of reference-frame inlier points               99               97              120
Minimum number of reference-frame inlier points               99               96              119
Maximum time to create points on the reference frame           0.05             0.06             0.05
Minimum time to create points on the reference frame           0.02             0.02             0.03
Maximum number of current-frame inlier points                112              112              161
Minimum number of current-frame inlier points                 78               90              131
Maximum time to create points on the current frame             0.04             0.04             0.06
Minimum time to create points on the current frame             0.02             0.02             0.03
Maximum time to match frames                                   0.001            0.0016           0.0032
Minimum time to match frames                                   0.0007           0.0008           0.0013


[6] M. Grundmann, V. Kwatra, and I. Essa, "Auto-directed video stabilization with robust optimal camera paths," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 225-232, 2011.

[7] M. Hansen, P. Anandan, K. Dana, G. van der Wal, and P. Burt, "Real-time scene stabilization and mosaic construction," in Proceedings of the IEEE Workshop on Applications of Computer Vision, pages 54-62, 1994.

[8] M. Heikkilä and M. Pietikäinen, "An image mosaicing module for wide-area surveillance," in Proceedings of the Third ACM International Workshop on Video Surveillance & Sensor Networks, pages 11-18, 2005.

[9] C. Liu, J. Yuen, and A. Torralba, "SIFT flow: Dense correspondence across scenes and its applications," IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5):978-994, 2011.

[10] S. Mann and R. W. Picard, "Video orbits of the projective group: A new perspective on image mosaicing," 1995.

[11] Y. Matsushita, E. Ofek, W. Ge, X. Tang, and H.-Y. Shum, "Full-frame video stabilization with motion inpainting," IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(7):1150-1163, 2006.

[12] X. Mei, M. Ramachandran, and S. K. Zhou, "Video background retrieval using mosaic images," in International Conference on Acoustics, Speech, and Signal Processing, volume 2, pages 441-444, 2005.

[13] C. Morimoto and R. Chellappa, "Fast electronic digital image stabilization," in Proceedings of the 13th IEEE International Conference on Pattern Recognition, volume 3, pages 284-288, 1996.

[14] C. Morimoto and R. Chellappa, "Evaluation of image stabilization algorithms," in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, volume 5, pages 2789-2792, 1998.

[15] C. Stauffer and W. E. L. Grimson, "Learning patterns of activity using real-time tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):747-757, 2000.

[16] C. Tomasi and T. Kanade, "Detection and tracking of point features," Technical Report, School of Computer Science, Carnegie Mellon University, 1991.

[17] J. Shi and C. Tomasi, "Good features to track," in Proceedings of the IEEE CS Conference on Computer Vision and Pattern Recognition, pages 593-600, 1994.