University of Toronto Oct. 18, 2004 Modelling Motion Patterns with Video Epitomes Machine Learning Group Meeting University of Toronto Oct. 18, 2004 Vincent

Oct. 18, 2004

Universityof Toronto

Modelling Motion Patterns with Video Epitomes

Machine Learning Group MeetingUniversity of Toronto

Oct. 18, 2004

Vincent Cheung

Probabilistic and Statistical Inference GroupElectrical & Computer Engineering

University of TorontoToronto, Ontario, Canada

Advisor: Dr. Brendan J. Frey

Cheung2 / 18

ML Group Meeting, Oct. 18, 2004

Outline

● Image epitome► What?► Why?► How?

● Implementation computation issues► Efficiently implementing the learning algorithm

● Video epitome► Extension to videos► Filling-in missing information

Cheung3 / 18

ML Group Meeting, Oct. 18, 2004Im

age

Lea

rnin

gV

ide

o

Image Epitome

● Jojic, N., Frey, B., & Kannan, A. (2003). Epitomic analysis of appearance and shape. In Proc. IEEE ICCV.

● Miniature, condensed version of the image

● Models the image’s shape and textural components.

● Applications► object detection► texture segmentation► image retrieval► compression

Cheung4 / 18

ML Group Meeting, Oct. 18, 2004Im

age

Lea

rnin

gV

ide

o

Image Epitome Examples

Cheung5 / 18

Imag

eL

earn

ing

Vid

eo


Learning the Image Epitome (1)

Cheung6 / 18

Imag

eL

earn

ing

Vid

eo


Learning the Image Epitome (2)

Epitome

Input image

Training Set

SamplePatches

UnsupervisedLearning

e

Z1 Z2 ZM

…

TMT2T1Bayesiannetwork

e – epitomeTk – mappingZk – image patch

E:

M:

Cheung7 / 18

Imag

eL

earn

ing

Vid

eo


Xe(i,j)

KK

N

N

Cumsum

2

N

N

Xe(i,j)

KK

N

N

Cumsum

2

N

N

Shifted Cumulative Sum Algorithm

(2, 2), (i+1, j+1)

(1, 2), (i, j+1)

(2, 1), (i+1, j)

– col 1+ col (P+1)

– row 1+ row (P+1)

–

+

+ + pixel (1,1)+ pixel (P+1, P+1)

(1, 1), (i, j)

(2, 2), (i+1, j+1)(2, 2), (i+1, j+1)

(1, 2), (i, j+1)(1, 2), (i, j+1)

(2, 1), (i+1, j)(2, 1), (i+1, j)

– col 1+ col (P+1)

– row 1+ row (P+1)

–

+

+ + pixel (1,1)+ pixel (P+1, P+1)

(1, 1), (i, j)(1, 1), (i, j)

+-

-+

Cheung8 / 18

Imag

eL

earn

ing

Vid

eo


X e

P

P

P

P

Ta

Tß

X e

P

P

P

P

Ta

Tß

Collecting Sufficient Statistics

Cheung9 / 18

Imag

eL

earn

ing

Vid

eo


Extending Epitomes to Videos

● Desire a miniature, condensed version of a video sequence

● Want the epitome to model the basic shape, textural, and motion patterns of the video

● Applications► optic flow► segmentation► texture transfer► layer separation► compression► noise reduction► fill-in / inpainting

Cheung10 / 18

Imag

eL

earn

ing

Vid

eo


Input Video

Frame 1Frame 2

Frame 3

Training Set

SamplePatches

Video Epitome

UnsupervisedLearning

Video Epitome

Cheung11 / 18

Imag

eL

earn

ing

Vid

eo


Video Epitome Example

Temporally Compressed

Spatially Compressed

Cheung12 / 18

Imag

eL

earn

ing

Vid

eo


Video Inpainting (1)

● Fill in missing portions of a video► damaged films► occluding objects

● Reconstruct the missing pixels from the video epitome

Cheung13 / 18

Imag

eL

earn

ing

Vid

eo


Video Inpainting (2)

Cheung14 / 18

Imag

eL

earn

ing

Vid

eo


Filling-in Missing Data

Likelihood

Joint

VariationalApprox

E-Step

M-Step

Cheung15 / 18

Imag

eL

earn

ing

Vid

eo


Missing Channels

● Generalization of the video inpainting problem

● Inpainting► Missing entire pixels

● Missing Channels► Missing one or more of the red, green, or blue (RGB)

components of a given pixel

● Epitome must consolidate multiple patches together to piece together the missing channel information

► No training patch contains all the channel information► Use the epitome to fill-in the missing data

Cheung16 / 18

Imag

eL

earn

ing

Vid

eo


Image Missing Channels Fill-in

Cheung17 / 18

Imag

eL

earn

ing

Vid

eo


Video Missing Channels Fill-in

Cheung18 / 18


Conclusion

● Improved the efficiency of learning image epitomes

● Extended the concept of epitomes to video sequences

● Demonstrated the ability of video epitomes to model motion patterns through video inpainting

Cheung19 / 18


Future Work

● Determining the size of the epitome► Dependent on the complexity of the image / video

■ Minimum description length■ Variational Bayesian

● Optimal patch size(s)► Problem specific

● Additional transformations into the epitome► Rotation► Scale

● Additional video epitome applications► Super-resolution► Layer separation► Object recognition

Documents

University of Toronto Oct. 18, 2004 Modelling Motion Patterns with Video Epitomes Machine Learning Group Meeting University of Toronto Oct. 18, 2004 Vincent