Superresolution of Texts from Nonideal Video

Xin Li
Lane Dept. of CSEE, West Virginia University
Morgantown, WV 26506-6109

This work is partially supported by NASA WV EPSCoR Award 2005-2006



Outline

Introduction
  What is SR? Why SR? How to achieve SR?
  A general framework for SR: registration + restoration
  Understand the boundary of formulating SR as an inverse problem
SR of texts from nonideal video
  Problem statement: why texts and nonideal video?
  Analyze error accumulation in multiframe registration
  Address the issue of quality/PSF consistency in restoration
  Experimental results
Conclusions


Image Resolution

[Figure: image sensor chip of width W and height H, labeling chip size, field of view (H×W), pixel size, and sampling distance. Source: Gonzalez, "Digital Image Processing"]


Why Higher Resolution?

Improved objective fidelity
  Natural scenes are seldom band-limited
  Higher resolution implies smaller representation errors
Improved subjective quality
  Attention enhances spatial resolution
  Does spatial resolution enhance attention?
Improved measurement/recognition
  Law enforcement, forensics/biometrics: Face Recognition Grand Challenge (FRGC), iris recognition, vehicle license plate recognition


Towards Gigapixel: Artistic Approach

[Figure: from mega-pel to giga-pel. Source: http://triton.tpd.tno.nl/gigazoom/Delft2.htm]

Photographers and artists have manually or semi-automatically stitched hundreds of mega-pel pictures together to demonstrate what a giga-pel picture looks like: the power of pixels.


Scientific Solutions

Sensor-based
  Reduce pixel size: the limit is about 40 µm² for a 0.35 µm CMOS process
  Increase chip size: ineffective due to increased capacitance (which makes it hard to speed up the charge transfer rate)
Computational (super-resolution)
  Exploit the tradeoff between space and time: obtain an HR image from multiple LR copies
  Physical principles of imaging play the fundamental role in defining the relationship between LR and HR
Hybrid: the convergence of the camera and the computer
  Computational cameras: catadioptric camera, jitter camera (Ben-Ezra, Zomet and Nayar)


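SR: A General Framework

S.C. Park et al., "Super-resolution image reconstruction: a technical overview", IEEE Signal Processing Magazine, pp. 21-36, May 2003

$$y_k = D B_k M_k x + n_k, \quad 1 \le k \le p$$

Here x is the HR image, y_k is the k-th of p LR observations, M_k is the warping operator, B_k is the blur operator, D is the down-sampling operator, and n_k is noise.

SR can be formulated as an inverse problem, assuming a mathematical model linking LR to HR images is known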
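To make the observation model y_k = D B_k M_k x + n_k concrete, the following minimal sketch simulates LR frames from one HR image. The specific operators are assumptions chosen for brevity and are not taken from the slides: the warp M_k is an integer circular shift, the blur B_k is a box average, D is plain decimation, and n_k is white Gaussian noise. All function names are hypothetical.

```python
import numpy as np

def shift_image(x, dy, dx):
    """Warp M_k: integer circular shift (a stand-in for general camera motion)."""
    return np.roll(x, (dy, dx), axis=(0, 1))

def box_blur(x, size):
    """Blur B_k: size x size moving average with circular boundaries."""
    out = np.zeros_like(x, dtype=float)
    for i in range(size):
        for j in range(size):
            out += np.roll(x, (i, j), axis=(0, 1))
    return out / (size * size)

def downsample(x, factor):
    """Decimation D: keep every factor-th sample in each direction."""
    return x[::factor, ::factor]

def observe(x, dy, dx, blur_size, factor, noise_std, rng):
    """One LR frame y_k = D B_k M_k x + n_k under the assumptions above."""
    y = downsample(box_blur(shift_image(x, dy, dx), blur_size), factor)
    return y + noise_std * rng.standard_normal(y.shape)

rng = np.random.default_rng(0)
hr = rng.random((32, 32))                      # synthetic HR image x
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]      # one shift per LR frame
lr_frames = [observe(hr, dy, dx, 2, 2, 0.01, rng) for dy, dx in shifts]
```

With a down-sampling factor of 2, the four shifts above sample all four HR phases, which is the sense in which multiple LR copies trade time for space.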


SR: At the Intersection of SP and CV

Registration problem
  Translational models
    Subpixel-accuracy phase correlation (Foroosh, Zerubia and Berthod’1996)
    Subspace methods in the frequency domain (Vandewalle, Sbaiz, Süsstrunk and Vetterli)
  Projective models or planar homography (Capel and Zisserman’2003)
    Images of a planar surface under arbitrary camera motion, or images of a scene under a fixed camera
Restoration problem
  Model-based: regularized deblurring, robust SR (Farsiu, Elad and Milanfar’2004)
  Learning-based: exemplar-based SR (Freeman, Jones and Pasztor’2002), video epitome (Cheung, Frey and Jojic’2005)
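As an illustration of the translational registration model, here is a minimal sketch of integer-pixel phase correlation. It is not the subpixel extension cited above, only the basic technique it builds on, and it assumes a pure circular shift between the two frames. The function name is hypothetical.

```python
import numpy as np

def phase_correlation_shift(ref, moved):
    """Estimate the integer circular shift (dy, dx) such that
    moved == np.roll(ref, (dy, dx), axis=(0, 1))."""
    f_ref = np.fft.fft2(ref)
    f_mov = np.fft.fft2(moved)
    cross = f_mov * np.conj(f_ref)
    # Normalize to unit magnitude so that only the phase difference remains.
    cross /= np.maximum(np.abs(cross), 1e-12)
    corr = np.real(np.fft.ifft2(cross))
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices past the midpoint correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))

rng = np.random.default_rng(0)
ref = rng.random((32, 32))
moved = np.roll(ref, (3, -5), axis=(0, 1))
estimated = phase_correlation_shift(ref, moved)
```

The inverse transform of the normalized cross-power spectrum is ideally a single impulse located at the shift, which is why the peak location alone is enough here.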


Understand the Boundary of SR as an Inverse Problem

Limited modeling capability
  Fixed enhancement ratio specified by the down-sampling operation
  We formulate scalable (progressive) SR: as more data become available, higher resolution can be achieved
Inevitable approximation when warping gets complex
  We advocate a nonuniform-interpolation-based forward approach in the case of arbitrary camera motion
Sensor PSF is often unknown and time-varying
  We propose to adaptively select a subset of LR images


Outline

Introduction
  What is SR? Why SR? How to achieve SR?
  A general framework for SR: registration + restoration
  Understand the boundary of formulating SR as an inverse problem
SR of texts from nonideal video
  Problem statement: why texts and nonideal video?
  Analysis of error accumulation in multiframe registration
  Issue of phase/PSF consistency in restoration: NOT all LR images are useful
  Experimental results
Conclusions


SR-of-Texts from Nonideal Video

Problem Statement

Given a video clip containing text that is illegible due to limited resolution, how can we produce an HR image in which the text becomes clearly readable (by humans)?

[Figure: SR producing an HR image of a license plate]


Defining the Boundary of the Problem

Why texts?
  Texts represent an important class of visual information (e.g., law enforcement applications)
  Relatively easy assessment of SR results by human observers
  Texts are often printed on a planar surface, which facilitates registration
What do we mean by nonideal video?
  Uncontrolled real-world acquisition conditions: handheld camera (arbitrary camera motion), unfavorable illumination, unknown PSF, inevitable compression artifacts, and so on


Our Practical Approach

Consistency-guided preprocessing: not all LR images are used in our SR scheme
Homography-based registration: accuracy is guaranteed by the planar surface assumption
Nonuniform interpolation: search for an appropriate magnifying ratio and phase
Diffusion-aided blind deconvolution: tailored for bimodal textual images


LR Image Consistency

[Figure: example LR images illustrating quality consistency and PSF consistency]

Human vision helps the selection of consistent LR images


Homography-based Multiframe Registration

[Figure: homography matrices relating image 1, image 2, ..., image K, estimated either sequentially or in parallel]

Mosaicing: slightly overlapped FOV → sequential
Superresolution: severely overlapped FOV → parallel
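The difference between the two registration orders can be sketched numerically. The simulation below is only an illustration under assumptions that are not from the slides: "sequential" is taken to mean chaining frame-to-previous-frame estimates, "parallel" means estimating each frame directly against the reference, every estimate carries independent Gaussian noise of the same size, and the motions are affine (a special case of a homography). All names are hypothetical.

```python
import numpy as np

def project(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of points."""
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

def noisy(H, rng, sigma):
    """Simulated estimate: perturb the top two rows of a true matrix."""
    E = np.zeros((3, 3))
    E[:2, :] = sigma * rng.standard_normal((2, 3))
    return H + E

def registration_errors(K, sigma, rng, trials=200):
    """Mean corner error of frame K registered to frame 1, both ways."""
    pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    seq_err = par_err = 0.0
    for t in range(trials):
        G_true = np.eye(3)   # true map from frame k to frame 1
        G_seq = np.eye(3)    # chained (sequential) estimate of the same map
        for k in range(2, K + 1):
            P = noisy(np.eye(3), rng, 0.05)       # true motion, frame k -> k-1
            G_true = G_true @ P
            G_seq = G_seq @ noisy(P, rng, sigma)  # noise enters at every link
        G_par = noisy(G_true, rng, sigma)         # one direct estimate
        target = project(G_true, pts)
        seq_err += np.linalg.norm(project(G_seq, pts) - target, axis=1).mean()
        par_err += np.linalg.norm(project(G_par, pts) - target, axis=1).mean()
    return seq_err / trials, par_err / trials

rng = np.random.default_rng(0)
for K in (4, 8):
    seq, par = registration_errors(K, sigma=0.01, rng=rng)
    print(f"K={K}: sequential {seq:.4f}, parallel {par:.4f}")
```

In this toy model the sequential error grows with K because each link contributes its own noise, while the parallel error does not. Direct registration to a reference is only feasible when every frame overlaps it heavily, which is the superresolution setting rather than the mosaicing one.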


Nonuniform Interpolation

[Figure: data grid of fused data points from registered LR images, overlaid with the targeted data points on the HR lattice; the HR lattice is characterized by its distance and its phase]

Target HR lattice: minimize the distance d between the data grid and the HR lattice over two parameters: distance and phase
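A one-dimensional toy version of this lattice search is sketched below. The slide does not preserve the exact definition of d, so the cost used here (mean distance from each lattice point to its nearest fused sample) is an assumption, as are the exhaustive grid search and all names. The candidate spacings play the role of candidate magnifying ratios.

```python
import numpy as np

def lattice_cost(samples, spacing, phase, extent):
    """Assumed cost d: mean distance from each lattice point to its
    nearest fused data sample (1-D positions)."""
    lattice = np.arange(phase, extent, spacing)
    dists = np.abs(lattice[:, None] - samples[None, :]).min(axis=1)
    return dists.mean()

def search_lattice(samples, spacings, extent, n_phases=10):
    """Exhaustive search over candidate spacings and phases."""
    best = None
    for spacing in spacings:
        for phase in np.linspace(0.0, spacing, n_phases, endpoint=False):
            cost = lattice_cost(samples, spacing, phase, extent)
            if best is None or cost < best[0]:
                best = (cost, spacing, phase)
    return best

# Toy fused data: samples that happen to lie on a lattice with
# spacing 0.5 and phase 0.2 (real fused data would be irregular).
samples = 0.2 + 0.5 * np.arange(20)
cost, spacing, phase = search_lattice(samples, [0.25, 1 / 3, 0.5], extent=10.0)
```

A lattice finer than the data supports leaves many lattice points far from any sample, so its cost is higher. This is one way to read the idea that the achievable magnifying ratio should be searched for rather than fixed in advance.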


Experimental Results (I): SR Comparison on Benchmark Data

Input: 20 LR images

[Figure: UCSC-SR vs. ours, before deblurring and after deblurring]

Thanks to Prof. Milanfar for providing us the UCSC-SR software


Experimental Results (II): SR Results Comparison on Nonideal Video

Input: 4 LR images

[Figure: UCSC-SR vs. ours]


Experimental Results (II): SR Results Comparison on Nonideal Video

Input: 4 LR images

[Figure: UCSC-SR vs. ours, after deblurring]


Experimental Results (III): Impact of Error Accumulation

[Figure: sequential vs. parallel registration results for K=4 and K=8]

Error accumulation in sequential registration degrades image quality when K is large


Conclusions and Perspectives

SR of texts from nonideal video
  A class of SR problems whose boundary can be well defined
  An example supporting a practical, forward approach towards SR
To have a better understanding of SR techniques
  We need to look at the problem from a perceptual perspective
  New applications such as video compression, distributed coding, iris recognition, and biomedical imaging will help us define the boundary of SR
  Spatial vs. temporal SR: fundamental space-time tradeoff