
Stephan Kopf Department of Computer Science IV

University of Mannheim, Germany

Motivation

Part I: Basic Retargeting Operations

◦ Scaling and cropping

◦ Regions of interest

◦ Automatic crop & scale

◦ Sports video adaptation

Part II: Seam Carving

◦ Seam carving for images

◦ Preservation of straight lines

◦ Fast seam carving for videos

Summary

15.02.2011 Stephan Kopf 2

Mobile phones are multimedia devices that can

◦ browse the Web

◦ display images and videos

◦ support novel input technologies (multi-touch)


But they still have limitations:

◦ Small screen size

◦ Wireless connection (bandwidth)

◦ Computational power (CPU, memory)

◦ Battery

Typical resolutions of images and videos

◦ Digital camera: 10 megapixels (3600 x 2700 pixels)

◦ Camcorder: high definition (1920 x 1080 pixels)

◦ Mobile phone (240 x 320 pixels)


HD video → mobile phone

Bitrate: 24 Mbit/s

Distortions caused by scaling (aspect ratio)

Goals of media retargeting

Shrink photos and videos for presentation on a mobile phone (this automatically limits the bitrate)

Keep aspect ratio

Preserve the most important visual content

Algorithms for image and video retargeting



Shrink image (merge pixels) by a fixed scale factor (uniform scaling)

Different scale factors for each axis change the aspect ratio (non-uniform scaling)

Relevance of image content is ignored

"Letterboxing" is used to preserve aspect ratio

Example:
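A minimal sketch of uniform scaling with letterboxing, assuming the display is given as a width and height in pixels. The function name `letterbox` and its return layout are illustrative choices, not part of the slides.

```python
def letterbox(src_w, src_h, dst_w, dst_h):
    """Uniformly scale a source image so it fits into the target display.

    Returns the scale factor, the scaled size, and the padding (black bars)
    on the left/right and top/bottom that keeps the aspect ratio intact.
    """
    # One scale factor for both axes: the smaller ratio guarantees a fit.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    # Remaining space is split evenly into bars on both sides.
    pad_x = (dst_w - new_w) // 2
    pad_y = (dst_h - new_h) // 2
    return scale, (new_w, new_h), (pad_x, pad_y)


if __name__ == "__main__":
    # HD video shown on a 320 x 240 display.
    print(letterbox(1920, 1080, 320, 240))
```

Using two different factors (`dst_w / src_w` and `dst_h / src_h`) instead would fill the display but distort the aspect ratio.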


Crop image borders until aspect ratios of image and display match

Relevance of image content is ignored: important content may be lost

Typically use scaling to convert to target size

Example:
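A small sketch of border cropping, assuming the crop is centered (the slides do not say which borders are removed). The name `center_crop_box` is hypothetical.

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (left, top, width, height) of a centered crop whose aspect
    ratio matches the target display. Content is ignored entirely."""
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the display: cut left and right borders.
        crop_w, crop_h = round(src_h * dst_w / dst_h), src_h
    else:
        # Source is taller than the display: cut top and bottom borders.
        crop_w, crop_h = src_w, round(src_w * dst_h / dst_w)
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, crop_w, crop_h


if __name__ == "__main__":
    # 16:9 HD frame cropped to 4:3 before scaling to 320 x 240.
    print(center_crop_box(1920, 1080, 320, 240))
```

The cropped region can then be scaled uniformly to the target size, since both now share the same aspect ratio.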


Idea

Identify most relevant image regions (regions of interest)

Crop borders but preserve regions of interest

Use automatic algorithms to identify regions of interest:

◦ Saliency maps

◦ Faces

◦ Text regions


Assumption: image regions that are relevant for an observer have a high contrast

Step 1: Contrast map of an image of size n × m:

$C_{i,j} = \sum_{q \in \Theta} d(p_{i,j}, q)$

color of a pixel: $p_{i,j}$

pixel in the local neighborhood $\Theta$ of $p_{i,j}$: $q$

distance function: $d(\cdot)$

Step 2: Quantize contrast map

Step 3: Find connected regions

Step 4: Mark region of interest
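The four steps can be sketched as follows. This is a simplified illustration, not the method of the cited paper: it assumes a grayscale image, uses the absolute luminance difference as distance function, quantizes with a single threshold instead of fuzzy growing, and marks only the largest connected region. All function names and the `threshold_ratio` parameter are assumptions.

```python
from collections import deque

import numpy as np


def contrast_map(gray, radius=1):
    """Step 1: sum of absolute differences to all pixels in the neighborhood."""
    g = np.asarray(gray, dtype=float)
    h, w = g.shape
    padded = np.pad(g, radius, mode="edge")
    c = np.zeros_like(g)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            shifted = padded[radius + dy:radius + dy + h,
                             radius + dx:radius + dx + w]
            c += np.abs(g - shifted)
    return c


def largest_region_bbox(mask):
    """Steps 3 and 4: bounding box (top, left, bottom, right) of the
    largest 4-connected region in a binary mask, or None if it is empty."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    best = []
    for y in range(h):
        for x in range(w):
            if not mask[y, x] or seen[y, x]:
                continue
            region, queue = [], deque([(y, x)])
            seen[y, x] = True
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    if not best:
        return None
    ys, xs = zip(*best)
    return min(ys), min(xs), max(ys), max(xs)


def region_of_interest(gray, threshold_ratio=0.5):
    c = contrast_map(gray)
    mask = c > threshold_ratio * c.max()  # Step 2: binary quantization
    return largest_region_bbox(mask)
```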


*Source: Ma and Zhang HJ: Contrast-based image attention analysis by using fuzzy growing, ACM Intl. Conf. on Multimedia, 2003

[Figure: contrast map, quantized contrast map, region of interest, bounding box]


Use automatic face detection algorithms to localize face regions

Frontal face detection algorithms are very robust (in contrast to face recognition)


Characteristic features of text:

◦ horizontal alignment

◦ significant luminance difference between text and background

◦ the character size is within a certain range

◦ single-colored

◦ text is visible in consecutive frames (video)

◦ horizontal or vertical motion is possible (video)

Calculate a horizontal projection profile to detect the boundaries of text lines
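A minimal sketch of the projection profile, assuming the image has already been binarized so that text pixels are 1 and background pixels are 0. The function name and the `min_pixels` parameter are illustrative.

```python
import numpy as np


def text_line_boundaries(binary, min_pixels=1):
    """Return (first_row, last_row) for every run of rows whose horizontal
    projection (number of text pixels in the row) reaches min_pixels."""
    profile = np.asarray(binary).sum(axis=1)  # one value per image row
    is_text = profile >= min_pixels
    lines, start = [], None
    for y, flag in enumerate(is_text):
        if flag and start is None:
            start = y                      # a text line begins
        elif not flag and start is not None:
            lines.append((start, y - 1))   # the text line ended one row above
            start = None
    if start is not None:                  # text line touches the bottom border
        lines.append((start, len(is_text) - 1))
    return lines
```

The same idea applied along the other axis (a vertical projection inside one text line) could separate words or characters.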


Calculate importance value V for each region of size H:

minimum perceptible size: $H_{\min}$

maximum reasonable size: $H_{\max}$

Find optimal target region W based on regions of interest $S_i$:
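One possible way to search for a target region, as a rough sketch under strong assumptions: each region of interest already carries an importance value, the window W has a fixed size, and the score of a window is simply the summed importance of all regions that lie completely inside it. This scoring rule and the name `best_crop_window` are assumptions for illustration and may differ from the criterion used in the lecture.

```python
def best_crop_window(img_w, img_h, win_w, win_h, regions, step=1):
    """Exhaustively slide a win_w x win_h window over the image.

    regions: list of (left, top, right, bottom, importance).
    Returns ((left, top), score) of the first window with maximal score.
    """
    best_score, best_pos = -1.0, (0, 0)
    for top in range(0, img_h - win_h + 1, step):
        for left in range(0, img_w - win_w + 1, step):
            score = sum(
                v for (l, t, r, b, v) in regions
                if l >= left and t >= top
                and r <= left + win_w and b <= top + win_h
            )
            if score > best_score:
                best_score, best_pos = score, (left, top)
    return best_pos, best_score
```

If the best window is still larger than the display, it can be scaled down uniformly afterwards (crop & scale).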


[Figure: selection of one feature, combination of two features, … three features, full image]


[Figure: scaling, cropping, crop & scale]


Automatically detect:

Court lines

Players

Ball

[Figure: scaled video, modified video content]


*Source: Kopf, Guthier, Farin, Han: Analysis and Retargeting of Ball Sports Video, IEEE Workshop on Applications of Computer Vision, 2011

Step 1: Mark bright pixels (line pixels)

Step 2: Algorithm to detect straight lines (based on RANSAC)

1. Randomly select two line pixels and calculate line parameters

2. Count number of white pixels N located on line

3. If N > threshold, stop

4. Otherwise, go to 1.

Step 3: Remove line pixels and detect next line (Step 2)
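Steps 2 and 3 can be sketched as follows, assuming the bright pixels from Step 1 are given as a list of (x, y) coordinates. The parameters `distance_tol`, `max_iterations` and `max_lines` are assumptions added so that the loop always terminates; they are not specified on the slide.

```python
import numpy as np


def ransac_line(points, inlier_threshold, distance_tol=1.0,
                max_iterations=1000, rng=None):
    """Return (p, q, inlier_mask) for the first sampled line that has more
    than inlier_threshold pixels within distance_tol, or None."""
    rng = np.random.default_rng() if rng is None else rng
    pts = np.asarray(points, dtype=float)
    for _ in range(max_iterations):
        # 1. Randomly select two line pixels and derive the line parameters.
        i, j = rng.choice(len(pts), size=2, replace=False)
        p, q = pts[i], pts[j]
        direction = q - p
        length = np.hypot(direction[0], direction[1])
        if length == 0:
            continue  # identical points define no line
        normal = np.array([-direction[1], direction[0]]) / length
        # 2. Count pixels N located on (close to) the line.
        inliers = np.abs((pts - p) @ normal) <= distance_tol
        # 3. Stop as soon as enough pixels support the line.
        if inliers.sum() > inlier_threshold:
            return p, q, inliers
    return None


def detect_lines(points, inlier_threshold, max_lines=10, **kwargs):
    """Step 3: remove the pixels of each detected line, then search again."""
    pts = np.asarray(points, dtype=float)
    lines = []
    while len(pts) >= 2 and len(lines) < max_lines:
        result = ransac_line(pts, inlier_threshold, **kwargs)
        if result is None:
            break
        p, q, inliers = result
        lines.append((p, q))
        pts = pts[~inliers]
    return lines
```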


RANSAC: Fischler, Bolles: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM, vol. 24(6), 1981.

Problem: The positions of lines change from frame to frame

Solution: use a reference court model to estimate camera motion

◦ Step 1: Calculate intersection points of two lines

◦ Step 2: Transform lines to court model

How many intersection points do we need for the transformation?


Translation (horizontal/vertical shift)

1 intersection point


Translation and scaling

2 intersection points

Affine transform (translation, scaling, rotation, shearing)

3 intersection points


Perspective transform

4 intersection points
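A perspective transform has eight free parameters, and each point correspondence contributes two equations, so four intersection points are sufficient. The sketch below estimates the 3 × 3 matrix with the direct linear transform using NumPy only; it assumes that no three of the four points are collinear and that the matrix can be normalized by its last entry. The function names are illustrative.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate H (3x3) such that dst ~ H * src from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence (direct linear transform).
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def apply_homography(h, points):
    """Map 2D points with H using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    hom = np.hstack([pts, np.ones((len(pts), 1))]) @ h.T
    return hom[:, :2] / hom[:, 2:3]
```

In the court-model setting, `src` would hold intersection points detected in a frame and `dst` the corresponding points of the reference court model.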


[Figure: cropping, scaling, crop & scale (zoom on largest player), modify lines & ball]



If important content is located near the image borders, crop & scale is not applicable

Idea of seam carving*

Systematic removal of less important pixels

Use an energy function as a measure of the "importance" of single pixels

*Source: Shai Avidan and Ariel Shamir: Seam Carving for Content-Aware Image Resizing. ACM SIGGRAPH, 2007


Image width should be reduced by 40 percent

[Figure: original image, energy map]


Remove N pixels with the lowest energy from each line


[Figure: source image; result after removing N = 200 pixels from each line based on energy values]

Summarize energy in each column of the image and remove N columns with lowest energy
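The two naive strategies (removing the N lowest-energy pixels from each line, and removing the N lowest-energy columns) can be sketched in a few lines. The sketch assumes a grayscale image and a precomputed energy map of the same shape; the function names are hypothetical.

```python
import numpy as np


def remove_lowest_energy_pixels_per_row(img, energy, n):
    """Drop the n lowest-energy pixels of every row independently.
    Rows shift by different amounts, which tears the image apart."""
    order = np.argsort(energy, axis=1, kind="stable")
    keep = np.sort(order[:, n:], axis=1)  # surviving column indices per row
    return np.take_along_axis(img, keep, axis=1)


def remove_lowest_energy_columns(img, energy, n):
    """Drop the n columns with the lowest summed energy."""
    column_energy = energy.sum(axis=0)
    drop = np.argsort(column_energy, kind="stable")[:n]
    keep = np.ones(energy.shape[1], dtype=bool)
    keep[drop] = False
    return img[:, keep]
```

Per-row removal ignores the vertical structure of the image, while column removal keeps the structure but cannot route around important objects.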

[Figure: original image; result after removing 200 columns based on energy values of columns]


A vertical seam is an 8-connected path of pixels from top to bottom that contains one and only one pixel in each row.

Formal definition:

$s^{x} = \{s_i^{x}\}_{i=1}^{n} = \{(x(i), i)\}_{i=1}^{n}, \quad \text{subject to } \forall i: |x(i) - x(i-1)| \le 1$

Horizontal seams are defined analogously.

Advantage of seams compared to columns or rows:

◦ Pixels of low energy are removed

◦ Relevant objects are preserved


Remove the vertical seam with the lowest energy

Repeat this step N times


[Figure: source image; result after removing N = 200 seams based on lowest energy]

Seam carving uses an energy function that characterizes the relevance of each pixel (similar to saliency maps).

The optimal seam minimizes the cumulated pixel energy of all seam pixels.

Method to find optimal seam: dynamic programming


M ( i, j ) specifies the cost of the optimal (vertical) seam from the upper image border to pixel position (i, j )

Calculate M( i, j ) recursively:

$M(i, j) = e(i, j) + \min\big(M(i-1, j-1),\ M(i-1, j),\ M(i-1, j+1)\big)$


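Example of how to calculate the optimal seam:

energy map e(i, j):

    2  5  1  4
    1  2  3  4
    1  2  3  3
    5  4  4  1

cumulated energy map M(i, j):

    2  5  1  4
    3  3  4  5
    4  5  6  7
    9  8  9  7

The optimal seam is found by backtracking from the minimum of the last row of M. It passes through the cumulated values 1, 3, 6, 7 (top to bottom).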
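The dynamic programming step can be sketched as follows, run on a small 4 × 4 energy map. The sketch assumes the energy map is a 2D array with rows indexed by i and columns by j; the function names are illustrative.

```python
import numpy as np


def cumulative_energy(e):
    """M(i, j) = e(i, j) + min of the (up to) three neighbors in row i - 1."""
    m_map = np.asarray(e, dtype=float).copy()
    n, m = m_map.shape
    for i in range(1, n):
        for j in range(m):
            lo, hi = max(j - 1, 0), min(j + 1, m - 1)
            m_map[i, j] += m_map[i - 1, lo:hi + 1].min()
    return m_map


def optimal_vertical_seam(e):
    """Return the column index of the seam pixel for each row, top to bottom."""
    m_map = cumulative_energy(e)
    n, m = m_map.shape
    seam = [int(np.argmin(m_map[-1]))]  # start at the minimum of the last row
    for i in range(n - 2, -1, -1):      # backtrack towards the top row
        j = seam[-1]
        lo, hi = max(j - 1, 0), min(j + 1, m - 1)
        seam.append(lo + int(np.argmin(m_map[i, lo:hi + 1])))
    return seam[::-1]


def remove_vertical_seam(img, seam):
    """Delete one pixel per row; works for grayscale and color images."""
    n, m = img.shape[:2]
    keep = np.ones((n, m), dtype=bool)
    keep[np.arange(n), seam] = False
    return img[keep].reshape((n, m - 1) + img.shape[2:])


if __name__ == "__main__":
    energy = np.array([[2, 5, 1, 4],
                       [1, 2, 3, 4],
                       [1, 2, 3, 3],
                       [5, 4, 4, 1]])
    print(cumulative_energy(energy))
    print(optimal_vertical_seam(energy))
```

Removing N seams means repeating this N times, recomputing the energy and cumulated energy maps after each removal.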

Image gradient: simple energy function that calculates the luminance difference to adjacent pixels:

$e(I(x, y)) = \left|\frac{\partial}{\partial x} I(x, y)\right| + \left|\frac{\partial}{\partial y} I(x, y)\right|$

Assumption: Luminance values do not differ much in image regions of low relevance

This simple energy function gives good results in many cases
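A minimal sketch of the gradient energy, assuming a grayscale (luminance) image and forward differences as the discrete approximation of the partial derivatives. Other discretizations (central differences, Sobel filters) are equally plausible.

```python
import numpy as np


def gradient_energy(gray):
    """e(I) = |dI/dx| + |dI/dy| with forward differences.
    The last column/row is repeated so the output keeps the input shape."""
    g = np.asarray(gray, dtype=float)
    dx = np.abs(np.diff(g, axis=1, append=g[:, -1:]))
    dy = np.abs(np.diff(g, axis=0, append=g[-1:, :]))
    return dx + dy
```

Homogeneous regions get an energy close to zero, so seams are routed through them first.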


Problem: The lighthouse is an important region, but the pixel values are very similar

[Figure: original image, optimal seams, result]


Combine energy function with saliency map

$e_{sal}(I(x, y)) = w_s \cdot \mathit{saliency}(x, y) + e(I(x, y))$

[Figure: saliency map, optimal seams, result ($e_{sal}$ is used as energy function)]

Source: Hwang and Chien. Content-Aware Image Resizing using Perceptual Seam Carving with Human Attention Model. IEEE Conference on Multimedia and Expo, 2008.


Use results from face detection as additional saliency:

$e_{sal+face}(I(x, y)) = w_s \cdot \mathit{saliency}(x, y) + w_f \cdot \mathit{face}(x, y) + e(I(x, y))$

[Figure: saliency map, face map, seams based on $e_{sal+face}$ as energy function, result]
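The weighted combination is a plain per-pixel sum. The sketch below assumes that the gradient energy, the saliency map and the optional face map are arrays of identical shape; the default weights of 1.0 are placeholders, since the slides do not give values.

```python
import numpy as np


def combined_energy(gradient, saliency, face=None, w_s=1.0, w_f=1.0):
    """e_sal = w_s * saliency + e, optionally extended by w_f * face."""
    energy = np.asarray(gradient, dtype=float) + w_s * np.asarray(saliency, dtype=float)
    if face is not None:
        energy = energy + w_f * np.asarray(face, dtype=float)
    return energy
```

A large `w_f` on a binary face map effectively forbids seams from crossing detected faces as long as cheaper paths exist.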


The quality of seam carving drops significantly in case of straight lines

[Figure: original image, seam carving (width reduced by 40%)]

Source: Kiess, Kopf, Guthier and Effelsberg: Seam Carving with Improved Edge Preservation. Proc. of IS&T/SPIE Electronic Imaging, 2010.


Problem: lines become distorted when seams are removed

[Figure: image section visualizing a straight line; seams intersect a straight line; result after removal of seams]


This is especially critical when several seams intersect a line in adjacent pixel positions

Idea: Distribute intersection points of seams and lines along the line

[Figure: seams intersect a line in adjacent pixel positions (result: errors are clearly visible); equal distribution of intersection points (result: errors are much less obvious)]


Implementation: modify energy function before the next optimal seam is calculated

Intersection point of seam and line: increase energy values in a certain radius

The following seams will avoid these pixels
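A possible sketch of this energy modification, assuming straight lines are available as a binary mask and the seam is given as one column index per row. The `radius` and `penalty` values are illustrative; the cited paper may choose them differently.

```python
import numpy as np


def protect_line_intersections(energy, seam, line_mask, radius=3, penalty=1000.0):
    """Raise the energy in a disc around every pixel where the seam
    crosses a line pixel, so the following seams avoid that area.

    Apply this to the energy map before the seam is removed from it,
    because the coordinates refer to the image that still contains the seam.
    """
    out = np.asarray(energy, dtype=float).copy()
    n, m = out.shape
    ys, xs = np.mgrid[0:n, 0:m]
    for i, j in enumerate(seam):
        if line_mask[i, j]:  # the seam intersects a straight line here
            near = (ys - i) ** 2 + (xs - j) ** 2 <= radius ** 2
            out[near] += penalty
    return out
```

Because later seams are pushed away from earlier intersection points, the intersections spread out along the line instead of clustering.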

[Figure: seam intersects a straight line; modify energy function next to the intersection point; detect next seams and modify energy function for each intersection]


[Figure: original image, seam carving, seam carving with line preservation]


1. Idea: Use seam carving on each frame separately → video becomes blurred and shaky

[Figure: original, adapted]

Improvements

Video defines a 3D space-time volume (3D cube)

Remove 2D seam manifolds (seam surface areas) where each seam pixel is connected in 3D

Use graph cuts (max-flow min-cut) to detect optimal seam manifold

Source: Rubinstein, Shamir, Avidan: Improved seam carving for video retargeting. ACM Trans. Graph. 27, 3, 2008.

[Figure: space-time graph over frames 1 to N along the time axis, with source node and sink node; edges: energy between pixels]

Computational effort?

Idea

Create one image that aggregates the pixel values / energy values of all frames

Detect 1D seam in aggregated image

Map this seam back to all frames
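The idea can be sketched for the simplest case: a static camera, grayscale frames of equal size, and the summed gradient energy as aggregation. These are assumptions made to keep the example short; they do not cover camera motion, and the function names are hypothetical.

```python
import numpy as np


def gradient_energy(gray):
    g = np.asarray(gray, dtype=float)
    dx = np.abs(np.diff(g, axis=1, append=g[:, -1:]))
    dy = np.abs(np.diff(g, axis=0, append=g[-1:, :]))
    return dx + dy


def find_vertical_seam(energy):
    """Dynamic programming seam search; returns one column index per row."""
    m_map = np.asarray(energy, dtype=float).copy()
    n, m = m_map.shape
    for i in range(1, n):
        for j in range(m):
            lo, hi = max(j - 1, 0), min(j + 1, m - 1)
            m_map[i, j] += m_map[i - 1, lo:hi + 1].min()
    seam = [int(np.argmin(m_map[-1]))]
    for i in range(n - 2, -1, -1):
        j = seam[-1]
        lo, hi = max(j - 1, 0), min(j + 1, m - 1)
        seam.append(lo + int(np.argmin(m_map[i, lo:hi + 1])))
    return seam[::-1]


def carve_video_once(frames):
    """Remove the same vertical seam from every frame of a (T, H, W) video."""
    frames = np.asarray(frames, dtype=float)
    t, h, w = frames.shape
    # Aggregate the energy of all frames into one image.
    aggregated = sum(gradient_energy(frame) for frame in frames)
    # Detect a single seam in the aggregated image.
    seam = find_vertical_seam(aggregated)
    # Map the seam back: delete the same pixel positions in every frame.
    keep = np.ones((h, w), dtype=bool)
    keep[np.arange(h), seam] = False
    return frames[:, keep].reshape(t, h, w - 1), seam
```

Since every frame loses exactly the same pixels, the result cannot flicker between frames, and only one 2D seam search is needed per removed seam.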

Source: Kopf, Kiess, Lemelson, Effelsberg: FSCAV - Fast Seam Carving for Size Adaptation of Videos, ACM Intl. Conf. on Multimedia, 2009.


Problem: camera motion, zoom, panning

Use image registration techniques to calculate the parameters of the camera model (use perspective camera model)

Align frames and create a background image

Detect optimal seam in background image

Use inverse camera motion to transform optimal seam back to all original frames


Example: construct background image from a camera pan



[Figure: scaling, seam carving]


[Figure: scaling, fast seam carving]

The quality of adapted images or videos depends on the visual content. The results of crop & scale might be much better than seam carving or vice versa.

Crop & scale typically works well if the relevant content is located in a small region.

In case of large background areas, many seams with low energy are detected and the results based on seam carving are very good.


No technique works well if most of the content is highly relevant.

Would it be possible to find better energy functions for seams?

Would it be possible to preserve other geometric objects similar to straight lines?

Would it be possible to automatically evaluate the quality of adapted images or videos?

