CVPR 2007: Object category recognition

Part 4: Combined segmentation and recognition

by Rob Fergus (MIT)

Aim

• Given an image and an object category, segment the object

Segmentation should (ideally) be

• shaped like the object, e.g. cow-like

• obtained efficiently in an unsupervised manner

• able to handle self-occlusion

[Figure: object category model + cow image → segmentation → segmented cow.]

Slide from Kumar ‘05

Feature-detector view

Examples of bottom-up segmentation

• Using Normalized Cuts, Shi & Malik, 1997

Borenstein and Ullman, ECCV 2002

Jigsaw approach: Borenstein and Ullman, 2002

Interleaved Object Categorization and Segmentation

Implicit Shape Model - Leibe and Schiele, 2003

[Figure: ISM pipeline. Interest points → matched codebook entries → probabilistic voting → voting space (continuous) → backprojection of maxima → backprojected hypotheses → refined hypotheses (uniform sampling) → segmentation.]

Leibe and Schiele, 2003, 2005
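The probabilistic voting step can be sketched as a generalized Hough transform: each interest point matched to a codebook entry casts weighted votes for the object center, and maxima in the voting space become hypotheses. This is a minimal sketch, not Leibe and Schiele's implementation. The codebook contents, the discretized bins (the slide's voting space is continuous), and the names `CODEBOOK`, `cast_votes` and `best_hypothesis` are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical codebook: entry id -> list of (dx, dy, weight) offsets from the
# feature location to the object center, as would be learned from training images.
CODEBOOK = {
    "wheel": [(30, -10, 0.5), (-30, -10, 0.5)],
    "roof": [(0, 20, 1.0)],
}


def cast_votes(matches, bin_size=10):
    """Accumulate weighted votes for the object center.

    matches: list of (x, y, entry_id) for interest points matched to the codebook.
    Returns a dict mapping a discretized center (bx, by) to its accumulated score.
    """
    votes = defaultdict(float)
    for x, y, entry in matches:
        for dx, dy, weight in CODEBOOK[entry]:
            cx, cy = x + dx, y + dy
            votes[(cx // bin_size, cy // bin_size)] += weight
    return votes


def best_hypothesis(votes):
    """Return the bin with the highest accumulated score, and that score."""
    return max(votes.items(), key=lambda item: item[1])


matches = [(70, 60, "wheel"), (130, 60, "wheel"), (100, 30, "roof")]
votes = cast_votes(matches)
center_bin, score = best_hypothesis(votes)
```

To backproject a maximum, one would additionally store which matches voted for each bin; that bookkeeping is omitted here.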

Random Fields for segmentation

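• I = image pixels (observed)

• h = foreground/background labels (hidden), one label per pixel

• θ = parameters

$$\underbrace{p(h \mid I, \theta)}_{\text{posterior}} \propto \underbrace{p(h, I \mid \theta)}_{\text{joint}} = \underbrace{p(I \mid h, \theta)}_{\text{likelihood}} \, \underbrace{p(h \mid \theta)}_{\text{prior}}$$

1. Generative approach models the joint: Markov random field (MRF)

2. Discriminative approach models the posterior directly: Conditional random field (CRF)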

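Generative Markov Random Field

[Figure: graphical model over the image plane. Pixels I at sites i, j; labels h_i, h_j ∈ {foreground, background}; unary potential φ_i(I | h_i, θ_i); pairwise potential (MRF) ψ_ij(h_i, h_j | θ_ij).]

$$p(h, I \mid \theta) = p(I \mid h, \theta) \, p(h \mid \theta) = \frac{1}{Z(\theta)} \underbrace{\prod_i \phi_i(I \mid h_i, \theta_i)}_{\text{likelihood}} \; \underbrace{\prod_{ij} \psi_{ij}(h_i, h_j \mid \theta_{ij})}_{\text{MRF prior}}$$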

Prior has no dependency on I
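To make the generative factorization concrete, the sketch below evaluates the unnormalized joint on a five-pixel 1-D "image" and finds the most probable labeling by brute force. The Gaussian unary model, the Potts prior strength, and all function names are assumptions chosen for illustration. The pairwise term never looks at the pixels, as stated above.

```python
import itertools
import math


def unary(pixel, label):
    """Likelihood of an intensity in [0, 1] under a label.

    Assumed toy model: foreground (1) is bright, background (0) is dark.
    """
    mean = 0.8 if label == 1 else 0.2
    return math.exp(-((pixel - mean) ** 2) / (2 * 0.2 ** 2))


def pairwise(h_i, h_j, strength=1.0):
    """Potts prior on neighboring labels; it does not depend on the image."""
    return math.exp(strength if h_i == h_j else 0.0)


def unnormalized_joint(pixels, labels):
    """Product of unary and pairwise potentials on a 1-D chain of pixels."""
    value = 1.0
    for pixel, label in zip(pixels, labels):
        value *= unary(pixel, label)
    for h_i, h_j in zip(labels, labels[1:]):
        value *= pairwise(h_i, h_j)
    return value


def map_labeling(pixels):
    """Brute-force search over all labelings; feasible only for a few pixels."""
    candidates = itertools.product([0, 1], repeat=len(pixels))
    return max(candidates, key=lambda h: unnormalized_joint(pixels, h))


pixels = [0.1, 0.55, 0.1, 0.9, 0.8]
independent = tuple(max((0, 1), key=lambda l: unary(p, l)) for p in pixels)
smoothed = map_labeling(pixels)
```

Labeling each pixel independently marks the ambiguous second pixel as foreground, while the MRF prior pulls it back to the label of its dark neighbors.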

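Conditional Random Field (Lafferty, McCallum and Pereira 2001)

$$p(h \mid I, \theta) = \frac{1}{Z(I, \theta)} \underbrace{\prod_i \phi_i(h_i, I \mid \theta_i)}_{\text{unary}} \; \underbrace{\prod_{ij} \psi_{ij}(h_i, h_j, I \mid \theta_{ij})}_{\text{pairwise}}$$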

• Dependency on I allows the introduction of pairwise terms that make use of the image.

• For example, neighboring labels should be similar only if pixel colors are similar (contrast term)
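The slide does not give a formula for the contrast term. A common choice, assumed here, is to charge a label disagreement exp(-β‖I_i − I_j‖²), so that cutting between similar colors is expensive and cutting across a strong color edge is nearly free. The function names and the value of `beta` are illustrative.

```python
import math


def contrast_weight(color_i, color_j, beta=10.0):
    """exp(-beta * squared color distance), for RGB channels in [0, 1]."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(color_i, color_j))
    return math.exp(-beta * sq_dist)


def pairwise_cost(h_i, h_j, color_i, color_j, beta=10.0):
    """Cost of a neighboring label pair; zero when the labels agree."""
    if h_i == h_j:
        return 0.0
    return contrast_weight(color_i, color_j, beta)


similar = pairwise_cost(0, 1, (0.50, 0.5, 0.5), (0.52, 0.5, 0.5))
across_edge = pairwise_cost(0, 1, (0.1, 0.1, 0.1), (0.9, 0.9, 0.9))
same_label = pairwise_cost(1, 1, (0.1, 0.1, 0.1), (0.9, 0.9, 0.9))
```

Here `similar` is close to 1 and `across_edge` is close to 0, so a label boundary is encouraged to follow image edges.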

Discriminative approach, e.g. Kumar and Hebert 2003

[Figure: CRF graphical model with sites i, j and labels h_i, h_j.]

Figure from Kumar et al., CVPR 2005

OBJCUT

Ω (shape parameter)

Kumar, Torr & Zisserman 2005

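$$p(h \mid I, \Omega, \theta) \propto \underbrace{\prod_i \phi_1(I \mid h_i, \theta_i) \, \phi_2(h_i \mid \Omega)}_{\text{unary}} \; \underbrace{\prod_{ij} \psi_1(h_i, h_j \mid \theta_{ij}) \, \psi_2(I \mid h_i, h_j, \theta_{ij})}_{\text{pairwise}}$$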

• Ω is a shape prior on the labels from a Layered Pictorial Structure (LPS) model

• Segmentation by:

- Matching the LPS model to the image (giving a number of samples, each with a different pose)

- Marginalizing over the samples using a single graph cut [Boykov & Jolly, 2001]

Unary terms: color likelihood, distance from Ω

Pairwise terms: label smoothness, contrast
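The graph-cut step can be illustrated with a small s-t min-cut on a chain of pixels. This is a generic sketch, not the OBJCUT code. The per-pixel costs stand in for the combined unary terms (color likelihood plus distance from Ω) and are given as plain numbers, the edge weights stand in for the pairwise terms, and `graph_cut` is a hypothetical helper that uses a simple augmenting-path max-flow.

```python
from collections import deque


def graph_cut(cost_fg, cost_bg, edges):
    """Binary labeling by s-t min-cut (1 = foreground, 0 = background).

    cost_fg[i], cost_bg[i]: cost of labeling pixel i foreground / background.
    edges: list of (i, j, w); w is paid when neighbors i and j get different labels.
    """
    n = len(cost_fg)
    source, sink = n, n + 1
    cap = [{} for _ in range(n + 2)]

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0.0) + c
        cap[v].setdefault(u, 0.0)  # residual edge

    for i in range(n):
        add_edge(source, i, cost_bg[i])  # cut if i ends up on the sink side
        add_edge(i, sink, cost_fg[i])    # cut if i ends up on the source side
    for i, j, w in edges:
        add_edge(i, j, w)
        add_edge(j, i, w)

    def search():
        """Breadth-first search over edges with remaining capacity."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = search()
        if sink not in parent:
            break
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow

    # Pixels still reachable from the source are labeled foreground.
    return [1 if i in parent else 0 for i in range(n)]


cost_fg = [6.125, 0.78125, 6.125, 0.125, 0.0]
cost_bg = [0.125, 1.53125, 0.125, 6.125, 4.5]
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
labels = graph_cut(cost_fg, cost_bg, edges)
```

The second pixel is slightly cheaper as foreground on its own, but the pairwise costs make the smooth labeling the minimum cut.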

OBJCUT: Shape prior Ω, Layered Pictorial Structures (LPS)

• Generative model

• Composition of parts + spatial layout

[Figure: LPS model with parts in Layer 1 and Layer 2, and their spatial layout (pairwise configuration).]

Parts in Layer 2 can occlude parts in Layer 1

Kumar, et al. 2004, 2005

OBJCUT: Results using LPS model for cow

In the absence of a clear boundary between object and background

[Figure: image and segmentation.]

Levin & Weiss [ECCV 2006]

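$$E(h; I) = \underbrace{\sum_i \left| h_i - h_{F_i, I} \right|}_{\text{consistency with fragments' segmentation}} + \underbrace{\sum_{ij} w(i, j) \left| h_i - h_j \right|}_{\text{segmentation alignment with image edges}}$$

[Figure: resulting min-cut segmentation.]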
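The two terms named on this slide, consistency with the fragments' segmentation and alignment with image edges, can be evaluated for a candidate labeling as follows. This is one reading of the energy, with absolute differences assumed for both terms. The dictionaries and the name `segmentation_energy` are illustrative, not from Levin and Weiss.

```python
def segmentation_energy(h, fragment_labels, weights):
    """Top-down term plus bottom-up term for a candidate labeling.

    h: dict pixel -> label in {0, 1}.
    fragment_labels: dict pixel -> label suggested by a detected fragment's mask.
    weights: dict (i, j) -> affinity w(i, j), small where an image edge
        separates pixels i and j.
    """
    top_down = sum(abs(h[i] - f) for i, f in fragment_labels.items())
    bottom_up = sum(w * abs(h[i] - h[j]) for (i, j), w in weights.items())
    return top_down + bottom_up


# Four pixels in a row, with an image edge between pixels 1 and 2.
weights = {(0, 1): 1.0, (1, 2): 0.1, (2, 3): 1.0}
fragment_labels = {0: 1, 1: 1, 3: 0}

aligned = segmentation_energy({0: 1, 1: 1, 2: 0, 3: 0}, fragment_labels, weights)
misaligned = segmentation_energy({0: 1, 1: 0, 2: 0, 3: 0}, fragment_labels, weights)
```

The labeling that switches at the image edge and agrees with the fragment masks has the lower energy.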

[Lepetit et al. CVPR 2005]

• Decision forest classifier

• Features are differences of pixel intensities

Classifier
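A decision forest over pixel-intensity differences can be sketched as below. Each internal node compares the difference of two pixels in a patch to a threshold, and the forest averages the class distributions stored at the leaves it reaches. The hand-built trees, the class names, and the function names are assumptions for illustration; a real forest would be learned from training patches.

```python
def pixel_difference(patch, a, b):
    """Feature: intensity at (row, col) position a minus intensity at position b."""
    return patch[a[0]][a[1]] - patch[b[0]][b[1]]


def classify_tree(tree, patch):
    """Walk one tree.

    Internal nodes are tuples (a, b, threshold, left, right);
    leaves are dicts mapping class name -> probability.
    """
    while isinstance(tree, tuple):
        a, b, threshold, left, right = tree
        tree = left if pixel_difference(patch, a, b) < threshold else right
    return tree


def classify_forest(trees, patch):
    """Average the leaf distributions over all trees."""
    totals = {}
    for tree in trees:
        for label, prob in classify_tree(tree, patch).items():
            totals[label] = totals.get(label, 0.0) + prob / len(trees)
    return totals


# Hypothetical two-tree forest with made-up leaf distributions.
tree_1 = ((1, 0), (1, 2), 0, {"edge": 0.9, "flat": 0.1}, {"edge": 0.2, "flat": 0.8})
tree_2 = ((0, 2), (2, 2), 50, {"edge": 0.6, "flat": 0.4}, {"edge": 0.5, "flat": 0.5})

patch = [
    [10, 10, 200],
    [10, 10, 200],
    [10, 10, 200],
]
result = classify_forest([tree_1, tree_2], patch)
```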

Winn and Shotton 2006

Layout Consistent Random Field

Layout consistency

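[Figure: layout consistency. Part labels are grid coordinates, e.g. rows (7,2) (8,2) (9,2), (7,3) (8,3) (9,3) and (7,4) (8,4) (9,4). For neighboring pixels, a pixel labeled (p,q) is layout-consistent with a neighbor labeled (p,q), (p-1,q+1), (p,q+1) or (p+1,q+1).]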
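The layout-consistency test for a pair of neighboring pixels can be written as a small predicate. The slide does not say which neighbor direction the listed labels apply to. The sketch assumes the neighbor is the pixel directly below and that q increases downward. The function name is hypothetical.

```python
def consistent_with_pixel_below(label, label_below):
    """True if two vertically adjacent part labels are layout-consistent.

    Labels are (p, q) grid coordinates of object parts. The pixel below may
    carry the same part, or a part from the next grid row in an adjacent column.
    """
    p, q = label
    allowed = {(p, q), (p - 1, q + 1), (p, q + 1), (p + 1, q + 1)}
    return label_below in allowed


same_part = consistent_with_pixel_below((8, 3), (8, 3))
next_row = consistent_with_pixel_below((8, 3), (9, 4))
previous_row = consistent_with_pixel_below((8, 3), (8, 2))
unrelated = consistent_with_pixel_below((8, 3), (3, 7))
```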


Layout Consistent Random Field

Layout consistency

Part detector


Stability of part labelling

Part color key

Object-Specific Figure-Ground Segregation

Stella X. Yu and Jianbo Shi, 2002

Image parsing: Tu, Zhu and Yuille 2003


Overview: segment out all the cars

[Figure labels: training images, multiscale segmentation, segmentation trees, fused tree model for cars, unseen image, segmented cars.]

Todorovic and Ahuja, CVPR 2006

Slide from T. Wu

LOCUS model

[Figure: LOCUS graphical model.]

Shared between images:

• Class shape π

• Class edge sprite μo, σo

Different for each image:

• Deformation field D

• Position & size T

• Mask m

• Edge image e

• Object appearance λ1

• Background appearance λ0

• Image

Kannan, Jojic and Frey 2004; Winn and Jojic, 2005

In this section: brief paper reviews

• Jigsaw approach: Borenstein & Ullman, 2001, 2002

• Concurrent recognition and segmentation: Yu and Shi, 2002

• Image parsing: Tu, Zhu & Yuille 2003

• Interleaved segmentation: Leibe & Schiele, 2004, 2005

• OBJCUT: Kumar, Torr, Zisserman 2005

• LOCUS: Winn and Jojic, 2005

• LayoutCRF: Winn and Shotton, 2006

• Levin and Weiss, 2006

• Todorovic and Ahuja, 2006

Summary

• Strengths

– Explains every pixel of the image

– Useful for image editing, layering, etc.

• Issues

– Invariance issues, especially scale and viewpoint variations

– Inference difficulties

Conditional Random Fields for Segmentation

• Segmentation map x

• Image I

Terms: low-level pairwise term (pixel-wise similarity) and high-level local term

Object-Specific Figure-Ground Segregation

Some segmentation/detection results

Yu and Shi, 2002

• Multiscale Conditional Random Fields for Image Labeling: Xuming He, Richard S. Zemel, Miguel Á. Carreira-Perpiñán

• Conditional Random Fields for Object Recognition: Ariadna Quattoni, Michael Collins, Trevor Darrell

OBJCUT

The probability of a labelling additionally has

• a unary potential that depends on the distance from Θ (shape parameter)

[Figure: object category specific MRF over the image plane, with pixels D at sites x and y, labels m_x and m_y, shape parameter Θ, and unary potential Φ_x(m_x | Θ).]

Kumar, et al. 2004, 2005

Localization using features


Levin and Weiss, ECCV 2006

Results: horses


Cows: Results

• Segmentations from interest points

Single-frame recognition - No temporal continuity used!

Leibe and Schiele, 2003, 2005

Examples of low-level image segmentation

• Normalized Cuts, Shi & Malik, 1997

Borenstein & Ullman, ECCV 2002

LayoutCRF