Page 1

A SYNTAX FOR IMAGE UNDERSTANDING

Narendra Ahuja

University of Illinois at Urbana-Champaign

May 21, 2009

Work done with Sinisa Todorovic, Mark Tabb, Himanshu Arora, Varsha Hedau, Bernard Ghanem, Tim Cheng

Page 2

The Question

What is a good low-level image representation to enable Object Recognition, Reasoning, Synthesis, ...?

Page 3

What is an Object?

Object = Layout of parts with some Intrinsic Properties

e.g., Wall = Layout of Doors, Windows …

Each Part is itself a (simpler) Object ⇒ Object = Hierarchy

e.g., Building → Wall → Windows ...

Object Complexity = Complexity of parts/hierarchy and layout

e.g., Building comprised of bricks

Page 4

What is Not an Object

A crowded city street

A serene landscape

Allowed, but not for today, to keep the focus on more immediate issues

Page 5

The Scene Object

Scene = Layout of Objects

= Hierarchical Layout

Page 6

From Scene to Image

Imaging Preserves Localization

Image = Hierarchical Layout of Regions

Page 7

Image vs. Objects

Image

Subimages of Parts

Smallest Subimages

(= Smallest parts)

Image

Simpler Objects

Primitive Objects

Page 8

Recognition and Segmentation

Do not have access to windows containing only the object of interest

For model acquisition as well as subsequent recognition

Need to consider Simultaneous Segmentation and Modeling/Recognition

Page 9

Combinatorial Problem

Where is Which Object?

Too many possible subimages to be matched with object models

Circular problem

Reduce combinatorial complexity by reducing object/image size

Page 10

Parts are Simpler to Represent/Model

Smaller images/objects are likely to be easier to handle,

i.e., the Number of matching Object Models is likely to be Smaller

Primitive Objects have the Smallest Number of Candidate Models

Page 11

Object Representation is Recursive

Object = Arrangement of Parts

Characterized by three types of Properties: Photometric, Geometric, Topological

Each Part is sufficiently simple, or is an

Page 12

Breaking the Loop

Identify Candidate Subimages from a Hierarchical Partitioning of an Image,

i.e., a Multiscale, Low-Level Image Segmentation

Segments = Objects of different complexity

Page 13

Why Segments as Candidate Objects

Photometric Segments are useful estimates of objects

Because Object Boundary Almost Always = Photometric Boundary

Although Photometric Boundary May or May not = Object Boundary

Because Independent Objects ⇒ Independent shape, orientation, reflectance

Segment/Object Contour

Page 14

The Argument of Dimensionality

Segment dimensionality = 2D = Our object dimensionality

Segment information capacity matched with object

vs. Lower dimensional representations, e.g., Point features, Edge fragments

Although 3D still missing

Page 15

Extensibility

Due to more complete correspondence to parts, Segments

• Simplify analysis/reduce dependence on tools

• Offer greater promise for moving beyond the basic tasks of today,

e.g., to more complex objects, more abstract objects, context sensitivity...

Page 16

Representation Issues vs. Analysis Details

Will focus more on the representation issues and skip

Detailed tools to carry out the various tasks,

e.g., tools for: Probabilistic analysis, Structural analysis

Page 17

Image Representation

Image → Multiscale Segmentation → Homogeneous regions at ALL contrasts and sizes

Extract Hierarchical Layout of Regions

Region = Largest Homogeneous Set of Contiguous Pixels

Ahuja PAMI96, Tabb&Ahuja TIP97, Arora&Ahuja ICPR 06

Page 18

Example Segmentations for Several Contrasts in Photometric Hierarchy

Page 19

Image Representation = Segmentation Tree

Multiscale Segmentation → Segmentation Tree (of embedded regions)
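A minimal sketch of how a tree of embedded regions could be assembled once regions at several scales are available. It is not the segmentation algorithm of the cited papers: it assumes regions are already given as pixel sets, that they are properly nested, and that the largest region is the whole image. `Region`, `rect`, and `build_segmentation_tree` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    name: str
    pixels: frozenset          # set of (row, col) coordinates
    children: list = field(default_factory=list)


def rect(r0, r1, c0, c1):
    """Hypothetical helper: pixel set of a rectangle (end-exclusive)."""
    return frozenset((r, c) for r in range(r0, r1) for c in range(c0, c1))


def build_segmentation_tree(regions):
    """Attach each region to the smallest region that strictly contains it."""
    ordered = sorted(regions, key=lambda reg: len(reg.pixels))
    for i, reg in enumerate(ordered[:-1]):
        for candidate in ordered[i + 1:]:
            if reg.pixels < candidate.pixels:   # strict subset test
                candidate.children.append(reg)
                break
    return ordered[-1]                          # largest region is the root


# Toy regions (assumed, not from the talk): a wall with a window and a door.
regions = [
    Region("image", rect(0, 8, 0, 8)),
    Region("wall", rect(1, 7, 1, 7)),
    Region("window", rect(2, 4, 2, 4)),
    Region("pane", rect(2, 3, 2, 3)),
    Region("door", rect(4, 7, 5, 6)),
]
root = build_segmentation_tree(regions)
```

In this toy layout the tree reads image → wall → {window → pane, door}, mirroring the Building → Wall → Windows hierarchy used earlier.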

Page 20

Image Objects and Image Segmentation Tree

• Images ∋ Multiple Independent Objects

• Image Tree ∋ Multiple Independent Subtrees

• Each Object = One or More Subtrees

• Object Modeling = Capturing Object Subtrees

Page 21

Examples of Properties

• Photometric: Intensity contrast and variance

• Geometric:

    • Area, variance of children areas

    • 1st central moment, eccentricity

    • Squared perimeter over area

• Topological:

    • Angle between child and parent’s principal axes

    • Displacement of child centroids

    • Context vector: spatial distribution of sibling regions

• Todorovic&Ahuja PAMI07, IJCV07
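As an illustration of the geometric properties listed here, the sketch below computes area, centroid, eccentricity, squared perimeter over area, and principal-axis angle for a single region. It assumes a region is given as a set of (row, col) pixels, measures perimeter as the count of exposed pixel edges, and derives eccentricity from the second central moments. These are common conventions, not necessarily the ones used in the cited papers, and `region_properties` is a hypothetical name.

```python
import math


def region_properties(pixels):
    """pixels: set of (row, col) tuples forming one region."""
    area = len(pixels)
    # Perimeter: count pixel edges that face a pixel outside the region.
    perimeter = sum(
        (r + dr, c + dc) not in pixels
        for (r, c) in pixels
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )
    mean_r = sum(r for r, _ in pixels) / area
    mean_c = sum(c for _, c in pixels) / area
    # Second central moments, normalized by area.
    mu_rr = sum((r - mean_r) ** 2 for r, _ in pixels) / area
    mu_cc = sum((c - mean_c) ** 2 for _, c in pixels) / area
    mu_rc = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / area
    # Eigenvalues of the 2x2 moment matrix: spread along the principal axes.
    half_trace = (mu_rr + mu_cc) / 2
    root = math.sqrt(((mu_rr - mu_cc) / 2) ** 2 + mu_rc ** 2)
    major, minor = half_trace + root, half_trace - root
    eccentricity = math.sqrt(max(0.0, 1 - minor / major)) if major > 0 else 0.0
    return {
        "area": area,
        "centroid": (mean_r, mean_c),
        "eccentricity": eccentricity,
        "perimeter_sq_over_area": perimeter ** 2 / area,
        # Major-axis angle measured from the column axis, in radians.
        "axis_angle": 0.5 * math.atan2(2 * mu_rc, mu_cc - mu_rr),
    }


square = {(r, c) for r in range(4) for c in range(4)}
bar = {(r, c) for r in range(2) for c in range(6)}
square_props = region_properties(square)
bar_props = region_properties(bar)
```

The topological properties on this slide (for example, the angle between child and parent principal axes) could then be formed as differences of such per-region values between a node and its parent.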

Page 22

Modeling and Recognition = Subtree Matching

Discovery = Matching across image sets (frequency)

Modeling = Finding canonical tree of an object category, as pdf’s of properties and structure

Recognition = Probabilistic matching

All unsupervised
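To make "subtree matching" concrete, here is a deliberately simple sketch: two trees are scored by comparing node property vectors and greedily pairing children. This is an assumption-laden stand-in, not the matching procedure of the cited work; `Node`, `node_similarity`, and `match_score` are hypothetical names, and the property tuples are made-up values.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    props: tuple               # e.g. (relative area, contrast); illustrative only
    children: list = field(default_factory=list)


def node_similarity(a, b):
    """1.0 for identical property vectors, decaying with Euclidean distance."""
    return math.exp(-math.dist(a.props, b.props))


def match_score(a, b):
    """Similarity of the roots plus greedily paired children, recursively."""
    score = node_similarity(a, b)
    unused = list(b.children)
    for child in a.children:
        if not unused:
            break
        best = max(unused, key=lambda other: match_score(child, other))
        score += match_score(child, best)
        unused.remove(best)
    return score


horse_a = Node((1.0, 0.5), [Node((0.2, 0.4)), Node((0.1, 0.6))])
horse_b = Node((1.0, 0.5), [Node((0.1, 0.6)), Node((0.2, 0.4))])  # children reordered
other = Node((1.0, 0.5), [Node((0.9, 0.1))])
```

Here `match_score(horse_a, horse_b)` equals the self-match score even though the children are listed in a different order, while `other` scores lower. Greedy pairing is quadratic per node and re-evaluates subtrees, so it is only suitable for small illustrative trees.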

Page 23

Model from Multiple Instances of Objects in a Category

Aligning and Registering Category Occurrences

Sets of Matching Nodes

Page 24

Object Category Model = Stochastic Tree Structure

region properties

number of children

Object part (hidden)

Exponential Gaussian

Markovian chain structure + parameters

Each Node and Branch Probabilistically Determined

Page 25

Model = Grammar

Object Subtree Model

=

Tree of Probability Density Functions

=

Stochastic Grammar
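One way to read "tree of probability density functions" is that each model node carries a pdf over a region property, and an observed subtree is scored by summing log-densities over corresponding nodes. The sketch below assumes one scalar property per node, a Gaussian pdf, and a fixed structure with children in matching order. The model on the slides also treats structure probabilistically, which is not captured here. All names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ModelNode:
    mean: float
    std: float
    children: list = field(default_factory=list)


@dataclass
class ObservedNode:
    value: float
    children: list = field(default_factory=list)


def gaussian_logpdf(x, mean, std):
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def tree_loglik(model, observed):
    """Log-likelihood of an observed tree under a same-shaped tree of Gaussians."""
    if len(model.children) != len(observed.children):
        return float("-inf")   # structure mismatch: outside this simplified model
    total = gaussian_logpdf(observed.value, model.mean, model.std)
    for m, o in zip(model.children, observed.children):
        total += tree_loglik(m, o)
    return total


# Made-up parameters and observations, for illustration only.
model = ModelNode(1.0, 0.1, [ModelNode(0.3, 0.05), ModelNode(0.2, 0.05)])
typical = ObservedNode(1.0, [ObservedNode(0.3), ObservedNode(0.2)])
atypical = ObservedNode(1.4, [ObservedNode(0.1), ObservedNode(0.5)])
wrong_shape = ObservedNode(1.0, [ObservedNode(0.3)])
```

An observation whose node properties sit at the model means (`typical`) scores higher than one that deviates (`atypical`), and a tree with a different number of children is rejected outright in this simplified form.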

Page 26

From Model to Simultaneous Recognition and Segmentation

Inference = Matching image tree against the learned tree model

Page 27

Results: Weizmann Horses

training images

category model

Page 28

Results: Weizmann Horses

• Object segmentation is good on contours that:

    • Are jagged

    • Are blurred

    • Form complex patterns

• Low-contrast regions merge with background

Page 29

Recall and Precision

Page 30

Real World

> 30,000 categories

Page 31

Too Many Categories

• 30,000 independent models is not a good idea

Because the world is not full of unrelated things

a. Parts are shared among objects

b. In different configurations in different objects

c. So category representations are interrelated

d. This is directly reflected in the apparent organization of Human Knowledge/Semantics

Page 32

Any similar 2D objects?

Arbitrary Images

Category = Set of Similar 2D Objects

Categories Found

Page 33

Scaling Up Category Representation

• Categories = Configurations of Shared Subcategories

• Subcategories are simpler and smaller

• Robust detection

• Sharing = Sublinear complexity, Minimal computation

unshared

object parts
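A small counting sketch of why sharing subcategories helps: if each category is a configuration of parts, storing one model per distinct part grows more slowly than storing every part of every category independently. The categories and part names below are made up for illustration, and `model_storage` is a hypothetical helper.

```python
def model_storage(categories):
    """Return (part models stored independently, part models stored with sharing)."""
    independent = sum(len(parts) for parts in categories.values())
    shared = len(set().union(*categories.values()))
    return independent, shared


# Hypothetical part inventories for three hoofed-animal categories.
categories = {
    "deer": {"legs", "torso", "head", "antlers"},
    "horse": {"legs", "torso", "head", "mane"},
    "cow": {"legs", "torso", "head", "horns"},
}
independent, shared = model_storage(categories)
```

With these toy inventories, 12 independently stored part models reduce to 6 shared ones. Each added category contributes only its unshared parts, which is the intuition behind the sublinear growth claimed on the slide.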

Page 34

Multi-Category Representation = Taxonomy

• Interleaved Trees of

• Probability Density Functions of

• Tree Structures, and Tree Node Properties

Page 35

UIUC Hoofed Animals Dataset: Contains Six Animals

Page 36

Simultaneous Recognition and Segmentation

Page 37

Results: Animals

Simultaneous Detection, Recognition, Segmentation

Simultaneous Recognition and Segmentation

Page 38

Taxonomy Structure vs. Observed Category Statistics

Page 39

Not All Subcategories are Equally Informative

• So far

• P(Detection) = P(Match Quality)

• = Decision Making Based on Likelihood

• Uniform Priors on

• P(Subcat)

• P(Cat | Subcat)

Page 40

But Discovered Unshared Subcategories Provide More Evidence

If legs, then many possibilities

If antlers, then very likely deer

If lake, then very unlikely desert

Unshared Categories ⇒ Uniqueness

Page 41

Need Bayesian Detection

• During Training on Representative Datasets

• Estimate P(Cat)

• Estimate P(Cat | Subcat)

• Todorovic&Ahuja CVPR08
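A minimal sketch of the estimation step, assuming the training data can be reduced to a list of (category, detected subcategories) pairs and that simple relative frequencies are adequate estimates. The toy data is invented to echo the legs/antlers example, and `estimate_posteriors` is a hypothetical name. The estimator in the cited paper may differ.

```python
from collections import Counter


def estimate_posteriors(training):
    """training: list of (category, set of detected subcategories)."""
    cat_counts = Counter()
    sub_counts = Counter()
    joint = Counter()
    for cat, subs in training:
        cat_counts[cat] += 1
        for sub in subs:
            sub_counts[sub] += 1
            joint[(cat, sub)] += 1
    n = len(training)
    p_cat = {cat: k / n for cat, k in cat_counts.items()}
    # P(Cat | Subcat) = count(Cat, Subcat) / count(Subcat)
    p_cat_given_sub = {
        (cat, sub): k / sub_counts[sub] for (cat, sub), k in joint.items()
    }
    return p_cat, p_cat_given_sub


# Invented toy training set: legs are shared, antlers are not.
training = [
    ("deer", {"legs", "antlers"}),
    ("deer", {"legs", "antlers"}),
    ("horse", {"legs"}),
    ("cow", {"legs"}),
]
p_cat, p_cat_given_sub = estimate_posteriors(training)
```

On this toy set, P(deer | antlers) = 1.0 while P(deer | legs) = 0.5, which illustrates why an unshared subcategory is stronger evidence than a shared one.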

Page 42

Results: Caltech-101 and Caltech-256

Caltech-101

Caltech-256

Page 43

Bringing In Layout

Page 44

So Far Pure Hierarchy

Image = Segmentation TREE of Regions

Object = Subtree (Actually {Subtrees})

= Recursive Embedding of Regions

Taxonomy = Interleaved STs

All Characterized by Probabilities

Page 45

Problems with Pure Hierarchy

No Explicit Layout Information

Object Model = No Neighbor Relationships Among Parts

Undesirable Consequence:

Recognition Insensitive to Spatial Scrambling of Parts!

Page 46

Solution = Connected Segmentation Tree (CST)

Add Links between Neighbor Nodes

Implementation = Links between Siblings

Result: Connected Segmentation Tree (CST)

= Hierarchy of Neighbor Graphs

Ahuja&Todorovic, CVPR’08
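The sketch below illustrates the idea of linking sibling regions that are spatial neighbors, and why this makes a representation sensitive to scrambling. It assumes regions are pixel sets and treats two siblings as neighbors when any of their pixels are 4-adjacent (a binary neighbor strength). The layouts are toy data, and `rect`, `touches`, and `sibling_links` are hypothetical names rather than the construction from the cited paper.

```python
from itertools import combinations

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def rect(r0, r1, c0, c1):
    """Hypothetical helper: pixel set of a rectangle (end-exclusive)."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


def touches(a, b):
    """True if some pixel of a is 4-adjacent to some pixel of b."""
    return any((r + dr, c + dc) in b for (r, c) in a for dr, dc in STEPS)


def sibling_links(children):
    """children: dict name -> pixel set. Returns the set of neighbor pairs."""
    return {
        frozenset((m, n))
        for m, n in combinations(children, 2)
        if touches(children[m], children[n])
    }


# Same three parts under the same parent; only the head is moved.
original = {"head": rect(0, 2, 0, 2), "body": rect(0, 2, 2, 6), "legs": rect(2, 4, 2, 6)}
scrambled = {"head": rect(2, 4, 6, 8), "body": rect(0, 2, 2, 6), "legs": rect(2, 4, 2, 6)}

links_original = sibling_links(original)
links_scrambled = sibling_links(scrambled)
```

Both layouts have the same parent and the same set of child parts, so a pure segmentation tree cannot tell them apart. The sibling links differ (head-body versus head-legs), so the connected tree can.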

Page 47

CST Based Taxonomy

Each Category = CST Subtree

(Actually {SubCSTs})

Taxonomy = Interleaved CSTs

= Interleaved Hierarchies of Neighbor Graphs

Page 48

Results: Weizmann Horses

Training Images | Discovered CST Category Model

Page 49

ST vs. CST

Degree of occlusion artificially made in the image

Binary strength of neighbor relationships

Real-valued strength of neighbor relationships

Page 50

ST vs. CST

Input Images | Segmentation Tree | CST

UIUC Hoofed Animals

LabelMe

CSTs outperform STs

Especially for partial occlusion, or when only region layout is used without containment

Page 51

Vs. Language

Embedding = Hierarchy (and Lego-like compatibility)

Neighbors = Juxtaposition

Occlusion = Only Subtrees of Object tree visible

Inter-object Interaction/combinatorics friendlier

Ordering/Multiple Counting addressed by structure

Page 52

Instability of Segmentation Addressable

• Splitting and merging of adjacent regions

• Partial Matching

Hedau&Ahuja CVPR08, Cheng and Ahuja

Page 53

Syntax should Feed Multiple Semantics

A Representation Should Work for Multiple Applications

Page 54

Modeling 2.1D Texture

• Physical texels are characterized by

• Texel thickness << Texel distance

• Inter-texel occlusion

• Only a part of a texel may be visible

• Visible texel parts = Samples of different, unknown texel parts

Page 55

Learning Texel Model

union + PDF | 2.1D texture | identified subimages | registration

Ahuja and Todorovic, ICCV07

Page 56

Texel Extraction Results

Page 57

Another Example – Texture Segmentation

Page 58

Texture Segmentation

Page 59

Another Example: Texel Distribution

How are texels distributed across a texture?

Ghanem and Ahuja, Submitted

Page 60

Summary

• Syntax = Connected Segmentation Tree

• Semantics = Recognition, Synthesis, ...

• Model = Stochastic Grammar

• Inference = Grammar Based Parsing/Recognition (Not covered)

• Tools = Structural and Statistical Analysis (Not covered)